Chapter 7

Computational Models of Consciousness: A Taxonomy and Some Examples

Ron Sun and Stan Franklin

Abstract

This chapter aims to provide an overview of existing computational (mechanistic) models of cognition in relation to the study of consciousness, on the basis of psychological and philosophical theories and data. It examines various mechanistic explanations of consciousness in existing computational cognitive models. Serving as an example for the discussions, a computational model of the conscious/unconscious interaction, utilizing the representational difference explanation of consciousness, is described briefly. As a further example, a software agent model that captures another explanation of consciousness (the access explanation of consciousness) is also described. The discussions serve to highlight various possibilities in developing computational models of consciousness and in providing computational explanations of conscious and unconscious cognitive processes.

Introduction

In this chapter, we aim to present a short survey and a brief evaluation of existing computational (mechanistic) models of cognition in relation to the study of consciousness. The survey focuses on their explanations of the difference between conscious and unconscious cognitive processes on the basis of psychological and philosophical theories and data, as well as potential practical applications.

Given the plethora of models, theories, and data, we try to provide in this chapter an overall (and thus necessarily sketchy) examination of computational models of consciousness in relation to the available psychological data and theories, as well as the existing philosophical accounts. We come to some tentative conclusions as to what a plausible computational account should be like, synthesizing various operationalized psychological notions related to consciousness.


We begin by examining some foundational issues concerning computational approaches toward consciousness. Then, various existing models and their explanations of the conscious/unconscious distinction are presented. After examining a particular model embodying a two-system approach, we look at one embodying a unified (one-system) approach and then at a few additional models.

Computational Explanations of Consciousness

Work in the area of computational modeling of consciousness generally assumes the sufficiency and the necessity of mechanistic explanations. By mechanistic explanation, we mean any concrete computational processes, in the broadest sense of the term "computation." In general, computation is a broad term that can be used to denote any process that can be realized on generic computing devices, such as Turing machines (or even beyond, if there is such a possibility). Thus, mechanistic explanations may utilize, in addition to standard computational notions, a variety of other conceptual constructs ranging, for example, from chaotic dynamics (Freeman, 1995), to "Darwinian" competition (Edelman, 1989), and to quantum mechanics (Penrose, 1994). (We leave out the issue of complexity for now.)

In terms of the sufficiency of mechanistic explanations, a general working hypothesis is succinctly expressed by the following statement (Jackendoff, 1987):

Hypothesis of computational sufficiency: every phenomenological distinction is caused by/supported by/projected from a corresponding computational distinction.

For lack of a clearly better alternative, this hypothesis remains a viable working hypothesis in the area of computational models of consciousness, despite various criticisms (e.g., Damasio, 1994; Edelman, 1989; Freeman, 1995; Penrose, 1994; Searle, 1980).

On the other hand, the necessity of mechanistic explanations, according to the foregoing definition of mechanistic processes, should be intuitively obvious to anyone who is not a dualist. If one accepts the universality of computation, then computation, in its broadest sense, can be expected to include the necessary conditions for consciousness.

On the basis of such intuition, we need to provide an explanation of the computational/mechanistic basis of consciousness that answers the following questions. What kind of mechanism leads to conscious processes, and what kind of mechanism leads to unconscious processes? What is the functional role of conscious processes (Baars, 1988, 2002; Sun, 1999a, b)? What is the functional role of unconscious processes? There have been many such explanations in computational or mechanistic terms. These computational or mechanistic explanations are highly relevant to the science of consciousness as they provide useful theoretical frameworks for further empirical work.

Another issue we need to address before we move on to details of computational work is the relation between biological/physiological models and computational models in general. The problem with biologically centered studies of consciousness in general is that the gap between phenomenology and physiology/biology is so great that something else may be needed to bridge it. Otherwise, if we rush directly into complex neurophysiological thickets (Edelman, 1989; Crick & Koch, 1990; Damasio et al., 1990; LeDoux, 1992), we may lose sight of the forests. Computation, in its broadest sense, can serve to bridge the gap. It provides an intermediate level of explanation in terms of processes, mechanisms, and functions and helps determine how various aspects of conscious and unconscious processes should figure into the architecture of the mind (Anderson & Lebiere, 1998; Sun, 2002). It is possible that an intermediate level between phenomenology and physiology/neurobiology might be more apt to capture fundamental characteristics of consciousness (Coward & Sun, 2004). This notion of an intermediate level of explanation has been variously expounded recently; for example, in terms of virtual machines by Sloman and Chrisley (2003).

Different Computational Accounts of Consciousness

Existing computational explanations of the conscious/unconscious distinction may be categorized based on the following different emphases: (1) differences in knowledge organization (e.g., the SN+PS view, to be detailed later), (2) differences in knowledge-processing mechanisms (e.g., the PS+SN view), (3) differences in knowledge content (e.g., the episode+activation view), (4) differences in knowledge representation (e.g., the localist+distributed view), or (5) different processing modes of the same system (e.g., the attractor view or the threshold view).

Contrary to some critics, the debate among these differing views is not analogous to a debate between algebraists and geometers in physics (which would be irrelevant). It is more analogous to the wave vs. particle debate in physics concerning the nature of light, which was truly substantive. Let us discuss some of the better known views concerning computational accounts of the conscious/unconscious distinction one by one.

First of all, some explanations are based on recognizing that there are two separate systems in the mind. The difference between the two systems can be explained in terms of differences in either knowledge organization, knowledge-processing mechanisms, knowledge content, or knowledge representation:

- The SN+PS view: an instance of the explanations based on differences in knowledge organization. As originally proposed by Anderson (1983) in his ACT* model, there are two types of knowledge: Declarative knowledge is represented by semantic networks (SN), and it is consciously accessible, whereas procedural knowledge is represented by rules in a production system (PS), and it is inaccessible. The difference lies in the two different ways of organizing knowledge – whether in an action-centered way (procedural knowledge) or in an action-independent way (declarative knowledge). Computationally, both types of knowledge are represented symbolically (using either symbolic semantic networks or symbolic production rules).1 The semantic networks use parallel spreading activation (Collins & Loftus, 1975) to activate relevant nodes, and the production rules compete for control through parallel matching and firing. The models embodying this view have been used for modeling a variety of psychological tasks, especially skill learning tasks (Anderson, 1983; Anderson & Lebiere, 1998).

- The PS+SN view: an instance of the explanations based on differences in knowledge-processing mechanisms. As proposed by Hunt and Lansman (1986), the "deliberate" computational process of production matching and firing in a production system (PS), which is serial in this case, is assumed to be a conscious process, whereas the spreading activation computation (Collins & Loftus, 1975) in semantic networks (SN), which is massively parallel, is assumed to be an unconscious process. The model based on this view has been used to model controlled and automatic processing data in the attention-performance literature (Hunt & Lansman, 1986). Note that this view is the exact opposite of the view advocated by Anderson (1983), in terms of the roles of the two computational mechanisms involved. Note also that the emphasis in this view is on the processing difference of the two mechanisms, serial vs. parallel, and not on knowledge organization.

- The algorithm+instance view: another instance of the explanations based on differences in knowledge-processing mechanisms. As proposed by Logan (1988) and also by Stanley et al. (1989), the computation involved in retrieval and use of instances of past experience is considered to be unconscious (Stanley et al., 1989) or automatic (Logan, 1988), whereas the use of "algorithms" involves conscious awareness. Here the term "algorithm" is not clearly defined and apparently refers to computation more complex than instance retrieval/use. Computationally, it was suggested that the use of an algorithm is under tight control and carried out in a serial, step-by-step way, whereas instances can be retrieved in parallel and effortlessly (Logan, 1988). The emphasis here is again on the differences in processing mechanisms. This view is also similar to the view advocated by Neal and Hesketh (1997), which emphasizes the unconscious influence of what they called episodic memory. Note that the views by Logan (1988), Stanley et al. (1989), and Neal and Hesketh (1997) are the exact opposite of the view advocated by Anderson (1983) and Bower (1996), in which instances/episodes are consciously accessed rather than unconsciously accessed.

- The episode+activation view: an instance of the explanations based on differences in knowledge content. As proposed by Bower (1996), unconscious processes are based on activation propagation through strengths or weights (e.g., in a connectionist fashion) between different nodes representing perceptual or conceptual primitives, whereas conscious processes are based on explicit episodic memory of past episodes. What is emphasized in this view is the rich spatial-temporal context in episodic memory (i.e., the ad hoc associations with contextual information, acquired on a one-shot basis), which is termed type-2 associations as opposed to regular type-1 associations (which are based on semantic relatedness). This emphasis somewhat distinguishes this view from other views concerning instances/episodes (Logan, 1988; Neal & Hesketh, 1997; Stanley et al., 1989).2 The reliance on memory of specific events in this view bears some resemblance to some neurobiologically motivated views that rely on the interplay of various memory systems, such as those advocated by Taylor (1997) and McClelland et al. (1995).

- The localist+distributed representation view: an instance of the explanations based on differences in knowledge representation. As proposed in Sun (1994, 2002), different representational forms used in different components may be used to explain the qualitative difference between conscious and unconscious processes. One type of representation is symbolic or localist, in which one distinct entity (e.g., a node in a connectionist model) represents a concept. The other type of representation is distributed, in which a non-exclusive set of entities (e.g., a set of nodes in a connectionist model) are used for representing one concept, and the representations of different concepts overlap each other; in other words, a concept is represented as a pattern of activations over a set of entities (e.g., a set of nodes). Conceptual structures (e.g., rules) can be implemented in the localist/symbolic system in a straightforward way by connections between relevant entities. In distributed representations, such structures (including rules) are diffusely duplicated in a way consistent with the meanings of the structures (Sun, 1994), which captures unconscious performance. There may be various connections between corresponding representations across the two systems. (A system embodying this view, CLARION, is described later.)
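To make the representational contrast concrete, here is a small illustrative sketch (ours, not taken from any CLARION implementation; the feature inventory and concept names are invented): a localist encoding dedicates one interpretable unit to each concept, whereas a distributed encoding spreads each concept over shared feature units, so that the patterns for different concepts overlap.

```python
import numpy as np

# Hypothetical feature inventory shared by all concepts (illustrative only).
FEATURES = ["has_fur", "barks", "meows", "has_wings", "flies", "is_pet"]

# Localist encoding: one dedicated, interpretable unit per concept.
# Activating "dog" activates exactly one node with a clear semantic label.
localist = {"dog": 0, "cat": 1, "canary": 2}

# Distributed encoding: each concept is a pattern over the SAME feature units,
# so patterns for different concepts overlap and no single unit "is" the concept.
distributed = {
    "dog":    np.array([1, 1, 0, 0, 0, 1], dtype=float),
    "cat":    np.array([1, 0, 1, 0, 0, 1], dtype=float),
    "canary": np.array([0, 0, 0, 1, 1, 1], dtype=float),
}

# A symbolic rule is easy to state over localist units ("if dog then pet"),
# but in the distributed system the same regularity is spread across weights.
overlap = distributed["dog"] @ distributed["cat"]  # shared features -> overlap of 2
print("dog/cat feature overlap:", overlap)
```

The overlap between patterns is what makes distributed knowledge hard to read off directly, which is the property this view uses to capture the inaccessibility of unconscious knowledge.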

In contrast to these two-systems views, there exist some theoretical views that insist on the unitary nature of the conscious and the unconscious. That is, they hold that conscious and unconscious processes are different manifestations of the same underlying system. The difference between conscious and unconscious processes lies in the different processing modes for conscious versus unconscious information within the same system. There are several possibilities in this regard:

- The threshold view: As proposed by various researchers, including Bowers et al. (1990), the difference between conscious and unconscious processes can be explained by the difference between activations of mental representations above a certain threshold and activations of such representations below that threshold. When activations reach the threshold level, an individual becomes aware of the content of the activated representations; otherwise, although the activated representations may influence behavior, they will not be accessible consciously. (A minimal computational sketch of this view appears after this list.)

- The chunking view: As in the models described by Servan-Schreiber and Anderson (1987) and by Rosenbloom et al. (1993), a chunk is considered a unitary representation and its internal working is opaque (although its input/output are accessible). A chunk can be a production rule (as in Rosenbloom et al., 1993) or a short sequence of perceptual-motor elements (as in Servan-Schreiber & Anderson, 1987). Because of the lack of transparency of the internal working of a chunk, it is equated with implicit learning (Servan-Schreiber & Anderson, 1987) or automaticity (Rosenbloom et al., 1993). According to this view, the difference between conscious and unconscious processes is the difference between using multiple (simple) chunks (involving some consciousness) and using one (complex) chunk (involving no consciousness).

- The attractor view: As suggested by the model of Mathis and Mozer (1996), being in a stable attractor of a dynamic system (a neural network in particular) leads to consciousness. The distinction between conscious and unconscious processes is reduced to the distinction between being in a stable attractor and being in a transient state. O'Brien and Opie (1998) proposed an essentially similar view. This view may be generalized to a general coherence view – the emphasis may be placed on the role of internal consistency in producing consciousness. There has been support for this possibility from neuroscience, for example, in terms of a coherent "thalamocortical core" (Edelman & Tononi, 2000).

- The access view: As suggested by Baars (1988), consciousness is believed to help mobilize and integrate mental functions that are otherwise disparate and independent. Thus, consciousness is aimed at solving the relevance problem – finding the exact internal resources needed to deal with the current situation. Some evidence has been accumulated for this view (Baars, 2002). A computational implementation of Baars' theory in the form of IDA (a running software agent system; Franklin et al., 1998) is described in detail later. See also Coward and Sun (2004).
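The following is the minimal sketch of the threshold view promised above (our own, purely illustrative; the threshold value, representation names, and activation numbers are assumptions, not taken from Bowers et al., 1990). Activated representations influence processing regardless of strength, but only those above a fixed threshold are treated as consciously reportable.

```python
THRESHOLD = 0.6  # hypothetical awareness threshold

# Activation levels of some mental representations (illustrative values).
activations = {"word_DOCTOR": 0.85, "word_NURSE": 0.45, "word_BREAD": 0.10}

def conscious_contents(acts, threshold=THRESHOLD):
    """Representations whose activation reaches the threshold become reportable."""
    return {name for name, a in acts.items() if a >= threshold}

def behavioral_influence(acts):
    """All active representations, above or below threshold, can still bias behavior
    (e.g., priming), even when they are not consciously accessible."""
    return {name: a for name, a in acts.items() if a > 0.0}

print(conscious_contents(activations))    # only 'word_DOCTOR' is reportable
print(behavioral_influence(activations))  # the subthreshold items are included here
```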

The coexistence of these various views of consciousness seems quite analogous to the parable of the Blind Men and the Elephant. Each of them captures some aspect of the truth about consciousness, but the portion of the truth captured is limited by the view itself. None seems to capture the whole picture.

In the next two sections, we look into some details of two representative computational models, exemplifying either two-system or one-system views. The models illustrate what a plausible computational model of consciousness should be like, synthesizing various psychological notions and relating to various available psychological theories.

A Model Adopting the Representational Difference View

Let us look into the representational difference view as embodied in the cognitive architecture CLARION (which stands for Connectionist Learning with Rule Induction ON-line; Sun, 1997, 2002, 2003), as an example of the two-system views for explaining consciousness.


Figure 7.1. The CLARION model. (The figure shows a top level with explicit representation, explicit knowledge, and an explicit learning process, and a bottom level with implicit representation, implicit knowledge, and an implicit learning process.)

The important premises of subsequent discussions are the direct accessibility of conscious processes and the direct inaccessibility of unconscious processes. Conscious processes should be directly accessible – that is, directly verbally expressible – without involving intermediate interpretive or transformational steps, which is a requirement prescribed and/or accepted by many theoreticians (see, e.g., Clark, 1992; Hadley, 1995).3 Unconscious processes should be, in contrast, inaccessible directly (but they might be accessed indirectly through some interpretive processes), thus exhibiting different psychological properties (see, e.g., Berry & Broadbent, 1988; Reber, 1989; more discussions later).

Figure 7.2. Comparisons of the two levels of the CLARION architecture.

An example model in this regard is CLARION, which is a two-level model that uses the localist and distributed representations in the two levels, respectively, and learns using two different methods in the two levels, respectively. In developing the model, four criteria were hypothesized (see Sun, 1994), on the basis of the aforementioned considerations: (1) direct accessibility of conscious processes; (2) direct inaccessibility of unconscious processes; and furthermore, (3) linkages from localist concepts to distributed features: once a localist concept is activated, its corresponding distributed representations (features) are also activated, as assumed in most cognitive models, ranging from Tversky (1977) to Sun (1995);4 and (4) linkages from distributed features to localist concepts: under appropriate circumstances, once some or most of the distributed features of a concept are activated, the localist concept itself can be activated to "cover" these features (roughly corresponding to categorization; Smith & Medin, 1981).

The direct inaccessibility of unconscious knowledge can be best captured by a "subsymbolic" distributed representation such as that provided by a backpropagation network (Rumelhart et al., 1986), because representational units in a distributed representation are capable of accomplishing tasks but are generally uninterpretable directly (see Rumelhart et al., 1986; Sun, 1994). In contrast, conscious knowledge can be captured in computational modeling by a symbolic or localist representation (Clark & Karmiloff-Smith, 1993; Sun & Bookman, 1994), in which each unit has a clear conceptual meaning/interpretation (i.e., a semantic label). This captures the property of conscious processes being directly accessible and manipulable (Smolensky, 1988; Sun, 1994). This difference in representation leads to a two-level structure whereby each level uses one type of representation (Sun, 1994, 1995, 1997; Sun et al., 1996, 1998, 2001). The bottom level is based on distributed representation, whereas the top level is based on localist/symbolic representation. For learning, the bottom level uses gradual weight tuning, whereas the top level uses explicit, one-shot hypothesis-testing learning, in correspondence with the representational characteristics of the two levels. There are various connections across the two levels for exerting mutual influences. See Figure 7.1 for an abstract sketch of the model. The different characteristics of the two levels are summarized in Figure 7.2.

Figure 7.3. The implementation of CLARION. ACS denotes the action-centered subsystem, NACS the non-action-centered subsystem, MS the motivational subsystem, and MCS the metacognitive subsystem. The top level contains localist encoding of concepts and rules. The bottom level contains multiple (modular) connectionist networks for capturing unconscious processes. The interaction of the two levels and the information flows are indicated with arrows.

Let us look into some implementational details of CLARION. Note that the details of the model have been described extensively in a series of previous papers, including Sun (1997, 2002, 2003), Sun and Peterson (1998), and Sun et al. (1998, 2001). It has a dual representational structure – implicit and explicit representations being in two separate "levels" (Hadley, 1995; Seger, 1994). Essentially it is a dual-process theory of mind (Chaiken & Trope, 1999). It also consists of a number of functional subsystems, including the action-centered subsystem, the non-action-centered subsystem, the metacognitive subsystem, and the motivational subsystem (see Figure 7.3).

Let us first focus on the action-centered subsystem of CLARION. In this subsystem, the two levels interact by cooperating in actions, through a combination of the action recommendations from the two levels, respectively, as well as cooperating in learning through a bottom-up and a top-down process (to be discussed below). Actions and learning of the action-centered subsystem may be described as follows:

1. Observe the current state x.
2. Compute in the bottom level the "values" of x associated with each of all the possible actions ai's: Q(x, a1), Q(x, a2), . . . , Q(x, an) (to be explained below).
3. Find out all the possible actions (b1, b2, . . . , bm) at the top level, based on the input x (sent up from the bottom level) and the rules in place.
4. Compare or combine the values of the ai's with those of the bj's (sent down from the top level), and choose an appropriate action b.
5. Perform the action b, and observe the next state y and (possibly) the reinforcement r.
6. Update Q-values at the bottom level in accordance with the Q-Learning-Backpropagation algorithm (to be explained later).
7. Update the rule network at the top level using the Rule-Extraction-Refinement algorithm (to be explained later).
8. Go back to Step 1.
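The following is a minimal, self-contained sketch of this action-centered cycle (our illustration only; the state encoding, the tabular Q-values standing in for CLARION's backpropagation network, the simple rule format, and the crude combination scheme are all assumptions made for brevity, not the actual CLARION implementation):

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.1, 0.9          # assumed learning rate and discount factor

q = defaultdict(float)            # bottom level: Q(state, action), tabular stand-in
rules = {}                        # top level: state -> recommended action

def bottom_level_values(state):
    return {a: q[(state, a)] for a in ACTIONS}            # Step 2

def top_level_actions(state):
    return [rules[state]] if state in rules else []       # Step 3

def choose_action(state):
    """Steps 2-4: combine bottom-level values with top-level recommendations."""
    recommended = top_level_actions(state)
    if recommended and random.random() < 0.5:              # crude combination scheme
        return recommended[0]
    values = bottom_level_values(state)
    return max(values, key=values.get)

def q_update(state, action, reward, next_state):
    """Step 6: standard Q-learning update (stand-in for Q-Learning-Backpropagation)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

def rule_update(state, action, reward):
    """Step 7: crude stand-in for Rule-Extraction-Refinement:
    keep a rule when the action was successful, drop it when it fails."""
    if reward > 0:
        rules[state] = action
    elif rules.get(state) == action:
        del rules[state]

def step(env, state):
    """One pass through Steps 1-8 of the cycle."""
    action = choose_action(state)                           # Steps 1-4
    next_state, reward = env(state, action)                 # Step 5
    q_update(state, action, reward, next_state)             # Step 6
    rule_update(state, action, reward)                      # Step 7
    return next_state                                       # Step 8: loop continues
```

A caller would supply env as a function mapping a (state, action) pair to a (next state, reward) pair and invoke step repeatedly; CLARION's actual combination of level outputs and its neural-network Q-function are described in Sun (2003).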

In the bottom level of the action-centered subsystem, implicit reactive routines are learned: A Q-value is an evaluation of the "quality" of an action in a given state: Q(x, a) indicates how desirable action a is in state x (which consists of some sensory input). The agent may choose an action in any state based on Q-values (for example, by choosing the action with the highest Q-value). To acquire the Q-values, one may use the Q-learning algorithm (Watkins, 1989), a reinforcement learning algorithm. It basically compares the values of successive actions and adjusts an evaluation function on that basis. It thereby develops reactive sequential behaviors.
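For reference, the standard Watkins (1989) Q-learning update, on which the Q-Learning-Backpropagation algorithm mentioned in Step 6 above builds, adjusts the value of the action just taken toward the received reinforcement plus the discounted value of the best next action:

    Q(x, a) <- Q(x, a) + α [ r + γ max_b Q(y, b) − Q(x, a) ],

where x is the current state, a the chosen action, r the reinforcement, y the next state, α a learning rate, and γ a discount factor. (This is the textbook form of the update; it is given here for orientation rather than as CLARION's exact formulation.)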

The bottom level of the action-centered subsystem is modular; that is, a number of small neural networks coexist, each of which is adapted to specific modalities, tasks, or groups of input stimuli. This coincides with the modularity claim (Baars, 1988; Cosmides & Tooby, 1994; Edelman, 1987; Fodor, 1983; Hirschfield & Gelman, 1994; Karmiloff-Smith, 1986) that much processing in the human mind is done by limited, encapsulated (to some extent), specialized processors that are highly efficient. Some of these modules are formed evolutionarily; that is, given a priori to agents, reflecting their hardwired instincts and propensities (Hirschfield & Gelman, 1994). Some of them can be learned through interacting with the world (computationally, through various decomposition methods; e.g., Sun & Peterson, 1999).

In the top level of the action-centered subsystem, explicit conceptual knowledge is captured in the form of rules. Symbolic/localist representations are used. See Sun (2003) for further details of encoding (they are not directly relevant here).

Humans are clearly able to learn implicit knowledge through trial and error, without necessarily utilizing a priori explicit knowledge (Seger, 1994). On top of that, explicit knowledge can be acquired, also from ongoing experience in the world, and possibly through the mediation of implicit knowledge (i.e., bottom-up learning; see Karmiloff-Smith, 1986; Stanley et al., 1989; Sun, 1997, 2002; Willingham et al., 1989). The basic process of bottom-up learning is as follows (Sun, 2002). If an action decided by the bottom level is successful, then the agent extracts a rule that corresponds to the action selected by the bottom level and adds the rule to the top level. Then, in subsequent interaction with the world, the agent verifies the extracted rule by considering the outcome of applying the rule: If the outcome is not successful, then the rule should be made more specific and exclusive of the current case, and if the outcome is successful, the agent may try to generalize the rule to make it more universal (e.g., Michalski, 1983). The details of the bottom-up learning algorithm (the Rule-Extraction-Refinement algorithm) can be found in Sun and Peterson (1998). After rules have been learned, a variety of explicit reasoning methods may be used. Learning explicit conceptual representation at the top level can also be useful in enhancing learning of implicit reactive routines (reinforcement learning) at the bottom level.
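The extract/generalize/specialize loop just described can be sketched as follows (our simplification; the rule-condition format, the success criterion, and the particular generalization and specialization steps are illustrative placeholders, not the actual Rule-Extraction-Refinement algorithm of Sun & Peterson, 1998):

```python
# A rule maps a condition (a set of features that must all hold) to an action.
rules = []   # each rule: {"condition": set_of_features, "action": action}

def extract_rule(state_features, action):
    """A successful bottom-level action yields a new, maximally specific rule."""
    rules.append({"condition": set(state_features), "action": action})

def matching_rules(state_features, action):
    return [r for r in rules
            if r["action"] == action and r["condition"] <= set(state_features)]

def refine(state_features, action, successful):
    """Verify a matching rule against the outcome of applying it."""
    for rule in matching_rules(state_features, action):
        if successful and rule["condition"]:
            # Generalize: drop one condition feature so the rule covers more cases.
            rule["condition"].pop()
        elif not successful:
            # Specialize: add a feature that excludes the current (failed) case.
            rule["condition"].add("not_" + sorted(state_features)[0])

# Usage: after a successful bottom-level action in a state described by features
extract_rule({"obstacle_left", "target_ahead"}, "move_forward")
refine({"obstacle_left", "target_ahead"}, "move_forward", successful=True)
```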

Although CLARION can learn even when no a priori or externally provided knowledge is available, it can make use of such knowledge when it is available (cf. Anderson, 1983; Schneider & Oliver, 1991). To deal with instructed learning, externally provided knowledge (in the forms of explicit conceptual structures, such as rules, plans, routines, categories, and so on) should (1) be combined with autonomously generated conceptual structures at the top level (i.e., internalization) and (2) be assimilated into implicit reactive routines at the bottom level (i.e., assimilation). This process is known as top-down learning. See Sun (2003) for further details.

The non-action-centered subsystem represents general knowledge about the world, which is equivalent to the notion of semantic memory (as in, e.g., Quillian, 1968). It may be used for performing various kinds of retrievals and inferences. It is under the control of the action-centered subsystem (through the actions of the action-centered subsystem). At the bottom level, associative memory networks encode non-action-centered implicit knowledge. Associations are formed by mapping an input to an output. The regular backpropagation learning algorithm can be used to establish such associations between pairs of input and output (Rumelhart et al., 1986).

On the other hand, at the top level of the non-action-centered subsystem, a general knowledge store encodes explicit non-action-centered knowledge (Sun, 1994). In this network, chunks are specified through dimensional values. A node is set up at the top level to represent a chunk. The chunk node (a symbolic representation) connects to its corresponding features (dimension-value pairs) represented as nodes in the bottom level (which form a distributed representation). Additionally, links between chunks at the top level encode explicit associations between pairs of chunks, known as associative rules. Explicit associative rules may be formed (i.e., learned) in a variety of ways (Sun, 2003).

On top of associative rules, similarity-based reasoning may be employed in the non-action-centered subsystem. During reasoning, a known (given or inferred) chunk may be automatically compared with another chunk. If the similarity between them is sufficiently high, then the latter chunk is inferred (see Sun, 2003, for details). Similarity-based and rule-based reasoning can be intermixed. As a result of mixing similarity-based and rule-based reasoning, complex patterns of reasoning emerge. As shown by Sun (1994), different sequences of mixed similarity-based and rule-based reasoning capture essential patterns of human everyday (mundane, common-sense) reasoning.
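A rough sketch of this mixture of similarity-based and rule-based inference over chunks follows (our illustration; the feature-overlap similarity measure, the 0.7 threshold, and the chunk contents are assumptions, not CLARION's actual NACS formulas):

```python
# Chunks are specified through dimension-value pairs (here: feature sets).
chunks = {
    "sparrow": {"has_wings", "flies", "lays_eggs", "small"},
    "robin":   {"has_wings", "flies", "lays_eggs", "small", "red_breast"},
    "bird":    {"has_wings", "flies", "lays_eggs"},
}

# Explicit associative rules between chunks at the top level.
associative_rules = {"bird": ["has_feathers"]}

def similarity(a, b):
    """Feature-overlap similarity between chunk a and chunk b (illustrative measure)."""
    fa, fb = chunks[a], chunks[b]
    return len(fa & fb) / len(fb)

def infer(known, threshold=0.7):
    """Mix similarity-based and rule-based reasoning from a known chunk."""
    conclusions = set()
    for other in chunks:
        if other != known and similarity(known, other) >= threshold:
            conclusions.add(other)                                     # similarity-based step
            conclusions.update(associative_rules.get(other, []))       # rule-based step
    return conclusions

print(infer("robin"))  # infers 'sparrow' and 'bird', then 'has_feathers' via a rule
```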

As in the action-centered subsystem, top-down or bottom-up learning may take place in the non-action-centered subsystem, either to extract explicit knowledge in the top level from the implicit knowledge in the bottom level or to assimilate explicit knowledge of the top level into implicit knowledge in the bottom level.

The motivational subsystem is concerned with drives and their interactions (Toates, 1986). It is concerned with why an agent does what it does. Simply saying that an agent chooses actions to maximize gains, rewards, or payoffs leaves open the question of what determines these things. The relevance of the motivational subsystem to the action-centered subsystem lies primarily in the fact that it provides the context in which the goal and the payoff of the action-centered subsystem are set. It thereby influences the working of the action-centered subsystem and, by extension, the working of the non-action-centered subsystem.

A bipartite system of motivational representation is again in place in CLARION. The explicit goals (such as "finding food") of an agent (which are tied to the working of the action-centered subsystem) may be generated based on internal drive states (for example, "being hungry"). See Sun (2003) for details.

Beyond low-level drives concerning physiological needs, there are also higher-level drives. Some of them are primary, in the sense of being "hardwired." For example, Maslow (1987) developed a set of these drives in the form of a "need hierarchy." Whereas primary drives are built-in and relatively unalterable, there are also "derived" drives, which are secondary, changeable, and acquired mostly in the process of satisfying primary drives.

The metacognitive subsystem is closely tied to the motivational subsystem. The metacognitive subsystem monitors, controls, and regulates cognitive processes for the sake of improving cognitive performance (Nelson, 1993; Sloman & Chrisley, 2003; Smith et al., 2003). Control and regulation may take the forms of setting goals for the action-centered subsystem, setting essential parameters of the action-centered and the non-action-centered subsystems, interrupting and changing ongoing processes in the action-centered and the non-action-centered subsystems, and so on. Control and regulation may also be carried out through setting reinforcement functions for the action-centered subsystem on the basis of drive states. The metacognitive subsystem is also made up of two levels: the top level (explicit) and the bottom level (implicit).

Note that in CLARION, there are thus a variety of memories: procedural memory (in the action-centered subsystem) in both implicit and explicit forms, general "semantic" memory (in the non-action-centered subsystem) in both implicit and explicit forms, episodic memory (in the non-action-centered subsystem), working memory (in the action-centered subsystem), goal structures (in the action-centered subsystem), and so on. See Sun (2003) for further details of these memories. As touched upon before, these memories are important for accounting for various forms of conscious and unconscious processes (also see, e.g., McClelland et al., 1995; Schacter, 1990; Taylor, 1997).

CLARION has been successful in accounting for a variety of psychological data. A number of well-known skill learning tasks have been simulated using CLARION; these span the spectrum ranging from simple reactive skills to complex cognitive skills. The tasks include serial reaction time (SRT) tasks, artificial grammar learning (AGL) tasks, process control (PC) tasks, the categorical inference (CI) task, the alphabetical arithmetic (AA) task, and the Tower of Hanoi (TOH) task (see Sun, 2002). Among them, SRT, AGL, and PC are typical implicit learning tasks, very much relevant to the issue of consciousness as they operationalize the notion of consciousness in the context of psychological experiments (Coward & Sun, 2004; Reber, 1989; Seger, 1994; Sun et al., 2005), whereas TOH and AA are typical high-level cognitive skill acquisition tasks. In addition, extensive work has been done on a complex minefield navigation task (see Sun & Peterson, 1998; Sun et al., 2001). Metacognitive and motivational simulations have also been undertaken, as have social simulation tasks (e.g., Sun & Naveh, 2004).

In evaluating the contribution of CLARION to our understanding of consciousness, we note that the simulations using CLARION provide detailed, process-based interpretations of experimental data related to consciousness, in the context of a broadly scoped cognitive architecture and a unified theory of cognition. Such interpretations are important for a precise, process-based understanding of consciousness and other aspects of cognition, leading to better appreciation of the role of consciousness in human cognition (Sun, 1999a). CLARION also makes quantitative and qualitative predictions regarding cognition in the areas of memory, learning, motivation, metacognition, and so on. These predictions either have been experimentally tested already or are in the process of being tested (see, e.g., Sun, 2002; Sun et al., 2001, 2005). Because of the complex structures and their complex interactions specified within the framework of CLARION, it has a lot to say about the roles that different types of processes, conscious or unconscious, play in human cognition, as well as their synergy (Sun et al., 2005).

Comparing CLARION with Bower (1996), the latter may be viewed as a special case of CLARION for dealing specifically with implicit memory phenomena. The type-1 and type-2 connections, hypothesized by Bower (1996) as the main explanatory constructs, can be equated roughly to bottom-level representations and top-level representations, respectively. In addition to making the distinction between type-1 and type-2 connections, Bower (1996) also endeavored to specify the details of multiple pathways of spreading activation in the bottom level. These pathways were phonological, orthographical, semantic, and other connections that store long-term implicit knowledge. In the top level, associated with type-2 connections, it was claimed on the other hand that rich contextual information was stored. These details nicely complement the specification of CLARION and can thus be incorporated into the model.

The proposal by McClelland et al. (1995) that there are complementary learning systems in the hippocampus and neocortex is also relevant here. According to their account, cortical systems learn slowly, and the learning of new information destroys the old, unless the learning of new information is interleaved with ongoing exposure to the old information. To resolve these two problems, new information is initially stored in the hippocampus, an explicit memory system, in which crisp, explicit representations are used to minimize interference of information (so that catastrophic interference is avoided there). It allows rapid learning of new material. Then, the new information stored in the hippocampus is assimilated into cortical systems. The assimilation is interleaved with the assimilation of all other information in the hippocampus and with the ongoing events. Weights are adjusted by a small amount after each experience, so that the overall direction of weight change is governed by the structure present in the ensemble of events and experiences, using distributed representations (with weights). Therefore, catastrophic interference is avoided in cortical systems. This model is very similar to the two-level idea of CLARION, in that it not only adopts a two-system view but also utilizes representational differences between the two systems. However, in contrast to this model, which captures only what may be termed top-down learning (that is, learning that proceeds from the conscious to the unconscious), CLARION can capture both top-down learning (from the top level to the bottom level) and bottom-up learning (from the bottom level to the top level). See Sun et al. (2001) and Sun (2002) for details of bottom-up learning.

Turning to the declarative/procedural knowledge models, ACT* (Anderson, 1983) is made up of a semantic network (for declarative knowledge) and a production system (for procedural knowledge). ACT-R is a descendant of ACT*, in which procedural learning is limited to production formation through mimicking, and production firing is based on log odds of success. CLARION succeeds in explaining two issues that ACT did not address. First, whereas ACT takes a mostly top-down approach toward learning (i.e., from given declarative knowledge to procedural knowledge), CLARION can proceed bottom-up. Thus, CLARION can account for implicit learning better than ACT (see Sun, 2002, for details). Second, in ACT both types of knowledge are represented in explicit, symbolic forms (i.e., semantic networks and productions), and thus it does not explain, from a representational viewpoint, the differences in conscious accessibility (Sun, 1999b). CLARION accounts for this difference based on the use of two different forms of representation. Top-level knowledge is represented explicitly and is thus consciously accessible, whereas bottom-level knowledge is represented implicitly and is thus inaccessible. Thus, this distinction in CLARION is intrinsic, instead of assumed as in ACT (Sun, 1999b).

Comparing CLARION with Hunt and Lansman's (1986) model, there are similarities. The production system in Hunt and Lansman's model clearly resembles the top level in CLARION, in that both use explicit manipulations in much the same way. Likewise, the spreading activation in the semantic network in Hunt and Lansman's model resembles the connectionist network in the bottom level of CLARION, because the same kind of spreading activation was used in both models, although the representation in Hunt and Lansman's model was symbolic, not distributed. Because of the uniformly symbolic representations used in Hunt and Lansman's model, it does not explain convincingly the qualitative difference between conscious and unconscious processes (see Sun, 1999b).

An Application of the Access View

Let us now examine an application of the access view of consciousness in building a practically useful system. The access view is a rather popular approach in computational accounts of consciousness (Baars, 2002), and therefore it deserves some attention. It is also presented here as an example of various one-system views.

Most computational models of cognitive processes are designed to predict experimental data. IDA (Intelligent Distribution Agent), in contrast, models consciousness in the form of an autonomous software agent (Franklin & Graesser, 1997). Specifically, IDA was developed for Navy applications (Franklin et al., 1998). At the end of each sailor's tour of duty, he or she is assigned to a new billet in a process called distribution. The Navy employs almost 300 people (called detailers) to effect these new assignments. IDA's task is to play the role of a detailer.

Designing IDA presents both communication problems and action selection problems involving constraint satisfaction. It must communicate with sailors via e-mail and in English, understanding the content and producing human-like responses. It must access a number of existing Navy databases, again understanding the content. It must see that the Navy's needs are satisfied while adhering to Navy policies. For example, a particular ship may require a certain number of sonar technicians with the requisite types of training. It must hold down moving costs. And it must cater to the needs and desires of the sailor as well as possible. This includes negotiating with the sailor via an e-mail correspondence in natural language. Finally, it must authorize the finally selected new billet and start the writing of the sailor's orders.

Although the IDA model was not initially developed to reproduce experimental data, it is nonetheless based on psychological and neurobiological theories of consciousness and does generate hypotheses and qualitative predictions (Baars & Franklin, 2003; Franklin et al., 2005). IDA successfully implements much of the global workspace theory (Baars, 1988), and there is a growing body of empirical evidence supporting that theory (Baars, 2002). IDA's flexible cognitive cycle has also been used to analyze the relation of consciousness to working memory at a fine level of detail, offering explanations of such classical working memory tasks as visual imagery to gain information and the rehearsal of a telephone number (Baars & Franklin, 2003; Franklin et al., 2005).

In his global workspace theory (see Figure 7.4), Baars (1988) postulates that human cognition is implemented by a multitude of relatively small, special-purpose processors, which are almost always unconscious (i.e., the modularity hypothesis as discussed earlier). Communication between them is rare and over a narrow bandwidth. Coalitions of such processes find their way into a global workspace (and thereby into consciousness). This limited-capacity workspace serves to broadcast the message of the coalition to all the unconscious processors in order to recruit other processors to join in handling the current novel situation or in solving the current problem.


Figure 7.4. Baars' global workspace theory. (The figure depicts the global workspace (conscious) together with input processors, the dominant context hierarchy, other available contexts, and competing contexts, including goal contexts, perceptual contexts, and conceptual contexts.)

Thus consciousness, in this theory, allows us to deal with novel or problematic situations that cannot be dealt with efficiently, or at all, by habituated unconscious processes. In particular, it provides access to appropriately useful resources. Global workspace theory offers an explanation for the limited capacity of consciousness: Large messages would be overwhelming to tiny processors. In addition, all activities of these processors take place under the auspices of contexts: goal contexts, perceptual contexts, conceptual contexts, and/or cultural contexts. Though contexts are typically unconscious, they strongly influence conscious processes.

Let us look into some details of the IDA architecture and its main mechanisms. At the higher level, the IDA architecture is modular, with module names borrowed from psychology (see Figure 7.5). There are modules for Perception, Working Memory, Autobiographical Memory, Transient Episodic Memory, Consciousness, Action Selection, Constraint Satisfaction, Language Generation, and Deliberation.

In the lower level of IDA, the processors postulated by the global workspace theory are implemented by "codelets." Codelets are small pieces of code running as independent threads, each of which is specialized for some relatively simple task. They often play the role of "demons,"5 waiting for a particular situation to occur in response to which they should act. Codelets also correspond more or less to Edelman's neuronal groups (Edelman, 1987) or Minsky's agents (Minsky, 1985). Codelets come in a number of varieties, each with different functions to perform. Most of these codelets subserve some high-level entity, such as a behavior. However, some codelets work on their own, performing such tasks as watching for incoming e-mail and instantiating goal structures. An important type of codelet that works on its own is the attention codelet, which serves to bring information to "consciousness."

IDA senses only strings of characters, which are not imbued with meaning but which correspond to primitive sensations, like, for example, the patterns of activity on the rods and cones of the retina. These strings may come from e-mail messages, an operating system message, or from a database record.

Figure 7.5. IDA's cognitive cycle. (The figure traces stimuli from the external and internal environment through sensory memory and perceptual memory (the slipnet) into working memory and long-term working memory, with cues to transient episodic memory and autobiographical memory returning local associations; attention codelets then form coalitions that compete for consciousness, the winning coalition's contents are broadcast, and behavior codelets in priming mode instantiate, bind, and activate behaviors in the behavior net for action selection, leading to an action being selected and taken.)

The perception module employs analysis of surface features for natural-language understanding. It partially implements perceptual symbol system theory (Barsalou, 1999); perceptual symbols serve as a uniform system of representations throughout the system. Its underlying mechanism constitutes a portion of the Copycat architecture (Hofstadter & Mitchell, 1994). IDA's perceptual memory takes the form of a semantic net with activation passing, called the slipnet (see Figure 7.6). The slipnet embodies the perceptual contexts and some conceptual contexts from the global workspace theory. Nodes of the slipnet constitute the agent's perceptual symbols. Perceptual codelets recognize various features of the incoming stimulus; that is, various concepts. Perceptual codelets descend on an incoming message, looking for words or phrases they recognize. When such are found, appropriate nodes in the slipnet are activated. This activation passes around the net until it settles. A node (or several) is selected by its high activation, and the appropriate template(s) is filled by codelets with selected items from the message. The information thus created from the incoming message is then written to the workspace (working memory, to be described below), making it available to the rest of the system.
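The slipnet's recognize-and-settle behavior can be approximated with a simple spreading-activation sketch (our illustration; the node names, link weights, decay constant, and number of settling rounds are invented for the example and are not taken from the Copycat or IDA code):

```python
# Slipnet-like graph: node -> list of (neighbor, link_weight) pairs.
links = {
    "Norfolk":    [("location", 0.8)],
    "Miami":      [("location", 0.8)],
    "location":   [("preference", 0.5)],
    "preference": [],
}

keywords = {"Norfolk": ["norfolk", "nrfk"], "Miami": ["miami"],
            "location": ["location", "base"], "preference": ["prefer"]}

def recognize(message):
    """Perceptual-codelet stand-in: activate nodes whose keywords appear in the text."""
    text = message.lower()
    return {node: (1.0 if any(w in text for w in keywords[node]) else 0.0)
            for node in links}

def settle(act, decay=0.9, rounds=5):
    """Pass activation along links for a few rounds, letting the pattern settle."""
    for _ in range(rounds):
        new = {n: a * decay for n, a in act.items()}
        for node, a in act.items():
            for neighbor, weight in links[node]:
                new[neighbor] += a * weight
        act = new
    return act

settled = settle(recognize("I would prefer Norfolk for my next billet"))
print(sorted(settled.items(), key=lambda kv: -kv[1]))  # highly active nodes come first
```

In the real system, a highly activated node (or several) would then be selected and the corresponding template filled with items from the message.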

The results of this process, information created by the agent for its own use, are written to the workspace (working memory, not to be confused with Baars' global workspace). (Almost all of IDA's modules either write to the workspace, read from it, or both.)

IDA employs sparse distributed memory (SDM) as its major associative memory (Anwar & Franklin, 2003; Kanerva, 1988). SDM is a content-addressable memory. Being content addressable means that items in memory can be retrieved by using part of their contents as a cue, rather than having to know the item's address in memory.
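The following is a highly simplified sketch of content-addressable (cue-based) retrieval in the spirit of SDM (our illustration only; real sparse distributed memory stores counters at many hard locations within a Hamming radius of an address, as in Kanerva, 1988, which this toy version does not attempt to reproduce):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 64                      # length of the binary vectors (illustrative)

# Memory is just a list of stored binary vectors in this toy version.
memory = [rng.integers(0, 2, DIM) for _ in range(5)]

def retrieve(cue, known_mask):
    """Return the stored item closest to the cue on the known (cued) positions.
    A partial cue, part of an item's content, suffices; no address is needed."""
    def distance(item):
        return np.sum((item != cue) & known_mask)   # Hamming distance on cued bits
    return min(memory, key=distance)

# Cue with only the first half of one stored item's content known.
target = memory[2]
mask = np.zeros(DIM, dtype=bool)
mask[:DIM // 2] = True
cue = np.where(mask, target, 0)
print(np.array_equal(retrieve(cue, mask), target))   # True: partial content recalls the item
```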

Reads and writes, to and from associative memory, are accomplished through a gateway within the workspace called the focus. When any item is written to the workspace, another copy is written to the read registers of the focus. The contents of these read registers of the focus are then used as an address to query associative memory. The results of this query – that is, whatever IDA associates with this incoming information – are written into their own registers in the focus. This may include some emotion and some action previously taken. Thus associations with any incoming information, either from the outside world or from some part of IDA itself, are immediately available. (Writes to associative memory are made later and are described below.)

Figure 7.6. A portion of the slipnet in IDA. (The depicted portion includes nodes such as "information request," "acceptance," "preference," and "location," linked to location nodes such as San Diego, Miami, Jacksonville, and Norfolk, with variant surface forms of "Norfolk," such as "NRFK," attached to the Norfolk node.)

In addition to long-term memory, IDA includes a transient episodic memory (Ramamurthy, D'Mello, & Franklin, 2004). Long-term, content-addressable, associative memories are not typically capable of retrieving details of the latest of a long sequence of quite similar events (e.g., where I parked in the parking garage this morning or what I had for lunch yesterday). The distinguishing details of such events tend to blur due to interference from similar events. In IDA, this problem is solved by the addition of a transient episodic memory implemented with a sparse distributed memory. This SDM decays so that past sequences of similar events no longer interfere with the latest such events.

The apparatus for producing "consciousness" consists of a coalition manager, a spotlight controller, a broadcast manager, and a collection of attention codelets that recognize novel or problematic situations. Attention codelets have the task of bringing information to "consciousness." Each attention codelet keeps a watchful eye out for some particular situation to occur that might call for "conscious" intervention. Upon encountering such a situation, the appropriate attention codelet will be associated with the small number of information codelets that carry the information describing the situation. This association should lead to the collection of this small number of codelets, together with the attention codelet that collected them, becoming a coalition. Codelets also have activations. The attention codelet increases its activation in proportion to how well the current situation fits its particular interest, so that the coalition might compete for "consciousness," if one is formed.

In IDA, the coalition manager is responsible for forming and tracking coalitions of codelets. Such coalitions are initiated on the basis of the mutual associations between the member codelets. At any given time, one of these coalitions finds its way to "consciousness," chosen by the spotlight controller, which picks the coalition with the highest average activation among its member codelets. Baars' global workspace theory calls for the contents of "consciousness" to be broadcast to each of the codelets in the system, and in particular, to the behavior codelets. The broadcast manager accomplishes this task.
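Put together, coalition formation, the spotlight, and the broadcast might be sketched roughly as follows. All class and function names are illustrative assumptions; in particular, real coalitions are grown from associations formed among codelets rather than handed in as lists.

```python
# Coalitions, the spotlight controller, and the broadcast manager, in miniature.

class Codelet:
    def __init__(self, name, activation, content=None):
        self.name = name
        self.activation = activation     # how well its interest fits the current situation
        self.content = content
        self.last_broadcast = None

    def receive_broadcast(self, contents):
        self.last_broadcast = contents

class Coalition:
    def __init__(self, codelets):
        self.codelets = codelets

    def average_activation(self):
        return sum(c.activation for c in self.codelets) / len(self.codelets)

    def contents(self):
        return [c.content for c in self.codelets if c.content is not None]

def spotlight(coalitions):
    """Spotlight controller: pick the coalition with the highest average activation."""
    return max(coalitions, key=Coalition.average_activation)

def broadcast(coalition, all_codelets):
    """Broadcast manager: every codelet in the system sees the winning contents."""
    for codelet in all_codelets:
        codelet.receive_broadcast(coalition.contents())

# An attention codelet plus the information codelets it gathered form one coalition.
attention = Codelet("attend-to-discrepancy", activation=0.9)
info = [Codelet("info-dates", 0.6, "reporting date precedes detachment date"),
        Codelet("info-place", 0.5, "orders mention Norfolk")]
routine = Coalition([Codelet("attend-to-routine", 0.3, "routine acknowledgement")])

winner = spotlight([Coalition([attention] + info), routine])
broadcast(winner, [attention] + info + routine.codelets)
print(winner.contents())
```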

IDA depends on the idea of a behavior net (Maes, 1989; Negatu & Franklin, 2002) for high-level action selection in the service of built-in drives. It has several distinct drives operating in parallel, and these drives vary in urgency as time passes and the environment changes. A behavior net is composed of behaviors and their various links. A behavior has preconditions as well as additions and deletions. A behavior also has an activation, a number intended to measure the behavior's relevance to both the current environment (external and internal) and its ability to help satisfy the various drives it serves.

The activation comes from activation stored in the behaviors themselves, from the external environment, from drives, and from internal states. The environment awards activation to a behavior for each of its true preconditions. The more relevant it is to the current situation, the more activation it receives from the environment. (This source of activation tends to make the system opportunistic.) Each drive awards activation to every behavior that, by being active, will help satisfy that drive. This source of activation tends to make the system goal directed. Certain internal states of the agent can also send activation to the behavior net. This activation, for example, might come from a coalition of codelets responding to a "conscious" broadcast. Finally, activation spreads from behavior to behavior along links.
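A single behavior's activation bookkeeping, given these sources, might be sketched as below. The weighting constants, field names, and example behaviors are assumptions for illustration; Maes-style nets also withdraw activation through conflict links, which is omitted here.

```python
# Activation sources for one behavior in a Maes-style behavior net (a sketch).

class Behavior:
    def __init__(self, name, preconditions, serves_drives):
        self.name = name
        self.preconditions = preconditions     # propositions that must hold to execute
        self.serves_drives = serves_drives     # drives this behavior helps satisfy
        self.activation = 0.0
        self.successors = []                   # behaviors it passes activation to

def update_activation(behavior, world_state, drive_urgency, conscious_boost=0.0,
                      w_env=0.2, w_drive=0.5, spread_fraction=0.1):
    # Environment: some activation for every precondition that is currently true.
    env = w_env * sum(1 for p in behavior.preconditions if p in world_state)
    # Drives: proportional to the urgency of each drive the behavior serves.
    drives = w_drive * sum(drive_urgency.get(d, 0.0) for d in behavior.serves_drives)
    # Internal states, e.g. codelets answering a "conscious" broadcast, add a boost.
    behavior.activation += env + drives + conscious_boost
    # Finally, spread a fraction of the result along links to successor behaviors.
    for successor in behavior.successors:
        successor.activation += spread_fraction * behavior.activation

compose = Behavior("compose reply", {"message understood"}, {"satisfy the sailor"})
send = Behavior("send reply", {"reply composed"}, {"satisfy the sailor"})
compose.successors.append(send)

update_activation(compose, world_state={"message understood"},
                  drive_urgency={"satisfy the sailor": 0.8}, conscious_boost=0.3)
print(round(compose.activation, 2), round(send.activation, 2))   # 0.9 0.09
```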

IDA's behavior net acts in concert with its "consciousness" mechanism to select actions (Negatu & Franklin, 2002). Suppose some piece of information is written to the workspace by perception or some other module. Attention codelets watch both it and the resulting associations. One of these attention codelets may decide that this information should be acted upon. This codelet would then attempt to take the information to "consciousness," perhaps along with any discrepancies it may find with the help of associations. If the attempt is successful, the coalition manager makes a coalition of them, the spotlight controller eventually selects that coalition, and the contents of the coalition are broadcast to all the codelets. In response to the broadcast, appropriate behavior-priming codelets perform three tasks: an appropriate goal structure is instantiated in the behavior net, the codelets bind variables in the behaviors of that structure, and the codelets send activation to the currently appropriate behavior of the structure. Eventually that behavior is chosen to be acted upon. At this point, information about the current emotion and the currently executing behavior is written to the focus by the behavior codelets associated with the chosen behavior. The current contents of the write registers in the focus are then written to associative memory. The rest of the behavior codelets associated with the chosen behavior then perform their tasks. Thus, an action has been selected and carried out by means of collaboration between "consciousness" and the behavior net.

This background information on the IDA architecture and mechanisms should enable the reader to understand IDA's cognitive cycle (Baars & Franklin, 2003; Franklin et al., 2005). The cognitive cycle specifies the functional roles of memory, emotions, consciousness, and decision making in cognition, according to the global workspace theory. Below, we sketch the steps of the cognitive cycle; see Figure 7.5 for an overview.

1. Perception. Sensory stimuli, external or internal, are received and interpreted by perception. This stage is unconscious.

2. Percept to Preconscious Buffer. The percept is stored in preconscious buffers of IDA's working memory.

3. Local Associations. Using the incoming percept and the residual contents of the preconscious buffers as cues, local associations are automatically retrieved from transient episodic memory and from long-term memory.

4. Competition for Consciousness. Attention codelets, whose job is to bring relevant, urgent, or insistent events to consciousness, gather information, form coalitions, and actively compete against each other. (The competition may also include attention codelets from a recent previous cycle.)

5. Conscious Broadcast. A coalition of codelets, typically an attention codelet and its covey of related information codelets carrying content, gains access to the global workspace and has its contents broadcast. The contents of perceptual memory are updated in light of the current contents of consciousness. Transient episodic memory is updated with the current contents of consciousness as events. (The contents of transient episodic memory are separately consolidated into long-term memory.) Procedural memory (recent actions) is also updated.

6. Recruitment of Resources. Relevant behavior codelets respond to the conscious broadcast. These are typically codelets whose variables can be bound from information in the conscious broadcast. If the successful attention codelet was an expectation codelet calling attention to an unexpected result from a previous action, the responding codelets may be those that can help rectify the unexpected situation. (Thus consciousness solves the relevancy problem in recruiting resources.)

7. Setting Goal Context Hierarchy. The recruited processors use the contents of consciousness to instantiate new goal context hierarchies, bind their variables, and increase their activation. Emotions directly affect motivation and determine which terminal goal contexts receive activation and how much. Other (environmental) conditions determine which of the earlier goal contexts receive additional activation.

8. Action Chosen. The behavior net chooses a single behavior (goal context). This selection is heavily influenced by activation passed to various behaviors by the various emotions. The choice is also affected by the current situation, by external and internal conditions, by the relations between the behaviors, and by the residual activation values of various behaviors.

9. Action Taken. The execution of a behavior (goal context) results in the behavior codelets performing their specialized tasks, which may have external or internal consequences. The acting codelets also include an expectation codelet (see Step 6) whose task is to monitor the action and to try to bring to consciousness any failure in the expected results.

IDA's elementary cognitive activities occur within a single cognitive cycle. More complex cognitive functions are implemented over multiple cycles. These include deliberation, metacognition, and voluntary action (Franklin, 2000).
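Read as pseudocode, one pass through Steps 1–9 above might look like the following skeleton. Every module here is reduced to a stub (lists and strings) so that only the ordering of the steps is meaningful; none of the names correspond to actual IDA classes.

```python
# One cognitive cycle, reduced to stubs; the ordering of the steps is the only point.

class StubAgent:
    def __init__(self):
        self.preconscious_buffer = []
        self.transient_episodic = []
        self.long_term = ["an older association"]

    def cycle(self, stimulus):
        percept = f"percept({stimulus})"                          # 1. Perception (unconscious)
        self.preconscious_buffer.append(percept)                  # 2. Percept to preconscious buffer
        associations = self.transient_episodic + self.long_term   # 3. Local associations
        coalitions = [("attend-to-novelty", 0.9, [percept]),      # 4. Competition for consciousness
                      ("attend-to-routine", 0.4, associations)]
        _, _, contents = max(coalitions, key=lambda c: c[1])      # 5. Conscious broadcast...
        self.transient_episodic.append(percept)                   #    ...and memory updates
        responders = [f"behavior-codelet({c})" for c in contents] # 6. Recruitment of resources
        goal_contexts = responders                                # 7. Goal context hierarchy set up
        chosen = goal_contexts[0]                                 # 8. Action chosen by the behavior net
        return f"executed {chosen}"                               # 9. Action taken

print(StubAgent().cycle("new orders arrive"))
```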

The IDA model employs a methodology that is different from that which is currently typical of computational cognitive models. Although the model is based on experimental findings in cognitive psychology and brain science, there is only qualitative consistency with experiments. Rather, there are a number of hypotheses derived from IDA as a unified theory of cognition. The IDA model generates hypotheses about human cognition and the role of consciousness through its design, the mechanisms of its modules, their interaction, and its performance.

Every agent must sample and act on its world through a sense-select-act cycle. The frequent sampling allows for a fine-grained analysis of common cognitive phenomena, such as process dissociation, recognition vs. recall, and the availability heuristic. At a high level of abstraction, the analyses support the commonly held explanations of what occurs in these situations and why. At a finer-grained level, the analyses flesh out common explanations, adding details and functional mechanisms. Therein lies the value of these analyses.

Unfortunately, currently available techniques for studying some phenomena at a fine-grained level, such as PET, fMRI, EEG, implanted electrodes, etc., are lacking either in scope, in spatial resolution, or in temporal resolution. As a result, some of the hypotheses from the IDA model, although testable in principle, seem not to be testable at the present time for lack of technologies with suitable scope and resolution.

Figure 7.7. Schacter's model of consciousness.

There is also the issue of the breadth of the IDA model, which encompasses perception, working memory, declarative memory, attention, decision making, procedural learning, and more. How can such a broad model produce anything useful? The IDA model suggests that these various aspects of human cognition are highly integrated. A more global view can be expected to add additional understanding to that produced by more specific models. This assertion seems to be borne out by the analyses of various cognitive phenomena (Baars & Franklin, 2003; Franklin et al., 2005).

Sketches of Some Other Views

As we have seen, there are many attempts to explain the difference in conscious accessibility. Various explanations have been advanced in terms of the content of knowledge (e.g., instances vs. rules), the organization of knowledge (e.g., declarative vs. procedural), processing mechanisms (e.g., spreading activation vs. rule matching and firing), the representation of knowledge (e.g., localist/symbolic vs. distributed), and so on. In addition to the two views elaborated on earlier, let us look into some more details of a few other views. Although some of the models discussed below are not, strictly speaking, computational (because they may not have been fully computationally implemented), they are nevertheless important because they point to possible ways of constructing computational explanations of consciousness.

We can examine Schacter's (1990) model as an example. The model is based on neuropsychological findings of the dissociation of different types of knowledge (especially in brain-damaged patients). It includes a number of "knowledge modules" that perform specialized and unconscious processing and may send their outcomes to a "conscious awareness system," which gives rise to conscious awareness (see Figure 7.7). Schacter's explanation of some neuropsychological disorders (e.g., hemisphere neglect, blindsight, aphasia, agnosia, and prosopagnosia) is that brain damage results in the disconnection of some of the modules from the conscious awareness system, which causes their inaccessibility to consciousness. However, as has been pointed out by others, this explanation cannot account for many findings in implicit memory research (e.g., Roediger, 1990). Revonsuo (1993) advocated a similar view, albeit from a philosophical viewpoint, largely on the basis of using Schacter's (1990) data as evidence. Johnson-Laird's (1983) model was somewhat similar to Schacter's model in its overall structure in that there was a hierarchy of processors and consciousness resided in the processes at the top of the hierarchy. Shallice (1972) put forward a model in which a number of "action systems" could be activated by "selector input" and the activated action systems correspond to consciousness. It is not clear, however, what the computational (mechanistic) difference between conscious and unconscious processes is in those models, which did not offer a mechanistic explanation.
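The disconnection explanation can be caricatured in a few lines: specialized modules do their work unconsciously, and only outputs that still reach the conscious awareness system become reportable. The module names and the connection flag below are illustrative assumptions, not part of Schacter's model.

```python
# The disconnection reading of Schacter's model, in miniature.

class Module:
    """A specialized knowledge module; its processing is unconscious."""
    def __init__(self, name, connected_to_cas=True):
        self.name = name
        self.connected_to_cas = connected_to_cas   # the link a lesion can sever

    def process(self, stimulus):
        return f"{self.name} analysis of {stimulus}"

def conscious_awareness_system(modules, stimulus):
    """Only module outputs that actually reach the CAS become consciously reportable."""
    return [m.process(stimulus) for m in modules if m.connected_to_cas]

modules = [Module("face recognition", connected_to_cas=False),   # disconnected, as in prosopagnosia
           Module("word reading")]
print(conscious_awareness_system(modules, "a familiar face and a printed name"))
# Face processing still runs, but its outcome never reaches awareness.
```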

We can compare Schacter's (1990) model with Clarion. It is similar to Clarion in that it includes a number of "knowledge modules" that perform specialized and unconscious processing (analogous to bottom-level modules in Clarion) and send their outcomes to a "conscious awareness system" (analogous to the top level in Clarion), which gives rise to conscious awareness. Unlike Clarion's explanation of the conscious/unconscious distinction through the difference between localist/symbolic versus distributed representations, however, Schacter's model does not elucidate in computational/mechanistic terms the qualitative distinction between conscious and unconscious processes, in that the "conscious awareness system" lacks any apparent qualitative difference from the unconscious systems.
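The representational contrast being appealed to can be shown in miniature: a localist code dedicates one unit to one concept and can be read off directly, whereas a distributed code spreads the same knowledge across many units and supports behavior without being directly interpretable. The vectors below are made-up toy values, not Clarion data.

```python
# Localist/symbolic vs. distributed representation, as the simplest possible contrast.

concepts = ["rule: if the road is wet, slow down", "rule: if the light is red, stop"]

# Top level (localist/symbolic): one unit per concept, so the active knowledge
# can be read off directly, unit by unit.
localist_units = [1, 0]
explicit = [c for c, unit in zip(concepts, localist_units) if unit == 1]
print(explicit)            # directly accessible content

# Bottom level (distributed): the same kind of knowledge is carried implicitly by a
# pattern over many units/weights; no single unit names a concept, so the content
# is available only through the input/output behavior the pattern supports.
distributed_pattern = [0.23, -0.71, 0.05, 0.44, -0.12]
print(distributed_pattern)
```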

We can also examine Damasio's neuroanatomically motivated model (Damasio et al., 1990). The model hypothesizes the existence of many "sensory convergence zones" that integrate information from individual sensory modalities through forward and backward synaptic connections and the resulting reverberations of activations, without a central location for information storage and comparisons; it also hypothesizes the global "multimodal convergence zone," which integrates information across modalities, also through reverberation (via recurrent connections; see Figure 7.8). Correlated with consistency is global information availability; that is, once "broadcast" or "reverberation" is achieved, all the information about an entity stored in different places of the brain becomes available. This was believed to have explained the accessibility of consciousness.6 In terms of Clarion, different sensory convergence zones may be roughly captured by bottom-level modules, each of which takes care of sensory inputs of one modality (at a properly fine level), and the role of the global multimodal convergence zone (similar to the global workspace in a way) may be played by the top level of Clarion, which has the ultimate responsibility for integrating information (and also serves as the "conscious awareness system"). The widely recognized role of reverberation (Damasio, 1994; Taylor, 1994) may be captured in Clarion through using recurrent connections within modules at the bottom level and through multiple top-down and bottom-up information flows across the two levels, which leads to the unity of consciousness that is the synthesis of all the information present (Baars, 1988; Marcel, 1983).

Figure 7.8. Damasio's model of consciousness.

Similarly, Crick and Koch (1990) hypothesize that synchronous firing at 35–75 Hz in the cerebral cortex is the basis for consciousness – with such synchronous firing, pieces of information regarding different aspects of an entity are brought together, and thus consciousness emerges. Although consciousness has been experimentally observed to be somewhat correlated with synchronous firing at 35–75 Hz, there is no explanation of why this is the case, and there is no computational/mechanistic explanation of any qualitative difference between 35–75 Hz synchronous firing and other firing patterns.

Cotterill (1997) offers a "master-module" model of consciousness, which asserts that consciousness arises from movement or the planning of movement. The master module refers to the brain region that is responsible for motor planning. This model sees the conscious system as being profligate with its resources: perforce it must plan and organize movements, even though it does not always execute them. The model stresses the vital role that movement plays and is quite compatible with the IDA model. This centrality of movement was illustrated by the observation that blind people were able to read braille when allowed to move their fingers, but were unable to do so when the dots were moved against their still fingers (Cotterill, 1997).

Finally, readers interested in the possibility of computational models of consciousness actually producing "conscious" artifacts may consult Holland (2003) and other work along that line.

Concluding Remarks

This chapter has examined general frameworks of computational accounts of consciousness. Various related issues, such as the utility of computational models, explanations of psychological data, and potential applications of machine consciousness, have been touched on in the process. Based on existing psychological and philosophical evidence, existing models were compared and contrasted to some extent. It appears inevitable at this stage that various computational accounts of consciousness will coexist. Each of them seems to capture some aspect of consciousness, but each also has severe limitations. To capture the whole picture in a unified computational framework, much more work is needed. In this regard, Clarion and IDA provide some hope.

Much more work can be conducted on various issues of consciousness along this computational line. Such work may include further specification of the details of computational models. It may also include reconciliation of existing computational models of consciousness. More importantly, it may, and should, include the validation of computational models through empirical and theoretical means. The last point in particular should be emphasized in future work (see the earlier discussions concerning Clarion and IDA). In addition, we may also attempt to account for consciousness computationally at multiple levels, from phenomenology, via various intermediate levels, all the way down to physiology, which will likely lead to a much more complete computational account and a much better picture of consciousness (Coward & Sun, 2004).

Acknowledgments

Ron Sun acknowledges support in part from Office of Naval Research grant N00014-95-1-0440 and Army Research Institute grants DASW01-00-K-0012 and W74V8H-04-K-0002. Stan Franklin acknowledges support from the Office of Naval Research and other U.S. Navy sources under grants N00014-01-1-0917, N00014-98-1-0332, N00014-00-1-0769, and DAAH04-96-C-0086.

Notes

1. There are also various numerical measures involved, which are not important for the present discussion.

2. Cleeremans and McClelland's (1991) model of artificial grammar learning can be viewed as instantiating half of the system (the unconscious half), in which implicit learning takes place based on gradual weight changes in response to practice on a task and the resulting changes in activation of various representations when performing the task.

3. Note that the accessibility is defined in terms of the surface syntactic structures of the objects being accessed (at the level of outcomes or processes), not their semantic meanings. Thus, for example, a LISP expression is directly accessible, even though one may not fully understand its meaning. The internal working of a neural network may be inaccessible even though one may know what the network essentially does (through an interpretive process). Note also that objects and processes that are directly accessible at a certain level may not be accessible at a finer level of detail.

4. This activation of features is important in subsequent uses of the information associated with the concept and in directing behaviors.

5. This is a term borrowed from computer operating systems that describes a small piece of code that waits and watches for a particular event or condition to occur before it acts.

6. However, consciousness does not necessarily mean accessibility/availability of all the information about an entity; for otherwise, conscious inference, deliberate recollection, and other related processes would be unnecessary.

References

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.

Anderson, J., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.

Anwar, A., & Franklin, S. (2003). Sparse distributed memory for "conscious" software agents. Cognitive Systems Research, 4, 339–354.

Baars, B. (1988). A cognitive theory of consciousness. New York: Cambridge University Press.

Baars, B. (2002). The conscious access hypothesis: Origins and recent evidence. Trends in Cognitive Science, 6, 47–52.

Baars, B., & Franklin, S. (2003). How conscious experience and working memory interact. Trends in Cognitive Science, 7, 166–172.

Barsalou, L. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–609.

Berry, D., & Broadbent, D. (1988). Interactive tasks and the implicit-explicit distinction. British Journal of Psychology, 79, 251–272.

Bower, G. (1996). Reactivating a reactivation theory of implicit memory. Consciousness and Cognition, 5(1/2), 27–72.

Bowers, K., Regehr, G., Balthazard, C., & Parker, K. (1990). Intuition in the context of discovery. Cognitive Psychology, 22, 72–110.

Chaiken, S., & Trope, Y. (Eds.). (1999). Dual process theories in social psychology. New York: Guilford Press.

Clark, A. (1992). The presence of a symbol. Connection Science, 4, 193–205.

Clark, A., & Karmiloff-Smith, A. (1993). The cognizer's innards: A psychological and philosophical perspective on the development of thought. Mind and Language, 8(4), 487–519.

Cleeremans, A., & McClelland, J. (1991). Learning the structure of event sequences. Journal of Experimental Psychology: General, 120, 235–253.

Collins, A., & Loftus, J. (1975). Spreading activation theory of semantic processing. Psychological Review, 82, 407–428.

Cosmides, L., & Tooby, J. (1994). Beyond intuition and instinct blindness: Toward an evolutionarily rigorous cognitive science. Cognition, 50, 41–77.

Cotterill, R. (1997). On the mechanism of consciousness. Journal of Consciousness Studies, 4, 231–247.

Coward, L. A., & Sun, R. (2004). Criteria for an effective theory of consciousness and some preliminary attempts. Consciousness and Cognition, 13, 268–301.

Crick, F., & Koch, C. (1990). Toward a neurobiological theory of consciousness. Seminars in the Neurosciences, 2, 263–275.

Damasio, A. (1994). Descartes' error. New York: Grosset/Putnam.

Damasio, A., et al. (1990). Neural regionalization of knowledge access. Cold Spring Harbor Symposium on Quantitative Biology, LV.

Dennett, D. (1991). Consciousness explained. Boston: Little Brown.

Edelman, G. (1987). Neural Darwinism. New York: Basic Books.

Edelman, G. (1989). The remembered present: A biological theory of consciousness. New York: Basic Books.

Edelman, G., & Tononi, G. (2000). A universe of consciousness. New York: Basic Books.


Freeman, W. (1995). Societies of brains. Hillsdale, NJ: Erlbaum.

Fodor, J. (1983). The modularity of mind. Cambridge, MA: MIT Press.

Franklin, S. (2000). Deliberation and voluntary action in 'conscious' software agents. Neural Network World, 10, 505–521.

Franklin, S., Baars, B. J., Ramamurthy, U., & Ventura, M. (2005). The role of consciousness in memory. Brains, Minds and Media, 1, 1–38.

Franklin, S., & Graesser, A. C. (1997). Is it an agent, or just a program?: A taxonomy for autonomous agents. Intelligent Agents III, 21–35.

Franklin, S., Kelemen, A., & McCauley, L. (1998). IDA: A cognitive agent architecture. IEEE Conference on Systems, Man and Cybernetics.

Hadley, R. (1995). The explicit-implicit distinction. Minds and Machines, 5, 219–242.

Hirschfeld, L., & Gelman, S. (1994). Mapping the mind: Domain specificity in cognition and culture. New York: Cambridge University Press.

Hofstadter, D., & Mitchell, M. (1994). The copycat project: A model of mental fluidity and analogy-making. In K. J. Holyoak & J. A. Barnden (Eds.), Advances in connectionist and neural computation theory, Vol. 2: Logical connections. Norwood, NJ: Ablex.

Holland, O. (2003). Machine consciousness. Exeter, UK: Imprint Academic.

Hunt, E., & Lansman, M. (1986). Unified model of attention and problem solving. Psychological Review, 93(4), 446–461.

Jackendoff, R. (1987). Consciousness and the computational mind. Cambridge, MA: MIT Press.

Johnson-Laird, P. (1983). A computational analysis of consciousness. Cognition and Brain Theory, 6, 499–508.

Kanerva, P. (1988). Sparse distributed memory. Cambridge, MA: MIT Press.

Karmiloff-Smith, A. (1986). From meta-processes to conscious access: Evidence from children's metalinguistic and repair data. Cognition, 23, 95–147.

LeDoux, J. (1992). Brain mechanisms of emotion and emotional learning. Current Opinion in Neurobiology, 2(2), 191–197.

Logan, G. (1988). Toward a theory of automatization. Psychological Review, 95(4), 492–527.

Maes, P. (1989). How to do the right thing. Connection Science, 1, 291–323.

Marcel, A. (1983). Conscious and unconscious perception: An approach to the relations between phenomenal experience and perceptual processes. Cognitive Psychology, 15, 238–300.

Maslow, A. (1987). Motivation and personality (3rd ed.). New York: Harper and Row.

Mathis, D., & Mozer, M. (1996). Conscious and unconscious perception: A computational theory. Proceedings of the 18th Annual Conference of the Cognitive Science Society, 324–328.

McClelland, J., McNaughton, B., & O'Reilly, R. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419–457.

Michalski, R. (1983). A theory and methodology of inductive learning. Artificial Intelligence, 20, 111–161.

Minsky, M. (1985). The society of mind. New York: Simon and Schuster.

Moscovitch, M., & Umilta, C. (1991). Conscious and unconscious aspects of memory. In Perspectives on cognitive neuroscience. New York: Oxford University Press.

Neal, A., & Hesketh, B. (1997). Episodic knowledge and implicit learning. Psychonomic Bulletin and Review, 4(1), 24–37.

Negatu, A., & Franklin, S. (2002). An action selection mechanism for 'conscious' software agents. Cognitive Science Quarterly, 2, 363–386.

Nelson, T. (Ed.). (1993). Metacognition: Core readings. Boston, MA: Allyn and Bacon.

O'Brien, G., & Opie, J. (1998). A connectionist theory of phenomenal experience. Behavioral and Brain Sciences, 22, 127–148.

Penrose, R. (1994). Shadows of the mind. Oxford: Oxford University Press.

Quillian, M. R. (1968). Semantic memory. In M. Minsky (Ed.), Semantic information processing (pp. 227–270). Cambridge, MA: MIT Press.

Ramamurthy, U., D'Mello, S., & Franklin, S. (2004). Modified sparse distributed memory as transient episodic memory for cognitive software agents. In Proceedings of the International Conference on Systems, Man and Cybernetics. Piscataway, NJ: IEEE.

Reber, A. (1989). Implicit learning and tacit knowledge. Journal of Experimental Psychology: General, 118(3), 219–235.


Revonsuo, A. (1993). Cognitive models of consciousness. In M. Kamppinen (Ed.), Consciousness, cognitive schemata and relativism (pp. 27–130). Dordrecht, Netherlands: Kluwer.

Roediger, H. (1990). Implicit memory: Retention without remembering. American Psychologist, 45(9), 1043–1056.

Rosenbloom, P., Laird, J., & Newell, A. (1993). The SOAR papers: Research on integrated intelligence. Cambridge, MA: MIT Press.

Rosenthal, D. (Ed.). (1991). The nature of mind. Oxford: Oxford University Press.

Rumelhart, D., McClelland, J., & the PDP Research Group. (1986). Parallel distributed processing: Explorations in the microstructures of cognition. Cambridge, MA: MIT Press.

Schacter, D. (1990). Toward a cognitive neuropsychology of awareness: Implicit knowledge and anosognosia. Journal of Clinical and Experimental Neuropsychology, 12(1), 155–178.

Schneider, W., & Oliver, W. (1991). An instructable connectionist/control architecture. In K. VanLehn (Ed.), Architectures for intelligence. Hillsdale, NJ: Erlbaum.

Searle, J. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3, 417–457.

Seger, C. (1994). Implicit learning. Psychological Bulletin, 115(2), 163–196.

Servan-Schreiber, E., & Anderson, J. (1987). Learning artificial grammars with competitive chunking. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 592–608.

Shallice, T. (1972). Dual functions of consciousness. Psychological Review, 79(5), 383–393.

Sloman, A., & Chrisley, R. (2003). Virtual machines and consciousness. Journal of Consciousness Studies, 10, 133–172.

Smith, E., & Medin, D. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.

Smith, J. D., Shields, W. E., & Washburn, D. A. (2003). The comparative psychology of uncertainty monitoring and metacognition. Behavioral and Brain Sciences, 26, 317–339.

Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11(1), 1–74.

Stanley, W., Mathews, R., Buss, R., & Kotler-Cope, S. (1989). Insight without awareness: On the interaction of verbalization, instruction and practice in a simulated process control task. Quarterly Journal of Experimental Psychology, 41A(3), 553–577.

Sun, R. (1994). Integrating rules and connectionism for robust commonsense reasoning. New York: John Wiley and Sons.

Sun, R. (1995). Robust reasoning: Integrating rule-based and similarity-based reasoning. Artificial Intelligence, 75(2), 241–296.

Sun, R. (1997). Learning, action, and consciousness: A hybrid approach towards modeling consciousness. Neural Networks, 10(7), 1317–1331.

Sun, R. (1999a). Accounting for the computational basis of consciousness: A connectionist approach. Consciousness and Cognition, 8, 529–565.

Sun, R. (1999b). Computational models of consciousness: An evaluation. Journal of Intelligent Systems [Special Issue on Consciousness], 9(5–6), 507–562.

Sun, R. (2002). Duality of the mind. Mahwah, NJ: Erlbaum.

Sun, R. (2003). A tutorial on CLARION. Retrieved from http://www.cogsci.rpi.edu/~rsun/sun.tutorial.pdf.

Sun, R., & Bookman, L. (Eds.). (1994). Computational architectures integrating neural and symbolic processes. Norwell, MA: Kluwer.

Sun, R., Merrill, E., & Peterson, T. (2001). From implicit skills to explicit knowledge: A bottom-up model of skill learning. Cognitive Science, 25(2), 203–244.

Sun, R., & Naveh, I. (2004, June). Simulating organizational decision making with a cognitive architecture Clarion. Journal of Artificial Society and Social Simulation, 7(3). Retrieved from http://jasss.soc.surrey.ac.uk/7/3/5.html.

Sun, R., & Peterson, T. (1998). Autonomous learning of sequential tasks: Experiments and analyses. IEEE Transactions on Neural Networks, 9(6), 1217–1234.

Sun, R., & Peterson, T. (1999). Multi-agent reinforcement learning: Weighting and partitioning. Neural Networks, 12(4–5), 127–153.

Sun, R., Peterson, T., & Merrill, E. (1996). Bottom-up skill learning in reactive sequential decision tasks. Proceedings of the 18th Cognitive Science Society Conference. Hillsdale, NJ: Lawrence Erlbaum Associates.

Sun, R., Merrill, E., & Peterson, T. (1998). A bottom-up model of skill learning. Proceedings of the 20th Cognitive Science Society Conference (pp. 1037–1042). Mahwah, NJ: Lawrence Erlbaum Associates.

Sun, R., Slusarz, P., & Terry, C. (2005). The interaction of the explicit and the implicit in skill learning: A dual-process approach. Psychological Review, 112, 159–192.

Taylor, J. (1994). Goal, drives and consciousness. Neural Networks, 7(6/7), 1181–1190.

Taylor, J. (1997). The relational mind. In A. Browne (Ed.), Neural network perspectives on cognition and adaptive robotics. Bristol, UK: Institute of Physics.

Toates, F. (1986). Motivational systems. Cambridge: Cambridge University Press.

Tversky, A. (1977). Features of similarity. Psychological Review, 84(4), 327–352.

Watkins, C. (1989). Learning with delayed rewards. PhD thesis, Cambridge University, Cambridge, UK.

Willingham, D., Nissen, M., & Bullemer, P. (1989). On the development of procedural knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1047–1060.