

Model-Based Programming of Intelligent Embedded Systems and Robotic Space Explorers

BRIAN C. WILLIAMS, MEMBER, IEEE, MICHEL D. INGHAM, SEUNG H. CHUNG, AND PAUL H. ELLIOTT

Invited Paper

Programming complex embedded systems involves reasoning through intricate system interactions along lengthy paths between sensors, actuators, and control processors. This is a challenging, time-consuming, and error-prone process requiring significant interaction between engineers and software programmers. Furthermore, the resulting code generally lacks modularity and robustness in the presence of failure. Model-based programming addresses these limitations, allowing engineers to program reactive systems by specifying high-level control strategies and by assembling commonsense models of the system hardware and software. In executing a control strategy, model-based executives reason about the models "on the fly," to track system state, diagnose faults, and perform reconfigurations. This paper develops the Reactive Model-Based Programming Language (RMPL) and its executive, called Titan. RMPL provides the features of synchronous, reactive languages, with the added ability of reading and writing to state variables that are hidden within the physical plant being controlled. Titan executes an RMPL program using extensive component-based declarative models of the plant to track states, analyze anomalous situations, and generate novel control sequences. Within its reactive control loop, Titan employs propositional inference to deduce the system's current and desired states, and it employs model-based reactive planning to move the plant from the current to the desired state.

Keywords—Constraint programming, model-based autonomy, model-based execution, model-based programming, model-based reasoning, robotic execution, synchronous programming.

I. INTRODUCTION

Embedded systems, from automobiles to office-building control systems, are achieving unprecedented levels of robustness by dramatically increasing their use of computation. We envision a future with large networks of highly robust and increasingly autonomous embedded systems. These visions include intelligent highways that reduce congestion, cooperative networks of air vehicles for search and rescue, and fleets of intelligent space probes that autonomously explore the far reaches of the solar system.

Manuscript received December 20, 2001; revised August 31, 2002. This work was supported in part by the Defense Advanced Research Projects Agency Model-Based Integrated Embedded Software program under Contract F33615-00-C-1702 and in part by NASA's Cross Enterprise Technology Development program under Contract NAG2-1466.

The authors are with the Space Systems and Artificial Intelligence Laboratories, Massachusetts Institute of Technology, Cambridge, MA 02139 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/JPROC.2002.805828

Many of these systems will need to perform robustly within extremely harsh and uncertain environments, or may need to operate for years with minimal attention. To accomplish this, these embedded systems will need to radically reconfigure themselves in response to failures, and then accommodate these failures during their remaining operational lifetime. We support the rapid development of these systems by creating embedded programming languages that are able to reason about and control underlying hardware from engineering models. We call this approach model-based programming.

A. Robustness in Deep Space

In the past, high levels of robustness under extreme uncertainty were largely the realm of deep-space exploration. Billion-dollar space systems, like the Galileo Jupiter probe, have achieved robustness by employing sizable software development teams and by using many operations personnel to handle unforeseen circumstances as they arise. Efforts to make these missions highly capable at dramatically reduced costs have proven extremely challenging, producing notable losses, such as the Mars Polar Lander and Mars Climate Orbiter failures [1]. A contributor to these failures was the inability of the small software team to think through the large space of potential interactions between the embedded software and its underlying hardware.

0018-9219/03$17.00 © 2003 IEEE

212 PROCEEDINGS OF THE IEEE, VOL. 91, NO. 1, JANUARY 2003

For example, consider the leading hypothesis for the cause of the Mars Polar Lander failure. Mars Polar Lander used a set of Hall effect sensors in its legs to detect touchdown. These sensors were watched by a set of software monitors, which were designed to turn off the engine when triggered. As the lander descended into the Mars atmosphere, it deployed its legs. At this point it is most likely that the force of deployment produced a noise spike on the leg sensors, which was latched by the software monitors. The lander continued to descend, using a laser altimeter to detect distance to the surface. At an altitude of approximately 40 m, the lander began polling its leg monitors to determine touchdown. It would have immediately read the latched noise spike and shut down its engine prematurely, resulting in the spacecraft plummeting to the surface from 40 m [2].

Suppose for a moment that the lander had been piloted by a human. The pilot would have received one altimeter reading of 40 m elevation, and a split second later a reading of touchdown from the leg sensors. It seems unlikely that the pilot would immediately respond by shutting down the engine, given how these observations defy our common sense about physics. Rather, the pilot would consider that a sensor failure was most likely, gather additional information to determine the correct altitude, and choose a conservative approach in the presence of uncertainty.

Of course, such embedded systems cannot be piloted by humans; they must be preprogrammed. However, the space of potential failures and their interactions with the embedded software is far too large for programmers to successfully enumerate, and correctly encode, within a traditional programming language.

Our objective is to support future programmers with embedded languages that avoid commonsense mistakes, by automatically reasoning from hardware models. Our solution to this challenge has two parts. First, we are creating increasingly intelligent embedded systems that automatically diagnose and plan courses of action at reactive time scales, based on models of themselves and their environment [3]–[7]. This paradigm, called model-based autonomy, has been demonstrated in space on the NASA Deep Space One (DS-1) probe [8], and on several subsequent space systems [9], [10]. Second, we elevate the level at which an engineer programs through a language, called the Reactive Model-Based Programming Language (RMPL), which enables the programmer to tap into and guide the reasoning methods of model-based autonomy. This language allows the programmer to delegate, to the language's compiler and run-time execution kernel, tasks involving reasoning through system interactions, such as low-level commanding, monitoring, diagnosis, and repair. The model-based execution kernel for RMPL is called Titan.

B. Model-Based Programming

Engineers like to reason about embedded systems in terms of state evolutions. However, embedded programming languages, such as Esterel [11] and Statecharts [12], interact with a physical plant by reading sensors and setting control variables [Fig. 1 (left)]. It is then the programmer's responsibility to perform the mapping between intended state and the sensors and actuators. This mapping involves reasoning through a complex set of interactions under a range of possible failure situations. The complexity of the interactions and the large number of possible scenarios make this an error-prone process.

Fig. 1 Model of interaction with the physical plant for traditional embedded languages (left) and model-based programming (right).

A model-based programming language is similar to reactive embedded languages like Esterel, with the key difference that it interacts directly with the plant state [Fig. 1 (right)]. This is accomplished by allowing the programmer to read or write "hidden" state variables in the plant, that is, states that are not directly observable or controllable. It is then the responsibility of the language's execution kernel to map between hidden states and the plant sensors and control variables. This mapping is performed automatically by employing a deductive controller that reasons from a commonsense plant model.

A model-based program is composed of two components. The first is a control program, which uses standard programming constructs to codify specifications of desired system state evolution. In addition, to execute the control program, the execution kernel needs a model of the system it must control. Hence, the second component is a plant model, which captures the physical plant's nominal behavior and common failure modes. This model unifies constraints, concurrency, and Markov processes.

C. Model-Based Execution

A model-based program is executed by automatically generating a control sequence that moves the physical plant to the states specified by the program (see Fig. 2). We call these specified states configuration goals. Program execution is achieved using a model-based executive, such as Titan, which repeatedly generates the next configuration goal, and a sequence of control actions that achieve this goal, based on knowledge of the current plant state and plant model. The model-based executive continually estimates the most likely state of the plant from sensor information and the plant model. This information allows the executive to confirm the successful execution of commands and the achievement of configuration goals, and to diagnose failures.

A model-based executive continually tries to transition the plant toward a state that satisfies the configuration goals, while maximizing some reward metric. When the plant strays from the specified goals due to failures, the executive analyzes sensor data to identify the current state of the plant, and then moves the plant to a new state that, once again, achieves the desired goals. The executive is reactive in the sense that it responds immediately to changes in goals and to failures; that is, each control action is incrementally generated using the new observations and configuration goals provided in each state.



Fig. 2 Architecture for a model-based executive.

The Titan model-based executive consists of two components, a control sequencer and a deductive controller. The control sequencer is responsible for generating a sequence of configuration goals, using the control program and plant state estimates. Each configuration goal specifies an abstract state for the plant to achieve. The deductive controller is responsible for estimating the plant's most likely current state based on observations from the plant (mode estimation), and for issuing commands to move the plant through a sequence of states that achieve the configuration goals (mode reconfiguration).

D. Outline

In this paper we develop RMPL and its corresponding executive, Titan. Section II illustrates model-based programming in detail on a simple example. Section III specifies the formal semantics for a model-based program. Section IV introduces the constructs of RMPL required for specifying control behavior. The remaining sections develop Titan. Section V describes the control sequencer, which translates the RMPL control program into a sequence of state configuration goals, based on the plant's estimated state trajectory. Section VI describes the deductive controller, which uses a commonsense plant model to estimate the state of the system and generate control actions that achieve the state configuration goals provided by the sequencer. Section VII concludes the paper with a discussion of related work.

II. A MODEL-BASED PROGRAMMING EXAMPLE

Model-based programming enables a programmer to focus on specifying the desired state evolutions of the system. For example, consider the task of inserting a spacecraft into orbit around a planet. Our spacecraft includes a science camera and two identical redundant engines (Engines A and B), as shown in Fig. 3. An engineer thinks about this maneuver in terms of state trajectories:

Heat up both engines (called standby mode). Meanwhile, turn the camera off, in order to avoid plume contamination. When both are accomplished, thrust one of the two engines, using the other engine as backup in case of primary engine failure.

This specification is far simpler than a control program that must turn on heaters and valve drivers, open valves, and interpret sensor readings for the engines shown in the figure. Thinking in terms of more abstract hidden states makes the task of writing the control program much easier and avoids the error-prone process of reasoning through low-level system interactions. In addition, it gives the program's execution kernel the latitude to respond to novel failures as they arise. This is essential for achieving high levels of robustness.

Fig. 3 Simple spacecraft for the orbital insertion scenario. Initial state (left) and goal state (right) are depicted.

As an example, consider the model-based program corresponding to the previously described specification for spacecraft orbital insertion. The dual main engine system (see Fig. 3) consists of two propellant tanks, two main engines, and redundant valves. The system offers a range of configurations for establishing propellant paths to a main engine. When the propellants combine within the engine, they produce thrust. The flight computer controls the engine and camera by sending commands. Sensors include an accelerometer, to confirm engine operation, and a camera shutter position sensor, to confirm camera operation.



We start by specifying the two components of a model-based program for orbital insertion: the control program and plant model. We then describe the execution of the program under nominal and failure situations.

A. Control Program

The RMPL control program, shown in Fig. 4, encodes the informal specification we gave previously as a set of state trajectories. The specific RMPL constructs used in the program are introduced in Section IV. Recall that to perform orbital insertion, one of the two engines must be fired. We start by concurrently placing the two engines in the standby state and by shutting off the camera. This is performed by lines 3–5, where commas at the end of each line denote parallel composition. We then fire an engine, choosing to use Engine A as the primary engine (lines 6–9) and Engine B as a backup, in the event that Engine A fails to fire correctly (lines 10–11). Engine A starts trying to fire as soon as it achieves standby and the camera is off (line 7), but aborts if at any time Engine A is found to be in a failure state (line 9). Engine B starts trying to fire only if Engine A has failed, B is in standby, and the camera is off (line 10).
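Fig. 4 itself does not survive in this text-only version, so the following RMPL-style sketch reconstructs a program consistent with the description above. It uses the parallel composition, "do A watching c," and "when c donext A" combinators introduced in Section IV; the layout and spelling of the constructs are our assumption, not the exact figure, and the line numbers cited in the text refer to the original figure:

```
OrbitInsert() ::
  do {
    EngineA = Standby,
    EngineB = Standby,
    Camera = Off,
    do {
      when (EngineA = Standby) AND (Camera = Off)
        donext (EngineA = Firing)
    } watching (EngineA = Failed),
    when (EngineA = Failed) AND (EngineB = Standby) AND (Camera = Off)
      donext (EngineB = Firing)
  } watching (EngineA = Firing) OR (EngineB = Firing)
```

The outer "do ... watching" terminates the whole activity once either engine achieves firing, while the inner one aborts the Engine A thread if Engine A is ever estimated to have failed.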

Several features of this control program reinforce our earlier points. First, the program is stated in terms of state assignments to the engines and camera, such as "EngineB = Firing." Second, these state assignments appear both as assertions and as execution conditions. For example, in lines 6–9, "EngineA = Firing" appears in an assertion (line 8), while "EngineA = Standby," "Camera = Off," and "EngineA = Failed" appear in execution conditions (lines 7 and 9). Third, none of these state assignments are directly observable or controllable; only shutter position and acceleration may be directly sensed, and only the flight computer command may be directly set. Finally, by referring to hidden states directly, the RMPL program is far simpler than a corresponding program that operates on sensed and controlled variables. The added complexity of the latter program is due to the need to fuse sensor information and generate command sequences under a large space of possible operation and fault scenarios.

B. Plant Model

The plant model is used by a model-based executive to map queried and asserted states in the control program to sensed variables and control sequences, respectively, in the physical plant. The plant model is built from a set of component models. Each component is represented by a set of component modes, a set of constraints defining the behavior within each mode, and a set of probabilistic transitions between modes. The component automata operate concurrently and synchronously. In Section III, we describe the semantics of plant models in terms of partially observable Markov decision processes.

For the orbital insertion example, we can model the spacecraft abstractly as a three-component system (two engines and a camera) by supplying the models depicted graphically in Fig. 5. Nominally, an engine can be in one of three modes: off, standby, or firing. The behavior within each of these modes is described by a set of constraints on plant variables, namely, thrust and power_in. In Fig. 5, these constraints are specified in boxes next to their respective modes. The engine also has a failed mode, capturing any off-nominal behavior. We entertain the possibility that the engine may fail in a way never seen before, by specifying no constraints for the engine's behavior in the failed mode. This approach, called constraint suspension [13], is common to most approaches to model-based diagnosis.

Fig. 4 RMPL control program for the orbital insertion scenario.

Models include commanded and uncommanded transitions, both of which are probabilistic. For example, the engine has uncommanded transitions from off, standby, and firing to failed. These transitions have a 1% probability, and are shown in the figure as arcs labeled 0.01. Transitions between nominal modes are triggered on commands, and occur with probability 99%.

C. Executing the Model-Based Program

When the orbital insertion control program is executed, the control sequencer starts by generating a configuration goal consisting of the conjunction of three state variable assignments: "EngineA = Standby," "EngineB = Standby," and "Camera = Off" (lines 3–5, Fig. 4). To determine how to achieve this goal, the deductive controller considers the latest estimate of the state of the plant. For example, suppose the deductive controller determines from its sensor measurements and previous commands that the two engines are already in standby, but the camera is on. The deductive controller deduces from the model that it should send a command to the plant to turn the camera off. After executing this command, it uses its shutter position sensor to confirm that the camera is off. With "Camera = Off" and "EngineA = Standby," the control sequencer advances to the configuration goal of "EngineA = Firing" (line 8, Fig. 4). The deductive controller identifies an appropriate setting of valve states that achieves this behavior, then it sends out the appropriate commands.

In the process of achieving goal "EngineA = Firing," assume that a failure occurs: an inlet valve to Engine A suddenly sticks closed. Given various sensor measurements (e.g., flow and pressure measurements throughout the propulsion subsystem), the deductive controller identifies the stuck valve as the most likely source of failure. It then tries to execute an alternative control sequence for achieving the configuration goal, for example, by repairing the valve. Presume that the valve is not repairable; Titan diagnoses that "EngineA = Failed." The control program specifies a configuration goal of "EngineB = Firing" as a backup (lines 10–11, Fig. 4), which is issued by the control sequencer to the deductive controller.

Fig. 5 State transition models for a simplified spacecraft. The probabilities on nominal transitions are omitted for clarity.

III. MODEL-BASED PROGRAM EXECUTION SEMANTICS

Next, we define the execution of a model-based program in terms of legal state evolutions of a physical plant, and define the functions of the control sequencer and deductive controller (see Fig. 2).

A. Plant Model

A plant is modeled as a partially observable Markov decision process (POMDP) S = ⟨Π, Σ, T, P_T, P_Θ, R, P_O⟩.

Π is a set of variables, each ranging over a finite domain. Π is partitioned into state variables Π_s, control variables Π_c, and observable variables Π_o. A full assignment σ is defined as a set consisting of an assignment to each variable in Π. Σ is the set of all feasible full assignments over Π. A state s is defined as an assignment to each variable in Π_s. The set Σ_s, the projection of Σ on variables in Π_s, is the set of all feasible states. An observation of the plant, o, is defined as an assignment to each variable in Π_o. A control action, μ, is defined as an assignment to each variable in Π_c.

T is a finite set of transitions. Each transition τ ∈ T is a function τ : Σ → Σ_s; that is, τ(σ) is the state obtained by applying transition function τ to any feasible full assignment σ ∈ Σ. The transition function τ_n models the system's nominal behavior, while all other transitions model failures.

P_T(τ) associates with each transition function τ ∈ T a probability. P_Θ(s) is the probability that the plant has initial state s. The reward for being in state s is R(s), and the probability of observing o in state s is P_O(o, s).

A plant trajectory is a (finite or infinite) sequence of feasible states [s_0, s_1, …], such that for each s_i there is a feasible full assignment σ_i that agrees with s_i on assignments to variables in Π_s, and s_{i+1} = τ(σ_i), for some τ ∈ T. A trajectory that involves only the nominal transition τ_n is called a nominal trajectory. A simple trajectory does not repeat any state.

The space of possible state trajectories for a plant can be visualized using a Trellis diagram, which enumerates all possible states at each time step and all transitions between states at adjacent times [see Fig. 6(a)].
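The trajectory definitions above can be made concrete with a short Python sketch (function and variable names are ours, not the paper's) that enumerates the simple trajectories a Trellis diagram depicts:

```python
def simple_trajectories(start, successors, horizon):
    """Enumerate the simple trajectories (no repeated state) of length
    at most `horizon` steps, starting from `start`.

    successors: dict mapping each state to the set of states reachable
    by one transition (nominal or failure) -- one Trellis column to the next.
    """
    result = [[start]]
    if horizon == 0:
        return result
    for nxt in successors.get(start, ()):
        if nxt == start:
            continue  # a simple trajectory never repeats a state
        for tail in simple_trajectories(nxt, successors, horizon - 1):
            if start not in tail:  # reject tails that loop back through start
                result.append([start] + tail)
    return result
```

Every prefix of a simple trajectory is itself returned, which mirrors the role prefixes play in the execution semantics below.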

B. Model-Based Program Execution

A model-based program consists of a plant model, described previously, and a control program, described as a deterministic automaton CP = ⟨L, l_0, T_c, G⟩. L is the set of program locations, where l_0 ∈ L is the program's initial location. A program location represents the "state" of the program's execution at any given time. Transitions between locations are conditioned on plant states Σ_s of S; that is, T_c is a function T_c : L × Σ_s → L. Each location l ∈ L has a corresponding configuration goal G(l) ⊆ Σ_s, which is the set of legal plant goal states associated with location l.

A legal execution of a model-based program is a trajectory of feasible plant state estimates [ŝ_0, ŝ_1, …] of S, and locations [l_0, l_1, …] of CP such that: 1) ŝ_0 is a valid initial plant state, that is, P_Θ(ŝ_0) > 0; 2) ŝ_{i+1} is consistent with the observations and the plant model, that is, for each i, there is a transition τ of T that agrees with ŝ_i and ŝ_{i+1} on the corresponding subsets of variables; 3) l_0 is the program's initial location; 4) (l_i, l_{i+1}) represents a legal control program transition, that is, l_{i+1} = T_c(l_i, ŝ_i); and 5) if plant state ŝ_{i+1} is the result of a nominal plant transition from ŝ_i, that is, ŝ_{i+1} = τ_n(ŝ_i), then either ŝ_i is the maximum-reward state in G(l_i), or [ŝ_i, ŝ_{i+1}] is the prefix of a simple nominal plant trajectory that ends in the maximum-reward state in G(l_i).



Fig. 6 (a) A Trellis diagram depicting the system's possible state trajectories. (b) Mode estimation tracks a set of trajectories through the diagram, and selects the most likely state as its estimate ŝ. (c) Mode reconfiguration chooses a path through the diagram along nominal transitions, terminating at the goal states.

C. Model-Based Executive

A model-based program is executed by a model-based executive. We define a model-based executive as a high-level control sequencer, coupled to a low-level deductive controller.

A control sequencer takes, as input, a control program CP, and a sequence [ŝ_0, ŝ_1, …] of plant state estimates. It generates a sequence [g_0, g_1, …] of configuration goals.

A deductive controller takes, as input, the plant model S, a sequence of configuration goals [g_0, g_1, …], and a sequence of observations [o_0, o_1, …]. It generates a sequence of most likely plant state estimates [ŝ_0, ŝ_1, …] and a sequence of control actions [μ_0, μ_1, …].

The sequence of state estimates is generated by a process called mode estimation (ME). ME is an online algorithm for tracking the most likely states that are consistent with the plant model, the sequence of observations, and the control actions. ME is framed as an instance of hidden Markov model belief-state update, which computes the probability σ_{t+1}(s) associated with being in each state s at time t+1, according to the following equations:

  σ•_{t+1}(s') = Σ_{s ∈ Σ_s} σ_t(s) · P_T(s' | s)

  σ_{t+1}(s') = σ•_{t+1}(s') · P_O(o_{t+1} | s') / Σ_{s'' ∈ Σ_s} σ•_{t+1}(s'') · P_O(o_{t+1} | s'')

where P_T(s' | s) is defined as the probability that s transitions to state s', computed as the sum of P_T(τ) over all transition functions τ ∈ T that map s to state s'. The prior probability σ•_{t+1}(s') is conditioned on all observations up to o_t, while the posterior probability σ_{t+1}(s') is also conditioned on the latest observation o_{t+1}.

Belief state update associates a probability with each state in the Trellis diagram. For mode estimation, the tracked state with the highest belief-state probability is selected as the most likely state ŝ_{t+1}, as shown in Fig. 6(b).
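A minimal Python sketch of one step of this belief-state update (all names are ours; the dictionaries stand in for the transition and observation probabilities):

```python
def belief_update(belief, trans_prob, obs_prob, obs):
    """One step of hidden Markov model belief-state update.

    belief:     dict state -> current belief sigma_t(s)
    trans_prob: dict (s, s2) -> probability that s transitions to s2,
                summed over all transition functions mapping s to s2
    obs_prob:   dict (obs, s2) -> probability of observing obs in s2
    obs:        the latest observation o_{t+1}
    """
    states = {s2 for (_, s2) in trans_prob}
    # Prior: conditioned on observations up to o_t only.
    prior = {s2: sum(belief.get(s, 0.0) * trans_prob.get((s, s2), 0.0)
                     for s in belief)
             for s2 in states}
    # Posterior: fold in the latest observation, then renormalize.
    unnorm = {s2: prior[s2] * obs_prob.get((obs, s2), 0.0) for s2 in states}
    z = sum(unnorm.values())
    return {s2: p / z for s2, p in unnorm.items()} if z > 0 else prior
```

Each call folds in one transition step and one observation; selecting ŝ_{t+1} then amounts to taking the argmax of the returned dictionary.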

The sequence of control actions is generated by a process called mode reconfiguration (MR). MR takes as input a configuration goal g_t and the most likely current state ŝ_t from ME, and it returns a series of commands that progress the plant toward a maximum-reward goal state that achieves g_t. MR can be thought of as picking a simple path through the Trellis diagram along nominal transitions that leads to the goal state, as shown in Fig. 6(c).

MR computes a goal state s_g associated with configuration goal g_t, and a control action μ_t, such that: 1) s_g is a state that entails the configuration goal g_t, while maximizing reward R(s_g); 2) s_g is reachable from ŝ_t through a sequence of nominal transitions of S; and 3) μ_t is a control action that transitions the plant state from ŝ_t to s', where either s' = s_g or [ŝ_t, s'] is the prefix of a simple nominal state trajectory that leads to state s_g.
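Under the simplifying assumption that the nominal, commanded transitions are given as an explicit graph, conditions 1)–3) can be sketched as a breadth-first search (illustrative names, not Titan's implementation):

```python
from collections import deque

def reconfigure(current, entails_goal, reward, nominal_next):
    """Pick the maximum-reward reachable state entailing the configuration
    goal, and the first control action of a simple nominal path to it.

    entails_goal: predicate on states (does s entail the configuration goal?)
    reward:       function state -> reward R(s)
    nominal_next: dict state -> list of (control_action, successor) pairs,
                  the nominal commanded transitions only
    """
    first_action = {current: None}   # first action on the path to each state
    frontier = deque([current])
    while frontier:                  # BFS visits each state once: simple paths
        s = frontier.popleft()
        for action, s2 in nominal_next.get(s, []):
            if s2 not in first_action:
                first_action[s2] = first_action[s] if first_action[s] else action
                frontier.append(s2)
    goals = [s for s in first_action if entails_goal(s)]
    if not goals:
        return None, None            # goal unreachable via nominal transitions
    best = max(goals, key=reward)    # condition 1: maximize reward
    return best, first_action[best]  # action is None if already at the goal
```

Only the first action of the path is ever issued; re-invoking the search after the next state estimate reproduces MR's incremental, reactive behavior.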

Just as concurrent constraint programming offers a family of languages, each characterized by a choice of constraint system, model-based programming defines a family of languages, each characterized by the choice of the underlying plant modeling formalism. In the following three sections of the paper, we define one particularly powerful instance of a model-based programming language and its corresponding executive. The practical importance of this instance has been demonstrated through deployments on a wide range of applications from the automotive and aerospace domains [14]. In the next section, we define RMPL. In the subsequent two sections, we present in more detail the computational models and algorithms used by Titan's control sequencer and deductive controller implementations.

IV. THE REACTIVE MODEL-BASED PROGRAMMING LANGUAGE

In this section we introduce RMPL, by first introducing its underlying constraint system, and then developing constructs for writing control programs.

A. Constraint System: Propositional State Logic

A constraint system is a set of tokens, closed under conjunction, together with an entailment relation. The entailment relation satisfies the standard rules for conjunction (identity, elimination, cut, and introduction), as defined in [15]. RMPL currently supports propositional state logic as its constraint system. In propositional state logic, a proposition is, in general, an assignment (x = v), where variable x ranges over a finite domain. A proposition can have a truth assignment of true or false. Propositions are composed into formulas using the standard logical connectives: and (∧), or (∨), and not (¬). The constants True and False are also valid constraints. A constraint is entailed if it is implied by the conjunction of the plant model and the most likely current state of the physical plant; otherwise, it is not entailed. We denote entailment by simply stating the constraint. Nonentailment is denoted by using an overbar. Note that nonentailment of a constraint c (denoted c̄) is not equivalent to entailment of the negation of c (¬c); the current knowledge of plant state may not imply c to be true or false.
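The distinction between nonentailment of c and entailment of ¬c can be made concrete with a toy consistency-based check. The enumeration-based encoding below is purely illustrative; Titan uses efficient propositional inference rather than enumerating states.

```python
def entails(consistent_states, c):
    """A constraint is entailed iff it holds in every full assignment
    consistent with the plant model and current knowledge."""
    return all(c(s) for s in consistent_states)

# Knowledge leaves the valve position unknown: both assignments remain
# consistent with the model and the current state estimate.
states = [{"valve": "open"}, {"valve": "closed"}]
c = lambda s: s["valve"] == "open"       # the constraint Valve = Open
neg_c = lambda s: s["valve"] != "open"   # the negation of c

# Neither c nor its negation is entailed: nonentailment of c does not
# imply entailment of the negation of c.
assert not entails(states, c)
assert not entails(states, neg_c)
```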

In specifying an RMPL control program, the objective is to specify the desired behavior of the plant, by stating constraints that, when made true, will cause the plant to follow a desired state trajectory. State assertions are specified as constraints on plant state variables that should be made true. RMPL's model of interaction is in contrast to Esterel [11] and the Timed Concurrent Constraint (TCC) programming language [16], which both interact with the program memory, sensors, and control variables, but not with the plant state. Esterel interacts by emitting and detecting signals, while TCC interacts by telling and asking constraints on program variables. In contrast, RMPL control programs ask constraints on plant state variables, and request that specified constraints on state variables be achieved (as opposed to tell, which asserts that a constraint is true). State assertions in RMPL control programs are treated as achieve operations, while state condition tests are ask operations.1

B. RMPL Control Programs

To motivate RMPL's constructs, we consider a more complex example, taken from the NASA DS-1 probe. DS-1 uses an ion propulsion engine, which thrusts almost continuously throughout the mission. Once a week the engine is turned off, in order for DS-1 to perform a course correction, called optical navigation. The RMPL control program for optical navigation (OpNav) is shown in Fig. 7. OpNav works by taking pictures of three asteroids and by using the difference between actual and projected locations to determine the course error. OpNav first shuts down the ion engine and prepares its camera concurrently. It then uses attitude control thrusters to turn toward each of three asteroids, uses the camera to take a picture of each, and stores each picture in memory. The three images are then read and processed, and a course correction is computed. One of the more subtle failures that OpNav may experience is an undetected fault in a navigation image. If the camera generates a faulty image, it gets stored in memory. After all three images are stored, the images are processed, and only then is the problem detected.

1) Desiderata: OpNav highlights five design features for RMPL. First, the program exploits full concurrency, by intermingling sequential and parallel threads of execution. For example, the camera is turned on and the engine is set to standby in parallel (lines 3–4), while pictures are taken serially (lines 7–9). Second, it involves conditional execution, such as switching to standby if the engine is firing (line 4).

1Note that, although RMPL provides the flexibility of asserting any form of constraint on plant variables, the current implementation of the Titan model-based executive handles only assertions of constraints that are conjunctions of state variable assignments.

218 PROCEEDINGS OF THE IEEE, VOL. 91, NO. 1, JANUARY 2003


Fig. 7 RMPL control program for the DS-1 optical navigation procedure. In RMPL programs, comma delimits parallel processes and semicolon delimits sequential processes.

Third, it involves iteration; for example, "when (Engine = Off ∨ Engine = Standby) ∧ Camera = On donext …" (line 6) says to iteratively test until the engine is either off or in standby and the camera is on, and then to proceed. Fourth, the program involves preemption; for example, "do … watching SnapStoreStatus = Failed" (lines 5–15) says to perform a task, but to interrupt it as soon as the watched condition (SnapStoreStatus = Failed) is entailed. Procedures used by OpNav, such as TakePicture, exploit similar features. These four features are common to most synchronous reactive programming languages. As highlighted in the preceding sections, the fifth and defining feature of RMPL is the ability to reference hidden states of the physical plant within assertions and guards, such as "if Engine = Firing thennext Engine = Standby" (line 4), where constraints on the hidden state of the engine are used in both a condition check and an assertion.

RMPL is an object-oriented, constraint-based language in a style similar to Java. We develop RMPL by first introducing a small set of primitives for constructing model-based control programs. These primitives are fully orthogonal; that is, they may be nested and combined arbitrarily. The RMPL primitives are closely related to the TCC programming language [16]. To make the language usable, we define on top of these primitives a variety of derived combinators, such as those used in the orbital insertion and optical navigation control programs (see Figs. 4 and 7).

In the following discussion, we use lowercase letters, like c, to denote constraints, and uppercase letters, like A and B, to denote well-formed RMPL expressions. An RMPL expression is specified by the following grammar in Backus–Naur form:

expression ::= assertion | combinator | prgm-invocation

combinator ::= A maintaining c | do A watching c | if c thennext A | unless c thennext A | A, B | A; B | always A

prgm-invocation ::= programname(arglist)

where an assertion is an achieve constraint, made up of conjunctions of propositions. Note that we allow procedure calls specified as RMPL program invocations, in which programname corresponds to another specified control program. The arglist used in a program invocation corresponds to a (possibly empty) list of parameters defined for the program. Within the invoked procedure, each parameter is replaced by the appropriate argument in arglist. For example, the OpNav control program calls the procedure TakePicture three times, each time with a different argument (Asteroid1, Asteroid2, and Asteroid3). With each invocation of TakePicture, the argument replaces the parameter "target," resulting in a specific Attitude variable assertion (Asteroid1, Asteroid2, and Asteroid3 are all members of the domain of the Attitude state variable).

2) Primitive Combinators: RMPL provides standard primitive constructs for conditional branching, preemption, iteration, and concurrent and sequential composition. In general, RMPL constructs are conditioned on the current state of the physical plant, and they act on the plant state in the next time instant.

c: This expression asserts that the plant should progress toward a state that entails constraint c. c is an achieve constraint on variables of the physical plant. This is the basic construct for affecting the plant's hidden state.

A maintaining c: This expression executes expression A, while ensuring that constraint c is maintained true throughout. c is an ask constraint on variables of the physical plant. If c is not entailed at any instant, then the execution thread terminates immediately. This is the basic construct for preemption by nonentailment.

do A watching c: This expression executes expression A, but if constraint c becomes entailed by the most likely plant state, at any instant, it terminates execution of A in that instant. c is an ask constraint on variables of the physical plant. This is the basic construct for preemption by entailment.



if c thennext A: This expression starts executing RMPL expression A in the next instant, if the most likely current plant state entails c. c is an ask constraint on the variables of the physical plant. This is the basic construct for conditionally branching upon entailment of the plant's hidden state.

unless c thennext A: This expression starts executing RMPL expression A in the next instant if the current theory does not entail c. c is an ask constraint on the variables of the physical plant. This is the basic construct for conditionally branching upon nonentailment of the plant's hidden state.

A, B: This expression concurrently executes RMPL expressions A and B, starting in the current instant. It is the basic construct for forking processes.

A; B: This is the sequential composition of RMPL expressions A and B. It performs A until A is finished, then it starts B.

always A: This expression starts expression A at each instant of time, for all time. This is the only iteration primitive needed, since finite iteration can be achieved by using a preemption construct to terminate an always.
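The discrete-time behavior of these primitives can be sketched as a toy interpreter: an expression, given the set of constraints entailed by the current state estimate, returns the goal constraints to assert in this instant together with the expression to run at the next instant (None when the thread terminates). This encoding is purely illustrative; it is not Titan's compilation to HCA, and it treats constraints as opaque tokens.

```python
def achieve(c):
    """The primitive c: assert c as a goal until it is entailed."""
    def step(entailed):
        if c in entailed:
            return set(), None          # goal reached: thread terminates
        return {c}, achieve(c)          # keep asserting c as a goal
    return step

def if_thennext(c, A):
    def step(entailed):
        return set(), (A if c in entailed else None)
    return step

def unless_thennext(c, A):
    def step(entailed):
        return set(), (None if c in entailed else A)
    return step

def parallel(A, B):                     # "A, B"
    def step(entailed):
        ga, na = A(entailed)
        gb, nb = B(entailed)
        if na is None:
            nxt = nb
        elif nb is None:
            nxt = na
        else:
            nxt = parallel(na, nb)
        return ga | gb, nxt
    return step

def seq(A, B):                          # "A; B"
    def step(entailed):
        ga, na = A(entailed)
        return ga, (B if na is None else seq(na, B))
    return step

def maintaining(A, c):                  # preemption by nonentailment
    def step(entailed):
        if c not in entailed:
            return set(), None
        ga, na = A(entailed)
        return ga, (None if na is None else maintaining(na, c))
    return step

def do_watching(A, c):                  # preemption by entailment
    def step(entailed):
        if c in entailed:
            return set(), None
        ga, na = A(entailed)
        return ga, (None if na is None else do_watching(na, c))
    return step

def always(A):                          # restart A at every instant
    def step(entailed):
        ga, na = A(entailed)
        return ga, (always(A) if na is None else parallel(na, always(A)))
    return step
```

For instance, stepping seq(achieve("a"), achieve("b")) asserts goal "a" until "a" becomes entailed, then asserts "b" in subsequent instants.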

The previously described primitive combinators cover the five desired design features. We can use them to implement a rich set of derived combinators, some of which are used in the orbital insertion and optical navigation examples. For example, the derived construct when c donext A is a temporally extended version of if c thennext A. It waits until constraint c is entailed by the most likely plant state, then starts executing A in the next instant. The set of derived combinators is included as an Appendix to this paper.

The primitive and derived RMPL constructs are used to encode control programs. This subset is sufficient to implement most of the control constructs of the Esterel language [11]. A mapping between key Esterel constructs and analogous expressions in RMPL was presented in [5]. Note that RMPL can also be used to encode the probabilistic transition models capturing the behavior of the plant components. The additional constructs required to encode such models are defined in [17]. The plant model is further defined in Section VI.

V. CONTROL SEQUENCER

The RMPL control program is executed by the control sequencer. Executing a control program involves compiling it to a variant of hierarchical automata, called hierarchical constraint automata (HCA), and then executing the automata in coordination with the deductive controller. In this section we define HCA as a specific instance of the deterministic control program automaton presented in Section III. In addition, we discuss the compilation from RMPL to HCA, and we present the execution algorithm used by Titan's control sequencer.

A. Hierarchical Constraint Automata

To efficiently execute RMPL programs, we translate each of the primitive combinators, introduced in the previous section, into an HCA. In the following, we call the "states" of an HCA locations, to avoid confusion with the physical plant state. The overall state of the program at any instant of time corresponds to a set of "marked" HCA locations. An HCA has five key attributes. First, it composes sets of concurrently operating automata. Second, each location is labeled with a constraint, called a goal constraint, which the physical plant must immediately begin moving toward whenever the automaton marks that location. Third, each location is also labeled with a constraint, called a maintenance constraint, which must hold for that location to remain marked. Fourth, automata are arranged in a hierarchy: a location of an automaton may itself be an automaton, which is invoked when marked by its parent. This enables the initiation and termination of more complex concurrent and sequential behaviors. Finally, each transition may have multiple target locations, allowing an automaton to have several locations marked simultaneously. This enables a compact representation for iterative behaviors, like RMPL's always construct.

Hierarchical encodings form the basis for embedded reactive languages like Esterel [11] and State Charts [12]. A distinctive feature of an HCA is its use of constraints on plant state, in the form of goal and maintenance constraints. We elaborate on this point once we have introduced HCA formally.

A hierarchical constraint automaton is a tuple with the following elements:

1) a set of locations, partitioned into primitive locations and composite locations. Each composite location corresponds to another hierarchical constraint automaton. We define the set of subautomata of an HCA recursively, as its own locations together with the subautomata of each of its composite locations.

2) the set of start locations (also called the initial marking).

3) the set of plant state variables, with each variable ranging over a finite domain. Constraints over these variables are finite-domain constraints; among them, an achieve constraint is a conjunction of state variable assignments.

4) a goal function, which associates with each primitive location a finite-domain constraint that the plant progresses toward whenever that location is marked. This constraint is called the goal constraint of the location. Goal constraints may be thought of as abstract set points, representing a set of states that the plant must evolve toward when the location is marked.

5) a maintenance function, which associates with each location a finite-domain constraint that must hold at the current instant for that location to be marked. This constraint is called the maintenance constraint of the location. Maintenance constraints may be viewed as representing monitored constraints that must be maintained in order for execution to progress toward achieving any goal constraints specified within the location.

6) a transition function for each location. Each transition function



Fig. 8 HCA representation of the RMPL always construct.

specifies a set of locations to be marked at the next time step, given appropriate assignments to the plant state variables at the current time step.

At any instant, the state of an HCA is the set of marked locations, called a marking. The set of possible markings is the power set of the locations.

As an example of an HCA, consider the RMPL combinator always A, which maps to the graphical representation in Fig. 8. This automaton results in the start locations of A being marked at each time instant. The locations of the automaton consist of a primitive location, drawn on the left as a circle, and the composite location A, drawn on the right as a rectangle. Both locations are start locations, as indicated by the two short arrows.

In the graphical representation of an HCA, primitive locations are represented as circles, while composite locations are represented as rectangles. Constraints are indicated by lowercase letters. Goal and maintenance constraints are written within the corresponding locations, with maintenance constraints preceded by the keyword "maintain." Maintenance constraints can be of the form c or c̄, for some constraint c: for convenience, in our diagrams we use c to denote the requirement that c is entailed, and c̄ to denote the requirement that c is not entailed. Maintenance constraints associated with composite locations are assumed to apply to all subautomata within the composite location. When either a goal or a maintenance constraint is not specified, it is taken to be implicitly True. In the previous example, the primitive location implicitly has goal and maintenance constraints True; other constraints may be specified within A.

Transitions are conditioned on constraints that must be satisfied by the conjunction of the plant model and the most likely estimated state of the plant. For each location, we represent the transition function as a set of transition pairs, each consisting of a target location and a label (also known as a guard condition) of the form c (denoting entailment) or c̄ (denoting nonentailment), for some constraint c. This corresponds to the traditional representation of transitions as labeled arcs in a graph, where the pair's locations are the source and target of an arc carrying the guard as its label. Again, if no label is indicated, it is implicitly True. The HCA representation of always A in Fig. 8 has two transitions, one from the primitive location to itself and one from the primitive location to A. Both these transitions' labels are implicitly True.

Our HCA encoding has four properties that distinguish it from the hierarchical automata employed by other reactive embedded languages [11], [12], [18]. First, multiple transitions may be simultaneously traversed. This permits a compact encoding of the state of the automaton as a set of markings. Second, transitions are conditioned on what can be deduced from the estimated plant state, not just what is explicitly observed or assigned. This provides a simple, but general, mechanism for reasoning about the plant's hidden state.

Third, transitions can be enabled based on lack of information, that is, nonentailment of a constraint. This allows default executions to be pursued in the absence of better information, enabling advanced preemption constructs. Finally, locations assert goal constraints on the plant state. This allows the hidden state of the plant to be controlled directly.

It should be noted that two extensions to HCA have been presented elsewhere. In [17], we consider the problem of state estimation for probabilistic variants of HCA. In [7], we consider the execution of model-based programs encoded as timed variants of HCA that allow for the representation of nondeterministic choice.

B. Executing HCA

Informally, execution of an HCA proceeds as follows. The control sequencer begins with a marked subset of the HCA's locations and an estimate of the current plant state from mode estimation. It then creates a set consisting of each marked location whose maintenance constraint is satisfied by the current estimated state. Next, it conjoins all goal constraints of this set to produce a configuration goal. A configuration goal represents a set of states, such that the plant must progress toward one of these states, called the goal state, which has the greatest reward.2 This configuration goal is then passed to mode reconfiguration, which executes a single command that makes progress toward achieving the goal. Next, the sequencer receives an update of the plant state from mode estimation. Based on this new state information, the HCA advances to a new marking, by taking all enabled transitions from marked primitive locations whose goal constraints are achieved or whose maintenance constraints have been violated, and from marked composite locations that no longer contain any marked subautomata. Finally, the cycle repeats.

More precisely, to execute an HCA, the control sequencer starts with an estimate of the current state of the plant. It initializes the marking using a function that marks the start locations of the automaton and all their starting subautomata. It then repeatedly steps the automaton using the function Step, which maps the current state estimate and marking to a next marking and configuration goal. These functions are defined later. Execution "completes" when no marks remain, since the empty marking is a fixed point.
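One cycle of this execution can be sketched for a flat (non-hierarchical) automaton as follows. This is a toy encoding with constraints as opaque tokens; the full Step algorithm of Fig. 9 additionally handles composite locations and recursive marking.

```python
def step(marking, goals, maintains, transitions, entailed):
    """One control-sequencer cycle for a flat HCA.
    marking: set of locations; goals/maintains: {loc: constraint or None};
    transitions: {loc: [(guard or None, set_of_target_locs)]};
    entailed: constraints entailed by the current state estimate.
    Returns (next_marking, configuration_goal)."""
    # Keep only locations whose maintenance constraint still holds.
    live = {l for l in marking
            if maintains.get(l) is None or maintains[l] in entailed}
    # Conjoin goal constraints into the configuration goal.
    config_goal = {goals[l] for l in live if goals.get(l)}
    nxt = set()
    for l in live:
        g = goals.get(l)
        if g is not None and g not in entailed:
            nxt.add(l)                 # goal not yet achieved: stay marked
            continue
        for guard, targets in transitions.get(l, []):
            if guard is None or guard in entailed:
                nxt |= targets         # multiple targets may be marked
    return nxt, config_goal
```

Stepping the two-location encoding of always (one location transitioning both to itself and to a goal-bearing location) keeps both locations marked from instant to instant, mirroring the example below.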

Given an HCA to be initialized, the initialization function creates a full marking by recursively marking the start locations of the automaton, together with the start locations of all their starting subautomata.

For example, initializing the automaton for always A marks its two start locations, together with any start locations contained within A.

The Step function transitions an automaton from the current full marking to the next full marking, based on the current

2We define "progress" as taking an action that is part of a sequence of actions that leads to the goal state.



Fig. 9 Step algorithm.

state estimate, and it generates a new configuration goal. The Step algorithm is given in Fig. 9.

A trajectory of a hierarchical constraint automaton, given an estimated plant state sequence, is a sequence of markings and configuration goals such that: 1) the first marking is the initial marking produced by initialization; and 2) each subsequent marking and configuration goal results from applying Step to the previous marking and the corresponding state estimate. The automaton's execution completes at the first time step at which the marking is empty.

As an example, applying Step to the initial marking of always A causes the primitive location to transition both to A and back to itself, and A to transition internally. The primitive location's transition back to itself ensures that it always remains marked. Its transition to A puts a new mark on A, initializing the starting subautomata of A at each time step. The ability of an HCA to have multiple locations marked simultaneously is key to the compactness of this novel encoding, since it avoids the need for explicit copies of A.

This example does not demonstrate the interaction with the physical plant through goal and maintenance constraints. We demonstrate this by revisiting the orbital insertion example in Section V-D.

C. Compiling RMPL to HCA

Each RMPL construct maps to an HCA, as shown in Fig. 10. The derived combinators are definable in terms of the primitives, but for efficiency we map them directly to HCA as well. As before, lowercase letters denote constraints expressed in propositional state logic. Uppercase letters denote well-formed RMPL expressions, each of which maps to an HCA.

Fig. 10 Corresponding HCA for various RMPL constructs.

To illustrate compilation, consider the RMPL code for orbital insertion, shown earlier in Fig. 4. The RMPL compiler converts this to the corresponding HCA, shown in Fig. 11.3

D. Example: Executing the Orbital Insertion ControlProgram

The control sequencer interacts tightly with the mode estimation and mode reconfiguration capabilities of the deductive controller. As a practical example, we consider a nominal (i.e., failure-free) execution trace for the orbital insertion scenario. Markings are represented in the following figures by filling in the corresponding primitive locations. Any composite location with marked subautomata is considered marked. Locations are numbered 1–9 in the figures for reference.

Initial State: Initially, all start locations are marked (locations 1, 2, 3, 4, 5, 6, and 8 in Fig. 12). We assume mode estimation provides the following initial plant state estimate: {EngineA = Off, EngineB = Off, Camera = On}.

Execution will continue as long as the maintenance constraint on automaton 1, nonentailment of EngineA = Firing ∨ EngineB = Firing, remains satisfied. In other words, execution of automaton 1 will terminate as soon as EngineA = Firing ∨ EngineB = Firing is entailed.

Similarly, execution of automaton 5 will terminate if EngineA = Failed is ever entailed.

First Step: Since none of the maintenance constraints are violated for the initial state estimate, all start locations remain marked.

3Note that the compilation process takes place offline. Only the resulting HCA model needs to be loaded into the embedded processor for eventual execution.



Fig. 11 HCA model for the orbital insertion scenario.

Fig. 12 Initial marking of the HCA for orbital insertion.

The goal constraints asserted by the start locations consist of the following state assignments: EngineA = Standby, EngineB = Standby, and Camera = Off (from locations 2, 3, and 4, respectively). These state assignments are conjoined into a configuration goal and passed to mode reconfiguration. Mode reconfiguration then issues the first command in a command sequence that achieves the configuration goal (e.g., Cmd = CamOff).

In this example, mode estimation confirms that Camera = Off is achieved with a one-step operation, by observing that the shutter position sensor reads closed. Locations 2 and 3, which assert EngineA = Standby and EngineB = Standby, remain marked in the next execution step, because these two configuration goals have not yet been achieved. Since Camera = Off has been achieved and there are no specified transitions from location 4, this thread of execution terminates. Marked locations 6 and 8, which correspond to "when … donext …" expressions in the RMPL control program, both remain marked in the next execution step, since the only enabled transitions from these locations are self-transitions. The next execution step's marking is shown in Fig. 13.



Fig. 13 Marking of the orbital insertion HCA after the first transition.

Second Step: The maintenance constraints, corresponding to nonentailment of EngineA = Failed on automaton 5 and nonentailment of EngineA = Firing ∨ EngineB = Firing on automaton 1, still hold for the current state estimate, so all marked locations remain marked. Next, the goal constraints of the marked locations are collected, EngineA = Standby ∧ EngineB = Standby, and passed as a configuration goal to mode reconfiguration. Mode reconfiguration issues the first command in the sequence for achieving EngineA = Standby and EngineB = Standby. For the purposes of this example, we assume that a single command is required to set both engines to standby, and that this action is successfully performed. Consequently, mode estimation indicates that the new state estimate includes EngineA = Standby and EngineB = Standby. This results in the termination of the two execution threads corresponding to these goal constraints, at locations 2 and 3. In addition, for marked location 6, the transition labeled with condition EngineA = Standby ∧ Camera = Off is enabled, and hence traversed to location 7. Finally, location 8 remains marked.

Thus, after taking the enabled transitions, only two primitive locations are marked, 7 and 8. This new marking is shown in Fig. 14.

Third Step: Since none of the maintenance constraints are violated for the current state estimate, all marked locations remain marked. Marked location 7 asserts the goal constraint EngineA = Firing, which is passed to mode reconfiguration as the configuration goal. Mode reconfiguration determines that the engine fires by opening the valves that supply fuel and oxidizer, and it issues the first command in a sequence that achieves this configuration.

We assume that this first command for achieving EngineA = Firing is executed correctly. Since the single goal constraint is not yet satisfied, and the only enabled transition is the self-transition at location 8, the markings and configuration goals remain unchanged (see Fig. 14).

Remaining Steps: Given the same configuration goal as in the last step, mode reconfiguration issues the next command required to achieve EngineA = Firing. It repeats this process until a flow of fuel and oxidizer is established, and mode estimation confirms that the engine is indeed firing. Since this violates the maintenance condition on automaton 1, the entire block of Fig. 11 is exited. Note that, since no engine failure occurred during this process, the thread of execution waiting for EngineA = Failed did not advance from its start location (location 8).

VI. DEDUCTIVE CONTROLLER

In this section, we present Titan's model-based deductive controller (see Fig. 15), which uses a plant model to estimate and control the state of the plant. A physical plant is composed of discrete and analog hardware and software. We model the behavior of the plant as a POMDP, which is encoded compactly using probabilistic concurrent constraint automata (CCA). Concurrency is used to model the behavior of a set of components that operate synchronously. Constraints are used to represent cotemporal interactions between state variables and intercommunication between components. Probabilistic transitions are used to model the stochastic behavior of components, such as failure and intermittency. Reward is used to assess the costs and benefits associated with particular component modes.

This section begins by defining CCA as a composition of constraint automata for the individual components of a plant. The CCA encoding of the plant model is essential to the efficient operation of the deductive controller (see Fig. 15). In Section VI-E, we develop mode estimation, which estimates the most likely state of a CCA, and in Section VI-F,



Fig. 14 Marking of the orbital insertion HCA after the second transition.

Fig. 15 Architecture for the deductive controller.

we develop mode reconfiguration, which generates a control sequence that transitions the CCA to a state that achieves a configuration goal.

A. Constraint Automata

The semantics of the plant model was introduced in Section III-A. Our model of a physical plant is encoded as a set of concurrently operating constraint automata, one constraint automaton for each component in the model. Constraint automata are component transition systems that communicate through shared variables. The transitions of each constraint automaton represent a rich set of component behaviors, including normal operation, failure, repair actions, and intermittency. The constraint automata operate synchronously; that is, at each time step every component performs a single state transition. Constraint automata for the simplified spacecraft components were provided pictorially in Fig. 5.

Each constraint automaton has an associated mode variable ranging over a finite domain of modes. Given a current mode assignment, the component changes its mode by selecting a transition function from among a set of possible transition functions, according to a probability distribution. A component's behavior within each mode is modeled by an associated constraint over the component's attribute variables and an associated reward.

More precisely, the constraint automaton for a component is described by a tuple with the following elements:

1) a set of variables for the component, where each variable ranges over a finite domain. This set is partitioned into a singleton set, containing the component's mode variable, and a set of attribute variables. Attributes include input variables, output variables, and any other variables needed to describe the modeled behavior of the component.

2) a function that associates with each mode assignment a finite-domain constraint. This constraint captures the component's behavior in a given mode.

3) a function that associates with each mode assignment a set of transition functions. Each transition function specifies an assignment to the mode variable at the next time step, conditioned on the satisfaction of a constraint at the current time step. One transition function in this set, representing nominal behavior, is distinguished; the remaining transition functions represent fault behaviors.

4) an initial-state distribution, giving the probability that each mode is the component's initial mode.

5) a transition distribution, giving, for each mode variable assignment, a probability distribution over the possible transition functions.

6) a reward function, giving the reward associated with each mode variable assignment.
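As a concrete illustration, a valve component in the style of Fig. 16 might be encoded along the following lines. The mode names follow the figure, but the behavior constraints, guards, probabilities, and rewards shown here are illustrative assumptions, not values from the paper.

```python
from dataclasses import dataclass

@dataclass
class ConstraintAutomaton:
    modes: list      # domain of the mode variable
    behavior: dict   # {mode: constraint on attribute variables}
    nominal: dict    # {mode: [(guard, next_mode)]}, mutually exclusive guards
    fault: dict      # {mode: [(probability, fault_mode)]}
    initial: dict    # {mode: prior probability}
    reward: dict     # {mode: reward}

valve = ConstraintAutomaton(
    modes=["open", "closed", "stuck-closed"],
    behavior={
        "open": "inflow = outflow",
        "closed": "outflow = zero",
        "stuck-closed": "outflow = zero",
    },
    nominal={
        "open": [("cmd = close", "closed"), ("cmd != close", "open")],
        "closed": [("cmd = open", "open"), ("cmd != open", "closed")],
        "stuck-closed": [("True", "stuck-closed")],  # no repair transition
    },
    fault={"open": [(0.01, "stuck-closed")],
           "closed": [(0.01, "stuck-closed")]},
    initial={"closed": 1.0},
    reward={"open": 1.0, "closed": 1.0, "stuck-closed": 0.0},
)
```

Note that, for each mode, exactly one nominal transition pair is enabled under any command assignment, reflecting the restriction below that a component occupies a single mode at a time.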



Fig. 16 Constraint automata for a driver component and a valve component. The driver has a single fault mode, "resettable," corresponding to a recoverable failure of the device. The valve is also modeled with a single fault mode, "stuck-closed." The probabilistic transitions from the nominal modes to these failure modes are omitted from the diagram for clarity.

It should be noted that these transitions correspond to a restricted form of the transitions defined for HCA. Transitions between successive modes are conditioned on constraints over the component's variables. For each value of the component mode variable, we represent each of its transition functions as a set of transition pairs, each consisting of a target mode and a set of labels (or guards) of the form c (entailment) or c̄ (nonentailment), for some constraint c. This corresponds to the traditional representation of transitions as labeled arcs in a graph, where the pair's modes are the source and target of an arc carrying the guard as its label. Unlike HCA transitions, we impose the restriction that, for each transition function, exactly one transition pair will be enabled for any mode assignment. This means that a component can only be in a single mode at any given time.

B. Concurrent Constraint Automata

A physical plant is modeled as a composition of concurrently operating constraint automata that represent its individual components. This composition, including the interconnections between component automata and interconnections with the environment, is captured in a CCA.

Formally, a CCA is described by a tuple ⟨A, Π, Q⟩, where the following conditions exist.

1) A = {A_1, A_2, …, A_n} denotes the finite set of constraint automata associated with the n components in the plant.

2) Π is a set of plant variables, where each variable v ∈ Π ranges over a finite domain D(v). C(Π) denotes the set of all finite-domain constraints over Π. Π is partitioned into sets of mode variables Π_m, control variables Π_c, observable variables Π_o, and dependent variables Π_d.

3) Q ∈ C(Π) is a conjunction of constraints providing the interconnections between the attributes of the plant components.

Mode variables represent the state of each component. Actuator commands are relayed to the plant through assignments to control variables. Observable variables capture the information provided by the plant's sensors. Finally, dependent variables represent interconnections between components. They are used to transmit the effects of control actions and observations throughout the plant model.

The state space of Π, denoted S(Π), is the cross product of the domains D(v), for all variables v ∈ Π. The state space of the plant mode variables Π_m, denoted S(Π_m), is the cross product of the D(v), for all variables v ∈ Π_m. A state of the plant at time t, s^(t) ∈ S(Π_m), assigns to each component mode variable a value from its domain. Similarly, an observation of the plant, o^(t) ∈ S(Π_o), assigns to each observable variable a value from its domain, and a control action, μ^(t) ∈ S(Π_c), assigns to each control variable a value from its domain.

A CCA models the evolution of physical processes by enabling and disabling constraints in a constraint store. Enabled constraints in the store would include the set of mode constraints imposed by the current plant state and the interconnection constraints Q associated with the CCA.

C. CCA Specification Example

Consider the valve and driver component models depicted in Fig. 16. The plant variables associated with these components are Driver and Valve, with domains of {On, Off, Resettable} and {Open, Closed, Stuck-closed}, respectively. The driver's constraint automaton has attributes dcmd_in and dcmd_out. The valve's constraint automaton has attributes vcmd, inflow, and outflow.

226 PROCEEDINGS OF THE IEEE, VOL. 91, NO. 1, JANUARY 2003

Page 16: Model-based programming of intelligent embedded systems and

The mode constraints Σ for the driver component are specified as follows:

(Driver = On) ⇒ (dcmd_out = dcmd_in)
(Driver = Off) ⇒ (dcmd_out = no-cmd)
(Driver = Resettable) ⇒ (dcmd_out = no-cmd)

For the driver component, two transition functions are specified, the nominal transition function τ_n and the failure transition function τ_f:

τ_n: (Driver = On): (+(dcmd_in = off), Off), (-(dcmd_in = off), On)
(Driver = Off): (+(dcmd_in = on), On), (-(dcmd_in = on), Off)
(Driver = Resettable): (+(dcmd_in = reset), On), (-(dcmd_in = reset), Resettable)

τ_f: (Driver = On): (True, Resettable)
(Driver = Off): (True, Resettable)
(Driver = Resettable): (True, Resettable)

Mode constraints and transition functions can be similarly expressed for the valve component.

The probabilities in P_Ta associated with transition functions τ_n and τ_f are 0.99 and 0.01, respectively (in this case, the probabilities are independent of the current mode variable assignment). Modal costs and initial mode probabilities are not specified in this example.

For the CCA composed of the driver and valve constraint automata, we specify the set of control variables Π_c = {dcmd_in}, the set of observable variables Π_o = {inflow, outflow}, and the set of dependent variables Π_d = {dcmd_out, vcmd}.

The component interconnections Q are given by (dcmd_out = vcmd).
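A minimal executable rendering of the driver's two transition functions, written as plain Python functions over the mode and command values above (the function names are ours; the guard evaluation mirrors the transition pairs listed in the text):

```python
def tau_nominal(mode, dcmd_in):
    """Nominal transition function for the driver component."""
    if mode == "On":
        return "Off" if dcmd_in == "off" else "On"
    if mode == "Off":
        return "On" if dcmd_in == "on" else "Off"
    if mode == "Resettable":
        return "On" if dcmd_in == "reset" else "Resettable"

def tau_fault(mode, dcmd_in):
    """Failure transition function: recoverable failure from any mode."""
    return "Resettable"

P_T = {"nominal": 0.99, "fault": 0.01}  # independent of the current mode
```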

D. Feasible Trajectories of Concurrent ConstraintAutomata

Next, consider the set of trajectories that are feasible, that is, trajectories in which each pair of sequential states is consistent with a CCA. We defer to the next section discussion of the computation of the most likely plant state, given a set of observations.

Given a sequence of control variable assignments [μ^(0), μ^(1), …, μ^(n-1)], a feasible trajectory of a plant is a sequence of plant states [s^(0), s^(1), …, s^(n)], such that: 1) s^(0) is a valid initial plant state, i.e., P_Θa(m_a = v) > 0 for all assignments (m_a = v) in s^(0); and 2) for each i, s^(i+1) ∈ Step(s^(i), μ^(i)). Step transitions the CCA of a plant, by nondeterministically executing a transition for each of its component automata, according to the algorithm in Fig. 17.
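Under the representation assumptions used throughout this section (guards as partial assignments, one transition function chosen per component), the Step function can be sketched as follows; the helper names are ours, not the Fig. 17 pseudocode:

```python
import itertools

# Sketch of the Step function under assumed data structures: each
# transition function is a dict mode -> [(guard, target)], a guard is a
# partial assignment, and Step returns every combination of one chosen
# transition per component (the nondeterministic choice).

def enabled_target(transition_fn, mode, state):
    """Target of the (unique) enabled transition pair, else stay put."""
    for guard, target in transition_fn.get(mode, []):
        if all(state.get(var) == val for var, val in guard.items()):
            return target
    return mode  # assumption: no enabled pair means the mode persists

def step(components, state):
    """components: {name: [transition functions]}; state includes modes
    and control assignments. Returns all successor mode assignments."""
    per_component = []
    for name, fns in components.items():
        targets = {enabled_target(fn, state[name], state) for fn in fns}
        per_component.append([(name, t) for t in sorted(targets)])
    return [dict(combo) for combo in itertools.product(*per_component)]
```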

Fig. 17 Step algorithm.

Fig. 18 One step of mode estimation for the orbital insertion example. Possible faulty valves are circled and closed valves are filled.

We use Step_n to denote a variant of the step function that is restricted to only the nominal transition function. A trajectory that involves only nominal transitions is called a nominal trajectory.

E. Estimating Modes

Mode estimation incrementally tracks the set of plant trajectories that are consistent with the plant model, the sequence of observations, and the control actions. This is maintained as a set of consistent current states. For example, suppose the deductive controller is trying to maintain the configuration goal EngineA = Firing, as shown on the left in Fig. 18. We assume that mode estimation starts with knowledge of the initial state, in which a set of open valves allows a flow of oxidizer and fuel into Engine A. In the next time instant, the sensors send back the observation that Thrust = Zero. Mode estimation then considers the sets of possible component mode transitions that are consistent with this observation, given the initial state and plant CCA. It identifies a number of state transitions that are consistent with this observation, including that either of the inlet valves into Engine A has transitioned to stuck closed, or that any



combination of valves along the fuel or oxidizer supply path are stuck closed, as depicted on the right in Fig. 18.

Unfortunately, the size of the set of possible current states is exponential in the number of components. Even for this simple example, this reaches about a trillion states. Furthermore, a large percentage of these states will not be ruled out by past observations. For example, given a system whose behavior looks correct based on current observations, it is always possible that any combination of components, from one to all, are broken in a novel manner, but have simply not manifested their symptoms yet.

This exponential set is infeasible to track in real time for most practical embedded systems. In addition, the set of consistent states is far too large to provide significant useful information. To perform useful actions, the deductive controller needs to be able to distinguish the small set of plausible current states from the large set of consistent but implausible states. To accomplish this, mode estimation computes the likelihood of the current estimated states, called a belief state. In addition, it computes the set of consistent trajectories and states in order of likelihood. This offers an any-time algorithm, by stopping when no additional computational resources are available to compute the less likely trajectories.

In Section III, we defined the plant model in terms of a POMDP, and discussed how mode estimation is framed as an instance of belief state update. In this section, we discuss the implementation of belief state update for CCA. We then describe the simplest variant of mode estimation for CCA, based on a form of beam search.

1) Belief Update for CCA: To define belief update for CCA, we simply need to define the transition function P_T and the observation function P_O. To be consistent with our earlier development, these definitions must constrain the probability of any inconsistent trajectory to be zero. To calculate the probabilistic transition function for a state of a plant CCA, we define a plant transition function to be composed of a set of component transition functions, one for each component mode variable m_a ∈ Π_m. In addition, a CCA specifies the transition probability distribution for each component mode through P_Ta. We make the key assumption that component transition probabilities are conditionally independent, given the current plant state. This is analogous to the failure independence assumptions made by the General Diagnostic Engine (GDE) [19], Sherlock [20], and Livingstone [3], and is a reasonable assumption for most engineered systems. Hence, P_T is computed as a cross product of the probability distributions over the individual component transition functions.

We calculate the observation function P_O for a state s^(t) from the CCA, taking an approach similar to that of GDE [19]. Given the constraint store for s^(t), computed as the conjunction of Q and Σ_a(m_a = v) for all assignments (m_a = v) in s^(t), we test if each observation in o^(t) is entailed or refuted by this conjunction, giving probability 1 or 0, respectively. If no prediction is made, that is, if an observation is neither entailed nor refuted, then an a priori distribution on possible values of the observable variable is assumed (e.g., a uniform distribution of 1/|D(v)| over the possible values). This offers a probabilistic bias toward states that predict observations, over states that are merely consistent with the observations. These two definitions for P_T and P_O complete our belief update equations for CCA.
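A sketch of these two definitions, under the stated conditional-independence assumption: plant_transition_prob multiplies per-component transition-function probabilities, and observation_prob returns 1, 0, or a uniform prior. Names and signatures are illustrative:

```python
# Sketch of CCA belief update quantities. choices picks one transition
# function per component; p_t gives each component's per-function
# probability distribution. The observation probability is 1, 0, or a
# uniform prior, depending on whether the constraint store entails,
# refutes, or does not predict the observed value.

def plant_transition_prob(choices, p_t):
    """choices: {component: fn name}; p_t: {component: {fn: prob}}."""
    prob = 1.0
    for comp, fn in choices.items():
        prob *= p_t[comp][fn]
    return prob

def observation_prob(obs_value, predicted, domain_size):
    if predicted is None:              # no prediction: uniform prior
        return 1.0 / domain_size
    return 1.0 if predicted == obs_value else 0.0
```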

2) CCA Belief Update Under Tight Resource Constraints: The remaining step of our development is to implement mode estimation as an efficient form of limited belief update. Earlier in this section, we pointed out that the belief state contains an exponential number of states. In particular, the belief state is O(d^m), where m is the number of mode variables and d is the size of the domain of the mode variables. A real-world model of an embedded system, such as the model developed for NASA's DS-1 probe [3], involves roughly 80 mode variables and 3^80 states. DS-1 allocated 10% of its 20-MHz processor to mode estimation and reconfiguration, and had an update rate for sensor values of once every second. These resource constraints allow only a small fraction of the state space to be explored explicitly in real time.

A standard approach, frequently employed for Hidden Markov Models, is to construct the Trellis diagram (see Fig. 6) over a finite, moving window, and to identify the most likely trajectory as a shortest path problem. In state estimation for CCA, the Trellis diagram forms a constraint graph with concurrent transitions, and the shortest path is found by framing a problem to the OpSat optimal constraint satisfaction engine [21].

For embedded systems with severe computational resource constraints, such as a spacecraft, we desire an approach that limits the amount of online search. This may be accomplished by tracking the most likely trajectories through the Trellis diagram, expanding only the highest probability transitions at each step, as time permits. This approach provides an any-time algorithm, as the most likely belief states are generated in best-first order.4 From a practical standpoint, this approach has proven extremely robust on a range of space systems [3], including deep-space probes, reusable launch vehicles, and Martian habitats, and it far surpasses the monitoring capabilities of most current embedded systems.
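The beam-style tracking described above can be sketched as one update of an approximate belief state that keeps only the k most likely successors (a simplification: real mode estimation also folds in the observation probabilities):

```python
import heapq

# Sketch of tracking only the k most likely successor states (a beam):
# belief maps states to probabilities, successors(state) yields
# (next_state, transition_probability) pairs. States are hashable
# labels here, an assumption for illustration.

def track_k_best(belief, successors, k):
    scores = {}
    for state, p in belief.items():
        for nxt, p_t in successors(state):
            scores[nxt] = scores.get(nxt, 0.0) + p * p_t
    best = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
    total = sum(p for _, p in best) or 1.0
    return {s: p / total for s, p in best}  # renormalized beam
```

Because successors are expanded in order of probability mass, the loop can stop early when resources run out, which is what makes the scheme any-time.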

3) Computing Likely Mode Estimates Using OpSat: Titan frames the enumeration of likely transition sets as an optimal constraint satisfaction problem (OCSP) and solves the problem using the OpSat algorithm, presented in detail in [21].

An OCSP is a problem of the form "arg max f(x) subject to C(x)," where x is a vector of decision variables, C is a set of propositional state constraints, and f is a multiattribute utility function that is mutually preferentially independent (MPI). f is MPI if the value of each x_i that maximizes f is independent of the values assigned to the remaining variables.

Solving an OCSP consists of generating a prefix of thesequence of feasible solutions, ordered by decreasing value

4The limitation is that a low probability trajectory may be pruned, which,after additional evidence is collected, could become highly likely.



Fig. 19 A spacecraft has complex paths of interaction: commands issued by the flight computer pass through the bus controller, 1553 bus, and PDE on the way to the spacecraft main engine system.

of f. A feasible solution assigns to each variable in x a value from its domain such that C is satisfied. To solve an OCSP, OpSat tests a leading candidate for consistency against C. If it proves inconsistent, OpSat summarizes the inconsistency (called a conflict) and uses the summary to jump over leading candidates that are similarly inconsistent.

To view mode estimation as an OCSP, recall that each plant transition consists of a single transition for each of its component transition systems. Hence, we introduce a variable into x for each component in the plant, whose values are the possible component transitions. Each plant transition corresponds to an assignment of values to the variables in x. C is used to encode the condition that the target state of a plant transition and its corresponding constraint store must be consistent with the observed values. The objective function, f, is just the prior probability of the plant transition. The resulting OCSP identifies the leading transitions at each state, allowing mode estimation to track the set of likely trajectories.

Note that the OCSP for a typical plant model is quite large. For example, the plant model developed for the DS-1 spacecraft consisted of roughly 80 mode variables, 3300 propositional variables, and 12 000 propositional clauses. The two reasoning methods we employ in OpSat to achieve fast response are conflict-directed A* [21] and incremental truth maintenance (ITMS) [22]. Consistency of a candidate solution is tested through systematic search and unit propagation. For our applications, the cost of performing unit propagation dominates. The ITMS offers an incremental approach to performing unit propagation that provides an order of magnitude performance improvement over traditional truth maintenance systems. If the ITMS finds a candidate to be inconsistent, it returns a conflict associated with the inconsistency. Conflict-directed A* then uses the conflicts returned by the ITMS to prune inconsistent subspaces during best-first search over the decision variables x. Conflict-directed A* dramatically reduces the number of candidate states tested for consistency during mode estimation; for most applications, the number is typically less than a dozen states.
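Since unit propagation dominates the cost of consistency testing, the following minimal, nonincremental sketch illustrates the core operation; clauses are lists of (variable, required boolean value) literals. The ITMS adds incrementality and conflict extraction, both omitted here:

```python
# Minimal unit-propagation sketch. Returns (consistent, assignment):
# repeatedly find clauses with no satisfied literal; an empty such
# clause is a conflict, a unit clause forces its remaining literal.

def unit_propagate(clauses, assignment):
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for var, val in clause:
                if var in assignment:
                    satisfied = satisfied or assignment[var] == val
                else:
                    unassigned.append((var, val))
            if satisfied:
                continue
            if not unassigned:
                return False, assignment      # empty clause: conflict
            if len(unassigned) == 1:          # unit clause: propagate
                var, val = unassigned[0]
                assignment[var] = val
                changed = True
    return True, assignment
```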

F. Reconfiguring Modes

Mode reconfiguration takes as input a configuration goal and returns a series of commands that progress the plant toward a maximum-reward state that achieves the configuration goal. Mode reconfiguration accomplishes this through two capabilities, the goal interpreter and the reactive planner. These capabilities operate in tight coordination with mode estimation, described in the preceding section. The goal interpreter uses the plant model and the most likely current state, provided by mode estimation, to determine a reachable goal state that achieves the configuration goal, while maximizing reward. The reactive planner takes a goal state and a current mode estimate, and generates a command sequence that moves the plant to this goal state. The reactive planner generates and executes this sequence one command at a time, using mode estimation to confirm the effects of each command. For example, in our orbital insertion example, given a configuration goal of EngineA = Firing, goal interpretation selects a set of valves to open, in order to establish a flow of fuel into the engine. Reactive planning then sends commands to control units, drivers, and valves to achieve this goal state.

Having identified which valves to open and close, one might imagine that achieving the configuration is a simple matter of calling a set of open-valve and close-valve routines. This is in fact how Titan's predecessor, Livingstone [3], performed mode reconfiguration. However, much of the complexity of mode reconfiguration is involved in correctly commanding each component to its intended mode, through lengthy communication paths from the control processor.

For example, Fig. 19 shows a typical set of communication paths from the flight computer to part of the spacecraft main engine system. The flight computer sends commands to a bus controller, which broadcasts these commands over a 1553 data bus. These commands are received by a bank of device drivers, such as the propulsion drive electronics (PDE). Finally, the device driver for the appropriate device translates the commands to analog signals that actuate the device.

The following is an example of a scenario that a robust close-valve routine should be able to handle:

The close-valve routine should first ascertain if the valve is open or closed, by polling its sensors. If it is open, it needs to broadcast a "close" command. However, first it needs to determine if the driver for the valve is on, again by polling its sensors, and if not, it should send an "on" command. Now suppose that shortly after the driver is turned on, it transitions to a resettable failure mode. The valve driver needs to catch this failure, and then before sending the "close" command, it should issue a "reset" command. Once the valve has closed, the close-valve routine should power off the valve driver, to save power. However, before doing so, it should change the state of any other valve that is controlled by that driver; otherwise, the driver will need to be immediately turned on again, wasting time and power.
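For concreteness, here is one way the scenario above could look as a conventional hand-coded routine, against a toy stub plant. Every name here (Plant, sense, send, driver_still_needed) is hypothetical; the point is how much bookkeeping the routine must do by hand, which a model-based executive instead derives from the component models:

```python
# Toy illustration of the hand-coded close-valve routine. The Plant
# class is a hypothetical stub standing in for sensor polling and
# command broadcast over the bus.

class Plant:
    def __init__(self, state):
        self.state = dict(state)    # device -> sensed mode
        self.log = []               # commands broadcast so far

    def sense(self, device):
        return self.state[device]

    def send(self, device, cmd):
        self.log.append((device, cmd))
        effects = {"on": "on", "off": "off", "reset": "on",
                   "close": "closed", "open": "open"}
        self.state[device] = effects[cmd]

def close_valve(plant, valve="valve", driver="driver",
                driver_still_needed=False):
    if plant.sense(valve) == "closed":       # nothing to do
        return
    if plant.sense(driver) == "off":         # establish the command path
        plant.send(driver, "on")
    if plant.sense(driver) == "resettable":  # catch recoverable failure
        plant.send(driver, "reset")
    plant.send(valve, "close")
    if not driver_still_needed:              # power down to save power
        plant.send(driver, "off")
```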



There are several properties that make mode reconfiguration complex. First, devices are controlled indirectly through physical interactions that are established and removed by changing modes of other devices. Hence, to change a device's mode, a communication path must be established to that device, by changing the modes of other devices. Second, because communication paths are shared with multiple devices, mode changes need to be carefully ordered, in order to avoid unintended or destructive effects. Third, since failures occur, the modes of all relevant devices need to be monitored at each step, through all relevant sensors. Finally, corrective action should be performed as soon as a failure is detected, in order to mitigate further damage.

The challenge for mode reconfiguration is twofold. First, controlling a plant, described by CCA, generalizes the classic artificial intelligence (AI) planning problem [23], which is already NP-hard, to one of handling indirect effects. Second, to be practical for embedded systems, the deductive controller's response time should be driven toward a small constant. Accomplishing this requires eliminating as many instances of online search as possible.

1) Robustness Through Reversibility: A key decision underlying our model-based executive is the focus on the most likely trajectory generated by mode estimation. Livingstone [3] took a more conservative strategy of considering a single control sequence that covers the complete set of likely states. The difficulty with this approach is that the different estimated states will frequently require different control sequences.

Utility theory can be used to select, among different control sequences, one that maximizes the expected likelihood of success [24]. However, the cost of generating multiple states and control sequences works against our goal of building a fast reactive executive. Achieving this goal is essential if model-based programming languages are to be employed within everyday embedded systems, where procedural programming is the norm.

Instead, we adopt a greedy approach of assuming the most likely estimated state is the correct state. This greedy approach introduces risk: the control action appropriate for the most likely state trajectory may be inappropriate, or worse damaging, if the actual state is something else. Furthermore, the reactive focus of the model-based planner precludes extensive deliberation on the long-term consequences of actions, thus leaving open the possibility that control actions, while not outright harmful, may degrade the system's capabilities. For example, firing a pyro valve is an irreversible action that forever cuts off access to parts of the propulsion system. Such actions should be taken after a human operator or high-level planning system explicitly reasons through the consequences of the actions over a future time horizon. Hence, fundamental to the reliability of our approach is the requirement that MR will only generate reversible control actions, unless the purpose of the action is to repair failures.5

2) Goal Interpretation (GI): GI is closely related to mode estimation. While mode estimation searches for likely modes that are consistent with the observations, GI

5While repairing a failure is irreversible, it is important to allow a reactive executive to repair failures, in order to increase functionality.

Fig. 20 ComputeGoalStates algorithm.

searches for modes that entail the configuration goal while maximizing reward. GI generates a maximum-reward plant state that satisfies the current goal configuration g^(t) and that is reachable from the current most likely state s^(t), using nominal transitions whose effects are reversible. We define reversible transitions as transitions for which the source state may be returned to from the target state by a sequence of nominal control actions. We use Reachable(s^(t)) to denote the set of all states that are reachable from s^(t) through repeated application of reversible transitions. We call these states "reversibly reachable" from s^(t).

A priori, computing Reachable(s^(t)) seems like an expensive task. It could be computed by generating all trajectories of reversible transitions originating at s^(t), for example, by employing a symbolic reachability analysis similar to those used by symbolic model checkers. However, Reachable(s^(t)) has the key property that it may be computed as the cross product of the reachable target mode assignments, Reachable_a(s^(t)), of the reversible transitions for the individual components. This property follows from subtle arguments related to reversibility of mode assignments and transitions [4]. The goal interpretation function, ComputeGoalStates, employs this property to compute the set of all reversibly reachable target states that achieve configuration goal g^(t), given a plant model with estimated current state s^(t). This algorithm is presented in Fig. 20.

Given the reversibly reachable states Reachable_a(s^(t)) for each component mode variable m_a, casting GI as an OCSP is straightforward. There is a decision variable in x for each m_a, with domain Reachable_a(s^(t)), representing the reversibly reachable modes of component a. C is the condition that the target assignment is consistent with the "constraint store" Q, and that, together with Q, it entails configuration goal g^(t). f is the reward, R_a(m_a = v), of being in a state. An example of a reward metric is power consumption. Once again, Titan solves this OCSP using OpSat [21].
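Ignoring conflict direction, the essence of GI can be sketched as brute-force search over the cross product of per-component reversibly reachable mode sets, returning the maximum-reward state that entails the goal. Here reachable, reward, and entails are illustrative stand-ins, and OpSat's conflict-directed search avoids this exhaustive enumeration:

```python
from itertools import product

# Brute-force sketch of goal interpretation over the cross product of
# per-component reversibly reachable modes. Rewards are summed across
# components; entails(state) is a stand-in for testing that the state,
# with the interconnections Q, entails the configuration goal.

def interpret_goal(reachable, reward, entails):
    comps = sorted(reachable)
    best, best_r = None, float("-inf")
    for combo in product(*(sorted(reachable[c]) for c in comps)):
        state = dict(zip(comps, combo))
        r = sum(reward[c][m] for c, m in state.items())
        if entails(state) and r > best_r:
            best, best_r = state, r
    return best
```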

Fig. 21 shows an example of a conflict-directed search sequence, which OpSat performs during GI. Suppose we have the configuration goal Spacecraft = Thrusting, which



Fig. 21 Snapshots demonstrating how the goal interpreter searches for a reachable state that achieves the configuration goal of (Spacecraft = Thrusting) with maximum reward. Conflicts used to direct the search are highlighted by the small black arrows. In the last step, GI identifies that the open valves on the disabled Engine A should be closed, to minimize loss of propellant.

is achievable by firing either engine. Furthermore, suppose that the right flow inlet valve to Engine A has just stuck closed. This is the configuration shown in the upper left of Fig. 21, with the stuck valve circled and closed valves filled in. Starting with this configuration as a candidate goal state, GI deduces that it does not entail Spacecraft = Thrusting. Furthermore, it extracts the conflict that thrust cannot be achieved unless the right inlet valve on either engine is opened. The right inlet valve on Engine A is stuck closed and cannot be made open. To resolve the conflict, OpSat creates a new candidate state, which has the right inlet valve to Engine B open, as shown in the upper right of Fig. 21. OpSat tests this candidate and proves that it also does not entail thrust. In addition, it extracts the conflict that thrust will not be achieved as long as one of the two highlighted valves feeding Engine B is closed, or the stuck inlet valve to Engine A remains closed. OpSat generates the next candidate [Fig. 21 (lower right)], by opening one of the two highlighted valves leading into Engine B (the one with higher reward associated with its open mode). OpSat continues this process, and finds the maximum-reward goal state that achieves thrust [Fig. 21 (lower left)], after testing less than ten candidates. This is an exceptionally small number of tested candidates, given that the state space for this example contains roughly 3^80 possible target states.

3) Reactive Planning: Given a goal state supplied by GI, the reactive planner (RP) is responsible for achieving this goal state. Titan's model-based RP is called Burton and was first introduced in [4]. RP takes as input the current most likely state and the goal state selected by GI, and generates the first control action that moves the plant toward the goal state. More precisely, RP takes as input a CCA model of a plant, a most likely current state s^(t) (from ME), and a maximum-reward goal state s_g (from GI) that satisfies goal g^(t). RP generates a control action μ^(t) such that Step_n(s^(t), μ^(t)) is the goal state s_g, or Step_n(s^(t), μ^(t)) is the prefix of a simple nominal trajectory that ends in s_g.

Titan's model-based reactive planning problem closely relates to the classic AI planning problem [23].6 The key difference is that a classical planner invokes operators that directly modify state. Titan's model-based RP, on the other hand, exerts control by establishing values for control variables, which interact indirectly with internal plant state variables, through cotemporal, physical interactions, represented

6The classic AI planning problem involves generating a set of discrete actions that move the system from an initial state to a goal state, where each action is an instance of a STRIPS operator [25]. A STRIPS operator is an atomic action that maps states to states. It is specified by a set of preconditions and a set of effects, which enumerate all changes in variable values.



in the CCA. This extension adds substantial expressivity to the planner.

The challenge is to preserve this expressivity, while providing a planner that is reactive. Titan's RP meets five desiderata. First, it only generates nondestructive actions, that is, an action will never undo the effects of previous actions that are still needed to achieve top-level goals. Second, RP will not propose actions that lead to deadend plans, that is, it will not propose an action to achieve a subgoal when one of the sibling subgoals is unachievable. Third, the reactive planner is complete, that is, when given a solvable planning problem, it will generate a valid plan. Note, however, that RP may not generate the shortest length plan. Fourth, RP ensures progress to a goal, except when execution anomalies interfere, that is, the nominal trajectory generated by RP for a fixed goal is loop-free. Fifth, RP operates at reactive time scales; its average run-time complexity is constant. This speed is essential to providing a model-based executive with response times comparable to traditional procedural executives [26], [27].

Classical planners search through the space of possible plans, while using mechanisms, such as threat detection, to determine if one action is in danger of prematurely removing an effect intended for a subsequent action [23]. Titan's RP avoids run-time search, requires no algorithms for threat detection, and expends no effort determining future actions or planning for subgoals that are not supported by the first action. The RP accomplishes this speedup by exploiting the requirement that all actions, except repairs, be reversible, and by exploiting certain topological properties of component connectivity that frequently occur in designed systems. This permits a set of subgoals to be solved serially (that is, one at a time), using divide and conquer to achieve an exponential speedup.

A complete development of RP is beyond the scope of this paper. The design of the Burton reactive planner has been previously presented in [4]. In the remainder of this section, we highlight three key elements of RP: compilation of concurrent constraint automata into concurrent automata without constraints, generation of concurrent policies from the compiled concurrent automata, and online execution of the concurrent policies.

Compiling Away Indirect Interactions: Indirect control and effects substantially complicate planning. We solve the problem of indirect control and effects at compile-time, by mapping the plant CCA to an equivalent automaton that only has direct effects. In particular, the transition labels of a CCA contain conditions involving dependent variables that are not directly controllable, but depend on control variables that are related through the mode constraints Σ and the interconnections Q. In the slightly modified driver/valve example of Fig. 22, the valve transitions are conditioned on assignments to the dependent variable vcmd, which is linked to control variable dcmd by the interconnection constraints and the mode constraints of the driver [see Fig. 22(a)].

To eliminate indirect control, we eliminate dependent variables from each transition label, by constructing an equivalent label in terms of control and mode variables alone. For example, we replace the valve's (vcmd = open) transition label with the pair of conditions {driver = on, dcmd = open} [see Fig. 22(b)]. Once the dependent variables have been eliminated from the labels, the mode constraints Σ and interconnections Q are no longer needed, and hence are removed, as shown in Fig. 22(b). The result is a set of concurrent automata without constraints, that are conditioned on control variable assignments (called control conditions) and mode variable assignments (called state conditions).

Compilation is performed by a restricted form of prime implicant generation that is described in [4]. While prime implicant generation is NP-hard, in practice even sizable sets of implicants can be generated very fast. We use OpSat [21] to perform a form of abductive best-first search for prime implicants in the plant model. This allows us to generate implicants from a spacecraft model consisting of over 12 000 clauses in about 40 s on a Sparc 20. We compute these implicants at compile-time, thus avoiding the impact of these computations on run-time performance.
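The label rewriting can be illustrated on the driver/valve pair: given the driver's mode constraints as an input-output function, enumerate the (mode, command) conditions that entail a desired value of the dependent variable vcmd. This toy enumeration stands in for the restricted prime-implicant generation actually used; all names are illustrative:

```python
# Toy label compilation for the driver/valve pair: driver_output plays
# the role of the driver's mode constraints, and compile_guard finds
# every (mode, command) condition that entails a desired value of the
# dependent variable vcmd.

def driver_output(mode, dcmd):
    """vcmd produced by the driver, per its mode constraints."""
    return dcmd if mode == "On" else "no-cmd"

def compile_guard(target_vcmd,
                  modes=("On", "Off", "Resettable"),
                  cmds=("open", "close", "no-cmd")):
    return [{"Driver": m, "dcmd": c}
            for m in modes for c in cmds
            if driver_output(m, c) == target_vcmd]
```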

Generating Concurrent Policies: Once the constraints have been compiled out of the concurrent automata, we use these automata to generate control code for each component, forming a set of concurrent feasible policies. For each possible current mode assignment and goal mode assignment, a feasible policy maps the pair to the ordered list of state conditions and control conditions that must be achieved in order to transition from the current assignment to the goal assignment (see Fig. 23). For example, given the goal state (valve = open) and current state (valve = closed), the valve policy says that the plan must first achieve the subgoal (driver = on).
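A feasible policy can be pictured as a small lookup table. The encoding below is illustrative (actual policies are generated from the compiled automata, and the entries paraphrase Fig. 23):

```python
# The valve's feasible policy: for each (current mode, goal mode) pair,
# the state conditions (subgoals) that must hold first and the control
# conditions that then trigger the transition.
VALVE_POLICY = {
    ("closed", "open"): {"state": [("driver", "on")],
                         "control": [("dcmd", "open")]},
    ("open", "closed"): {"state": [("driver", "on")],
                         "control": [("dcmd", "close")]},
}

# Goal (valve = open) from current (valve = closed): the entry says the
# plan must first achieve the subgoal (driver = on), then issue dcmd = open.
entry = VALVE_POLICY[("closed", "open")]
print(entry["state"], entry["control"])
# [('driver', 'on')] [('dcmd', 'open')]
```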

To generate such policies, RP imposes the following requirements on the compiled plant model.

1) Each control variable has an idling assignment, and no idling assignment appears in any transition. The label of every transition includes a (nonidling) control assignment or a mode assignment subgoal.

2) No set of control assignments of one transition is a proper subset of the control assignments of a different transition.

3) The causal graph for the compiled concurrent automaton must be acyclic. A causal graph for a compiled plant model is a directed graph whose vertices are the state variables of the plant. The graph contains an edge from one variable to another if the first appears in the label of any of the second's transitions. In the compiled model for the driver/valve example, the state variable driver appears in two of the transitions for the valve component, so we draw an edge from driver to valve in the causal graph (see Fig. 24).
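Requirement 3 and the topological numbering it supports can be sketched as follows. The encoding is illustrative: variables are vertices, an edge runs from u to w when u appears in one of w's transition labels, and a depth-first search numbers each variable "on the way out" (postorder). Goals are then solved serially in increasing number, so downstream variables are attempted first and subgoals move upstream.

```python
# Causal graph of the compiled driver/valve model: driver appears in
# the valve's transition labels, so driver -> valve.
edges = {"driver": ["valve"], "valve": []}

def topological_numbers(edges):
    """Assign topological numbers by DFS postorder ("on the way out")."""
    number, counter, seen = {}, [1], set()
    def dfs(v):
        seen.add(v)
        for w in edges[v]:
            if w not in seen:
                dfs(w)
        number[v] = counter[0]   # numbered on the way out of the search
        counter[0] += 1
    for v in edges:
        if v not in seen:
            dfs(v)
    return number

print(topological_numbers(edges))   # {'valve': 1, 'driver': 2}
```

The valve, which is downstream, receives the smaller number, so a goal on the valve is selected before a goal on the driver, matching the serialized, upstream goal ordering described in the text.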

The basic idea behind these policies is that we wish to solve a conjunction of goal assignments by achieving conjuncts one at a time (that is, serially) and by ordering goal achievement by working "upstream" along the acyclic causal graph, from outputs to control variables. This upstream ordering corresponds to a topological numbering. We establish a topological order among component mode variables by performing a depth-first search of the causal graph at compile-time, and numbering variables on the way out. For our example, this process determines the topological order indicated in Fig. 24. By encoding serialized goals in the policies, we eliminate the potential for conflicting subgoal interactions, thus guaranteeing that RP generates nondestructive action sequences. Proof of this fact is provided in [4]. Achieving subgoals serially also ensures that RP always makes progress toward the goal (Desideratum 4).

Fig. 22 Compilation example for a driver/valve pair. The uncompiled concurrent constraint automata are shown in (a) and the compiled concurrent automata are shown in (b).

Fig. 23 Policies for the driver (top) and valve (bottom).

Fig. 24 A causal graph for the driver/valve pair. This causal graph is acyclic. Topological numbers are circled.

232 PROCEEDINGS OF THE IEEE, VOL. 91, NO. 1, JANUARY 2003

For a mode variable, the size of its policy and the time to compute it are bounded by polynomials in the variable's domain size and in the maximum number of transitions in a single component constraint automaton. Entries for each current/goal pair are determined by computing a spanning tree, directed toward the goal mode, that connects all reversibly reachable modes by reversible transitions.

RP's concurrent policies are analogous to traditional optimal policies in control theory and universal plans in AI. An important difference is that Titan's RP constructs a set of concurrent policies, rather than a single policy for the complete state space. A traditional policy grows exponentially in the number of state variables, and is infeasible for models such as a spacecraft, which contain more than 80 state variables. In contrast, Titan's concurrent policies grow only linearly in the number of state variables, and have the additional feature of preserving modularity.

Fig. 25 ComputeNextAction algorithm.

Online Policy Execution: Once the concurrent policies have been generated at compile-time, planning is performed using the online reactive planning function ComputeNextAction, presented in Fig. 25. ComputeNextAction takes as input the current state estimate, a goal state, the compiled concurrent policies, and the flag top?, used to indicate a top-level call. ComputeNextAction returns either a next control action, Failure if no plan exists, or Success if the current state is the goal state. The assignments in the goal are sorted by topological number, providing an upstream order in which the goals may be solved serially. A control action is a partial set of control assignments. All control variables not mentioned are assigned their idle value.

Step 1 of ComputeNextAction tests whether or not a conjunction of top-level goals can be achieved. This operation can be performed very quickly, because the set of reversibly reachable modes has already been computed during the offline CCA compilation process. Since the RP only introduces subgoal assignments that are guaranteed to be achievable, the test is needed at the top level only. Step 2 works upstream along the causal graph of variables, selecting the next unsatisfied (sub)goal assignment to be achieved. Step 3 takes the first step toward achieving the selected (sub)goal. Given a current assignment, a goal assignment is achieved by traversing a path along the transitions of the variable's automaton from the current assignment to the goal. Step 3 identifies the first transition along this path. Step 4 recursively calls ComputeNextAction on the subgoals comprising the state conditions of this first transition, which results in the RP moving further upstream. If one or more state conditions of the transition is unsatisfied, then a next control action is computed in Step 4, and returned. If all state conditions are satisfied, then the transition is ready to be executed, and Step 4 returns the transition's control conditions as the next control action.
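Steps 2 through 4 can be sketched as a recursive lookup over per-component policy tables. This is a simplified, hypothetical rendering, not Titan's implementation: it omits Step 1's reachability test, failure handling, and the traversal index, and the table contents paraphrase Fig. 23.

```python
# Simplified policy tables (illustrative): each maps
# (current mode, goal mode) to state conditions and control conditions.
POLICIES = {
    "driver": {("off", "on"): {"state": [], "control": [("dcmd", "on")]},
               ("on", "off"): {"state": [], "control": [("dcmd", "off")]}},
    "valve": {("open", "closed"): {"state": [("driver", "on")],
                                   "control": [("dcmd", "close")]}},
}
ORDER = {"valve": 1, "driver": 2}  # topological numbers (valve downstream)

def compute_next_action(state, goal, policies, order):
    """Return the next control action as a dict of control assignments,
    or "success" if all goal assignments already hold."""
    # Step 2: first unsatisfied goal assignment, in topological order.
    pending = [(v, m) for v, m in goal if state[v] != m]
    if not pending:
        return "success"
    var, mode = min(pending, key=lambda g: order[g[0]])
    # Step 3: first transition on the path from state[var] to the goal mode.
    entry = policies[var][(state[var], mode)]
    # Step 4: recursively achieve unsatisfied state conditions (moving
    # upstream); if none remain, issue this transition's control conditions.
    subgoals = [(v, m) for v, m in entry["state"] if state[v] != m]
    if subgoals:
        return compute_next_action(state, subgoals, policies, order)
    return dict(entry["control"])

state = {"driver": "off", "valve": "open"}
goal = [("valve", "closed"), ("driver", "off")]
print(compute_next_action(state, goal, POLICIES, ORDER))  # {'dcmd': 'on'}
```

With the driver off and the valve open, the valve goal is selected first, its state condition (driver = on) is unsatisfied, and the recursion bottoms out by issuing (dcmd = on), mirroring the worked example in the text.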

For example, suppose mode estimation identifies the current state as (driver = off, valve = open), and goal interpretation provides the goal (valve = closed, driver = off), which is determined to be reversibly reachable (Step 1). Step 2 selects the first goal assignment in the topological order, valve = closed. Step 3 looks up open → closed in the valve policy (see Fig. 23), and gets the state condition (driver = on) and the control condition (dcmd = close). The first is an unsatisfied state condition, so Step 4 recursively calls ComputeNextAction with (driver = on) as the goal. Looking up off → on in the driver policy, we get the control condition (dcmd = on). This control assignment is returned by ComputeNextAction and is immediately invoked.

For a fixed goal and no intervening failures, RP generates successive control actions as a depth-first traversal through the policy tables. This traversal maps out a subgoal tree, and RP maintains an index into this tree during successive calls. To generate a sequence, RP traverses each tree edge exactly twice, and generates one control action per vertex. Since the number of edges in a tree is bounded by the number of vertices, the average case complexity of generating each control action is roughly constant.⁷ Thus, Desideratum 5 (reactivity) is achieved.

Failure States and Repair: A key function of RP is to handle failure. To accomplish this, we incorporate repair actions. The occurrence of failures is outside the reactive planner's control, since there are no nominal transitions that lead to failure. Hence, a repair sequence is irreversible, albeit essential, and so is not covered by the development thus far. Titan's RP permits repair sequences, which are selected so as to minimize the number of irreversible steps taken. RP never uses a failure to achieve a goal assignment if the failure is repairable. However, if it is not repairable, then RP is allowed to exploit the component's faulty state. For example, suppose a valve needs to be open, and it is permanently stuck open. Since stuck-open is irreparable but has the desired effect, RP exploits the failure mode. The algorithm used by RP to invoke appropriate repair actions is presented in [4].

Returning to our driver/valve example, suppose mode estimation reports that the driver has just turned on as a result of our last call to ComputeNextAction. The current and goal assignments of the valve are unchanged; hence, the entry with state condition (driver = on) and control condition (dcmd = close) is revisited. However, the first conjunct, (driver = on), is now satisfied. Since no state conditions remain, RP issues (dcmd = close) as the next action. Next, assume mode estimation reports that a partial failure occurred after the last command was issued, such that the current state is (driver = resettable, valve = closed). In its next call to ComputeNextAction, RP determines that the first unachieved goal assignment is now (driver = off). Looking up resettable → off in the driver policy (see Fig. 23) results in the repair action (dcmd = off).

⁷The caveat is that at each node all of the assignments of a transition's label may be visited, in order to find the next one that needs to be achieved. A label typically contains only a couple of assignments; thus, its size can generally be bounded by a small constant, but in the worst case it is on the order of the number of concurrent automata.



VII. DISCUSSION

The RMPL compiler is written in C++. It generates HCA from RMPL control programs. The compiler supports all RMPL primitive combinators, and a variety of derived combinators. Plant models are currently written in MPL, the modeling language used for Livingstone. Compilation of the models into CCA is performed using Livingstone's Lisp-based compiler. The Titan model-based executive is written in C++, and builds on the OpSat system [21]. Titan is being demonstrated on mission scenarios for NASA's MESSENGER mission, and will be flight demonstrated on the MIT SPHERES spacecraft inside the International Space Station. Titan is a superset of the Livingstone system [3], which was demonstrated in flight on the DS-1 mission, and on a range of testbeds for NASA, the U.S. Navy, and industry, including a space interferometer [9], a Mars in situ propellant production plant [10], and a Mars rover prototype [28].

Turning to related work, the model-based programming paradigm synthesizes ideas underlying synchronous programming, concurrent constraint programming, traditional robotic execution, and Markov models. Synchronous programming languages [11], [12] were developed for writing control code for reactive systems. Synchronous programming languages exhibit logical concurrency, orthogonal preemption, multiform time, and determinacy, which Berry has convincingly argued are necessary characteristics for reactive programming. RMPL is a synchronous language, and satisfies all these characteristics. One major goal of synchronous programming is to provide "executable specifications," that is, to eliminate the gap between the specifications about which we prove properties and the programs that are supposed to implement these specifications. We carry this one step further by performing our reasoning on executable programs directly, in real time.

Model-based programming and concurrent constraint programming share common underlying principles, including the notion of computation as deduction over systems of partial information [29]. RMPL extends constraint programming with a paradigm for exposing hidden states, a replacement of the constraint store with a deductive controller, and a unification of constraint-based and Markov modeling. This provides a rich approach to managing discrete processes, uncertainty, failure, and repair.

RMPL and Titan also offer many of the goal-directed tasking and monitoring capabilities of AI robotic execution languages, like RAPS [26] and ESL [27]. One key difference is that RMPL's constructs fully cover synchronous programming, hence moving toward a unification of a goal-directed AI executive with its underlying real-time language. In addition, Titan's deductive controller handles a rich set of system models, moving execution languages toward a unification with model-based autonomy.

We are pursuing a number of extensions to the model-based programming language and executive presented here. For example, in [7] we extend the model-based programming paradigm to include fast temporal planning. In [30], we extend the mode estimation capability of the executive to handle hybrid concurrent constraint automata, whose constraints are linear ordinary differential equations. Finally, we are extending RMPL to coordinate heterogeneous cooperative systems, such as fleets of unmanned air vehicles and Mars explorers [31], by unifying high-level mission planning with agile path planning.

APPENDIX

DERIVED RMPL COMBINATORS

The following useful derived combinators can be constructed from the primitive constructs presented in Section IV-B2:

next A: This expression starts executing expression A in the next instant. It is equivalent to {if true thennext A}.

if c thennext A elsenext B: This extends if c thennext A. Expression B is executed starting in the next instant if c is not entailed by the most likely current state. c is an ask constraint on the variables of the physical plant. This expression is equivalent to {if c thennext A, unless c thennext B}.

when c donext A: This is a temporally extended version of if c thennext A. It waits until constraint c is entailed by the most likely plant state, then starts executing A in the next instant. c is an ask constraint on the variables of the physical plant. The expression reduces to the primitive combinators always, if-thennext, and do-watching, together with a nonphysical state assertion introduced for guarding against starting A after the first instant in which c is entailed; this assertion is trivially achieved when asserted.

whenever c donext A: This is an iterated version of when c donext A. For every instant in which constraint c holds for the most likely state, it starts program A in the next instant. c is an ask constraint on the variables of the physical plant. This expression is equivalent to {always if c thennext A}.
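Three of these reductions can be written down mechanically as rewrites into the primitive combinators. The tuple encoding and constructor names below are illustrative, not the RMPL compiler's internal representation:

```python
# Derived combinators as syntactic rewrites into primitives, with RMPL
# expressions encoded as nested tuples (hypothetical encoding).

def Next(A):
    # next A  ==  if true thennext A
    return ("if_thennext", "true", A)

def IfThenNextElseNext(c, A, B):
    # if c thennext A elsenext B  ==  {if c thennext A, unless c thennext B}
    return ("parallel",
            ("if_thennext", c, A),
            ("unless_thennext", c, B))

def WheneverDoNext(c, A):
    # whenever c donext A  ==  always if c thennext A
    return ("always", ("if_thennext", c, A))

print(WheneverDoNext("driver = failed", ("reset", "driver")))
# ('always', ('if_thennext', 'driver = failed', ('reset', 'driver')))
```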

ACKNOWLEDGMENT

The authors would like to thank A. Barrett, L. Fesq, B. Laddaga, M. Pekala, R. Ragno, H. Shrobe, G. Sullivan, J. Van Eepoel, D. Watson, A. Wehowsky, and D. Weld. They extend a special acknowledgment to V. Gupta for his contributions to the development of model-based execution.

REFERENCES

[1] T. Young et al. (2000) Report of the Mars Program Independent Assessment Team. NASA, Washington, DC. [Online]. Available: http://www.nasa.gov/newsinfo/marsreports.html

[2] J. Casani et al., "Report on the Loss of the Mars Polar Lander and Deep Space 2 Missions," NASA Jet Propulsion Laboratory, Pasadena, CA, JPL D-18 709, 2000.

[3] B. C. Williams and P. Nayak, "A model-based approach to reactive self-configuring systems," in Proc. 13th Nat. Conf. Artif. Intell. (AAAI-96), vol. 2, 1996, pp. 971–978.

[4] ——, "A reactive planner for a model-based executive," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI-97), vol. 2, 1997, pp. 1178–1185.

[5] M. Ingham, R. Ragno, and B. C. Williams, "A reactive model-based programming language for robotic space explorers," presented at the Int. Symp. Artif. Intell., Robotics, Automation Space (ISAIRAS-01), Montreal, QB, Canada, 2001.



[6] S. Chung, J. Van Eepoel, and B. C. Williams, "Improving model-based mode estimation through offline compilation," presented at the Int. Symp. Artif. Intell., Robotics, Automation Space (ISAIRAS-01), Montreal, QB, Canada, 2001.

[7] P. Kim, B. C. Williams, and M. Abramson, "Executing reactive, model-based programs through graph-based temporal planning," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI-01), vol. 1, 2001, pp. 487–493.

[8] D. Bernard et al., "Design of the Remote Agent Experiment for spacecraft autonomy," presented at the IEEE Aerosp. Conf., Aspen, CO, 1999.

[9] M. Ingham et al., "Autonomous sequencing and model-based fault protection for space interferometry," presented at the Int. Symp. Artif. Intell., Robotics, Automation Space (ISAIRAS-01), Montreal, QB, Canada, 2001.

[10] C. Goodrich and J. Kurien, "Continuous measurements and quantitative constraints—challenge problems for discrete modeling techniques," presented at the Int. Symp. Artif. Intell., Robotics, Automation Space (ISAIRAS-01), Montreal, QB, Canada, 2001.

[11] G. Berry and G. Gonthier, "The synchronous programming language ESTEREL: Design, semantics, implementation," Sci. Comput. Program., vol. 19, no. 2, pp. 87–152, 1992.

[12] D. Harel, "Statecharts: A visual formalism for complex systems," Sci. Comput. Program., vol. 8, no. 3, pp. 231–274, 1987.

[13] R. Davis, “Diagnostic reasoning based on structure and behavior,”Artif. Intell., vol. 24, pp. 347–410, 1984.

[14] L. Fesq et al., "Model-based autonomy for the next generation of robotic spacecraft," in Proc. 53rd Int. Astronautical Cong. Int. Astronautical Federation (IAC-02), 2002, Paper IAC-02-U.5.04.

[15] V. Saraswat, "The category of constraint systems is cartesian-closed," presented at the 7th IEEE Symp. Logic Comput. Sci., Santa Cruz, CA, 1992.

[16] V. Saraswat, R. Jagadeesan, and V. Gupta, "Foundations of timed default concurrent constraint programming," presented at the 9th IEEE Symp. Logic Comput. Sci., Paris, France, 1994.

[17] B. C. Williams, S. Chung, and V. Gupta, "Mode estimation of model-based programs: Monitoring systems with complex behavior," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI-01), vol. 1, 2001, pp. 579–590.

[18] N. Halbwachs, Synchronous Programming of Reactive Systems, ser. Series in Engineering and Computer Science. Norwell, MA: Kluwer, 1993.

[19] J. de Kleer and B. C. Williams, "Diagnosing multiple faults," Artif. Intell., vol. 32, no. 1, pp. 97–130, 1987.

[20] ——, "Diagnosis with behavioral modes," in Proc. Int. Joint Conf. Artif. Intell. (IJCAI-89), 1989, pp. 1324–1330.

[21] B. C. Williams and R. J. Ragno, “Conflict-directed A* and its rolein model-based embedded systems,” J. Discrete Appl. Math., to bepublished.

[22] P. Nayak and B. C. Williams, "Fast context switching in real-time propositional reasoning," in Proc. 14th Nat. Conf. Artif. Intell. (AAAI-97), 1997, pp. 50–56.

[23] D. S. Weld, "An introduction to least commitment planning," AI Mag., vol. 15, no. 4, pp. 27–61, 1994.

[24] G. Friedrich and W. Nejdl, "Choosing observations and actions in model-based diagnosis/repair systems," in Proc. 3rd Int. Conf. Principles Knowledge Representation Reasoning (KR-92), 1992, pp. 489–498.

[25] R. Fikes and N. Nilsson, "STRIPS: A new approach to the application of theorem proving to problem solving," Artif. Intell., vol. 2, pp. 189–208, 1971.

[26] R. J. Firby, “The RAP language manual,” Univ. Chicago, Chicago,IL, Animate Agent Project Working Note AAP-6, 1995.

[27] E. Gat, "ESL: A language for supporting robust plan execution in embedded autonomous agents," in Plan Execution: Problems and Issues, Papers 1996 AAAI Fall Symp., 1996, pp. 59–64.

[28] J. Kurien, P. Nayak, and B. C. Williams, “Model-based autonomy forrobust Mars operations,” presented at the 1st Int. Conf. Mars Soc.,Boulder, CO, 1998.

[29] V. Gupta, R. Jagadeesan, and V. Saraswat, "Models of concurrent constraint programming," in Lecture Notes in Computer Science, CONCUR '96: Concurrency Theory. Heidelberg, Germany: Springer-Verlag, 1996, vol. 1119, pp. 66–83.

[30] M. Hofbaur and B. C. Williams, "Mode estimation of probabilistic hybrid systems," in Lecture Notes in Computer Science, Hybrid Systems: Computation and Control. Heidelberg, Germany: Springer-Verlag, 2002, vol. 2289, pp. 253–266.

[31] B. C. Williams et al., "Model-based reactive programming of cooperative vehicles for Mars exploration," presented at the Int. Symp. Artif. Intell., Robotics, Automation Space (ISAIRAS-01), Montreal, QB, Canada, 2001.

Brian C. Williams (Member, IEEE) received the S.B., S.M., and Ph.D. degrees in computer science and artificial intelligence from the Massachusetts Institute of Technology (MIT), Cambridge, in 1981, 1984, and 1989, respectively.

From 1989 to 1994, he was with the Xerox Palo Alto Research Center, Palo Alto, CA, where he pioneered multiple-fault, model-based diagnosis in the 1980s and model-based autonomy in the 1990s through the Livingstone model-based health management and the Burton model-based execution systems. From 1994 to 1999, he was with the NASA Ames Research Center, Moffett Field, CA, where he formed the Autonomous Systems Area and coinvented the Remote Agent model-based autonomous control system. He was a member of the NASA Deep Space One flight team, which used Remote Agent to create the first fully autonomous, self-repairing explorer, demonstrated in flight in 1999. He was a member of the Tom Young Blue Ribbon Team in 2000, assessing future Mars missions in light of the Mars Climate Orbiter and Polar Lander incidents. He is currently a Professor at MIT, where he formed the Model-Based Embedded and Robotic Systems group, focusing on the development of model-based programming technology and its deployment to robotic explorers and cooperative systems.

Dr. Williams is currently a Member of the Advisory Council of the NASA Jet Propulsion Laboratory, Pasadena, CA. He received a NASA Space Act Award in 1999 for his work in the Autonomous Systems Area and with the Remote Agent control system.

Michel D. Ingham received the B.Eng. degree in mechanical engineering (honors program) from McGill University, Montreal, Canada, in 1995 and the S.M. degree from the Department of Aeronautics and Astronautics at the Massachusetts Institute of Technology (MIT), Cambridge, in 1998. He is a Ph.D. degree candidate at MIT.

In 1999, he spent a semester at the NASA Jet Propulsion Laboratory, Pasadena, CA, analyzing flight data from the Interferometry Program Experiment, a technology demonstration precursor for NASA's Space Interferometry Mission. In 2001, he coordinated the MIT Model-Based Embedded and Robotic Systems group's involvement in the New Millennium Program ST-7 autonomy concept definition study. He is currently a Research Assistant with the MIT Space Systems Laboratory, working with Prof. Brian Williams, and a Research Affiliate with the Artificial Intelligence Group, NASA Jet Propulsion Laboratory. His current research interests are in the field of model-based autonomy, focusing on hybrid model-based programming as an approach to robust sequencing and control of robotic spacecraft.

Mr. Ingham received the International Astronautical Federation's 1998 Luigi G. Napolitano Award for significant contribution to aerospace research, for his research in the area of microdynamics of deployable space structures.



Seung H. Chung received the B.S. degree magna cum laude from the Department of Aeronautics and Astronautics Engineering at the University of Washington, Seattle, in 1999. He is completing a Master of Science degree at the Massachusetts Institute of Technology (MIT), Cambridge, and has been awarded a NASA Graduate Student Research Program fellowship for his doctoral studies.

He was in the Propulsion Flight Systems Group, NASA Jet Propulsion Laboratory, Pasadena, CA, participating in the design of the Mars Sample Return mission. In 2002, he worked with the Artificial Intelligence Group at the NASA Jet Propulsion Laboratory on the development of a distributed model-based executive. He is currently with the MIT Space Systems and Artificial Intelligence Laboratories, working with Prof. Brian Williams. His current research interests include model-based autonomy and its application to space missions, and the development of Titan, a model-based executive capable of autonomously estimating the state of spacecraft, diagnosing and repairing faults, and executing commands.

Paul H. Elliott received the S.B. degree in aeronautical and astronautical engineering and the S.B. degree in electrical engineering and computer science from the Massachusetts Institute of Technology (MIT), Cambridge, in 2001. He is currently working toward the S.M. degree in aeronautical and astronautical engineering at MIT, and intends to pursue the Ph.D. degree there.

He is currently a Research Assistant working with Prof. Brian Williams. His research interests include graphics, software engines, simulators, and compilation. He is currently working on model compilation.
