Thesis Proposal:

Reasoning About and In Time when Building Plans

for Safe, Fully-Automated Aircraft Flight

Ella M. Atkins

University of Michigan

1101 Beal Ave.

Ann Arbor, MI 48109

Co-advisors:

Edmund H. Durfee and Kang G. Shin

Thesis Committee:

Edmund Durfee, Kang Shin,

Dan Koditschek, Mike Wellman, and N. Harris McClamroch

ABSTRACT

Achieving safe, fully-automated control of a complex system requires fast, accurate

responses to maintain safety while also driving the system toward its objective (goal).

Researchers from the planning, real-time systems, and control systems fields have different

definitions of success. Planning researchers concentrate on high-level goal achievement,

using discrete-valued features to represent knowledge and a variety of search engines to

build action plans based on possible worlds that might be reached. Real-time researchers

consider problems from the resource requirement and scheduling perspective. Control

systems researchers use mathematical models in conjunction with sensor feedback to

determine actuator commands. I feel it is crucial to interface these fields in a way that

highlights the strengths of each. I propose a system that uses an AI planner to build a high-

level plan, which is then explicitly allocated resources by a real-time scheduler. This plan

will dictate a feasible set of state trajectories which are then achieved by a controller. One

big challenge in this endeavor is identifying an appropriate interface language. For my

research, I plan to concentrate on two questions, “How can the planner identify trajectories

that are feasible for the controller(s)?”, and “How can we best consider the real-time issues

associated with planning, scheduling, and executing these plans in a dynamic

environment?”

I plan to conduct all my thesis research in the context of the Cooperative Intelligent Real-

time Control Architecture (CIRCA), which combines a traditional AI planner, scheduler,

and real-time plan execution module to provide guaranteed performance when controlling

complex real-world systems. I propose to extend CIRCA by imposing planning execution

time bounds, and by implementing a plan storage system so that CIRCA can achieve a

near-optimal balance between online (time-constrained but reactive) and offline planning.

I propose to study these research issues by using CIRCA to help achieve safe, fully-

automated aircraft flight. This domain is certainly complex, given its highly nonlinear

dynamic properties and the criticality of real-time response to avoid a crash in the worst

case. During flight, safety involves not only reacting to ordinary circumstances, but also

reacting to a daunting set of anomalies. Today’s Flight Management Systems can already

completely control an aircraft under ordinary circumstances. I have augmented CIRCA so

that it can detect and respond to a variety of anomalies, and have begun testing CIRCA’s

ability to control an aircraft simulator. I propose that careful consideration of planning,

real-time, and control systems interfaces as well as associated temporal issues will move us

closer to making safe, fully-automated flight a reality.

=====================================================

CHAPTER 1

INTRODUCTION

=====================================================

Throughout our recorded existence, one attribute of the world has remained predictable --

the constant passage of time. Changes in time are easy to model and measure, as should be

the case since time has been the basis for centuries of work in mathematics, physics, and

engineering. Today’s real-time systems experts base much of their work on the precept

that deadlines can only be met by carefully allocating the available computational resources

to complete the tasks at hand. Control engineers carefully construct all their models so they

can guarantee a stable and predictable system response using both sensor feedback and

their prediction of how the system state may change as a function of time.

At its inception, AI planning research focused only on modeling discrete changes in high-

level quantities, such as those found in the “blocks world” and “robot planning” STRIPS

examples [8], rather than modeling them as functions of time. Today’s AI researchers have

recognized the importance of accurately handling time during planning, and have responded

via mechanisms such as those that impose limits on planner deliberation time. However,

many planners still rely heavily on the assumption that the world may be modeled by a set

of highly discretized features, and that accurate world models can be constructed from a set

of state transitions that do not explicitly consider either the relative or absolute passage of time.

In such models, if the feature discretizations are “natural” (i.e., the quantity is completely

modeled because it was discrete by nature, as in an aircraft model feature called “gear

status” with values “up” or “down”), then perhaps the planner can get away with not

modeling time explicitly. However, often discretized feature value boundaries are artificial

devices used to promote tractability when modeling or working with continuous quantities

(e.g., fuel quantity in an aircraft), in which case much information is lost.
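The information loss from artificial discretization can be illustrated with a minimal Python sketch. The bin boundaries, labels, and function name below are hypothetical, chosen only for illustration; they are not taken from any CIRCA knowledge base.

```python
from bisect import bisect_right

# Hypothetical bin boundaries (in gallons) for a discretized "fuel" feature.
FUEL_BOUNDS = [10.0, 30.0]
FUEL_LABELS = ["low", "medium", "high"]

# A naturally discrete feature loses nothing when modeled this way,
# because it only ever takes these values.
GEAR_STATUS_VALUES = ("up", "down")


def discretize_fuel(gallons: float) -> str:
    """Map a continuous fuel quantity onto a coarse discrete label."""
    return FUEL_LABELS[bisect_right(FUEL_BOUNDS, gallons)]


# Two quite different fuel quantities collapse into the same planner state,
# so the planner cannot tell how soon the "low" boundary will be crossed.
nearly_low = discretize_fuel(10.5)
comfortable = discretize_fuel(29.5)
```

Both quantities map to the same label, even though one is about to cross into the "low" bin and the other is not; that difference is exactly what a time-aware model would need.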

Researchers have proposed planners using techniques such as Markov Decision Processes

(MDP) [6] to produce states corresponding with constant discrete time steps (∆t) in the real

world. And, in several simplified cases, these models can be shown to have desirable

properties, including computational tractability and the ability to accurately model changes

in discrete state features over time. However, a typical real-world domain model will

produce a very complex MDP [18]. Also, in some cases, the Markov assumption [27]

required for an MDP is difficult to satisfy, in which case the problem becomes “partially

observable” (a POMDP) and even more difficult to solve [5].
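As a rough illustration of a fixed-time-step MDP, the following Python sketch runs value iteration on a toy model. The states, actions, probabilities, rewards, and discount factor are all invented for illustration; each transition is assumed to consume one constant time step. This is a generic textbook technique, not the planner proposed in this document.

```python
GAMMA = 0.9  # discount factor (assumed)

# TRANSITIONS[state][action] -> list of (probability, next_state, reward).
# Every transition is assumed to take exactly one fixed time step.
TRANSITIONS = {
    "cruise": {
        "continue": [(0.8, "cruise", 0.0), (0.2, "low_fuel", 0.0)],
        "land": [(1.0, "landed", 1.0)],
    },
    "low_fuel": {
        "continue": [(0.5, "low_fuel", 0.0), (0.5, "crashed", -10.0)],
        "land": [(1.0, "landed", 1.0)],
    },
    "landed": {},   # absorbing goal state
    "crashed": {},  # absorbing failure state
}


def q_value(values, state, action):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + GAMMA * values[nxt])
               for p, nxt, r in TRANSITIONS[state][action])


def value_iteration(tolerance=1e-9):
    """Iterate Bellman backups until the largest value change is tiny."""
    values = {s: 0.0 for s in TRANSITIONS}
    while True:
        delta = 0.0
        for state, actions in TRANSITIONS.items():
            if not actions:
                continue  # absorbing states keep value 0
            best = max(q_value(values, state, a) for a in actions)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tolerance:
            break
    policy = {
        s: max(acts, key=lambda a: q_value(values, s, a))
        for s, acts in TRANSITIONS.items() if acts
    }
    return values, policy


values, policy = value_iteration()
```

Even this four-state toy needs an explicit probability and reward for every state-action pair, which hints at why a realistic domain model yields a very complex MDP.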

My academic research goal is to improve the capabilities of AI planning systems such that

they may accurately reason about all temporal characteristics associated with their domains,

where the planner’s “domain” actually includes both the plan execution system (e.g.,

computational resources, sensors, actuators, etc.) as well as the physical environment and

properties of the system to be controlled. The Cooperative Intelligent Real-time Control

Architecture (CIRCA) [20], [22] combines a traditional AI planner, scheduler, and real-

time plan execution module to provide guaranteed performance for the control of complex

real-world systems. With sufficient resources and perfect domain knowledge, CIRCA can

build and schedule control plans that, when executed, are assured of responding quickly

enough to any events so that CIRCA remains safe (its primary task) and whenever possible

reaches its goals. I chose CIRCA as a basis for my work because previous researchers

had explicitly designed the system to consider at least certain aspects of time -- not so much

from the “accurate world model” perspective I discuss above, but via explicit plan

scheduling and real-time plan execution modules to provide timeliness guarantees about the

actions to be executed. By improving the current knowledge representation and planning

algorithms, I hope to extend CIRCA so that it can build a sufficient world model within

time constraints, even for a complex dynamic system.

This research goal was developed as a prerequisite to fulfill a more comprehensive goal I

had prior to my arrival at Michigan: to help produce a system capable of safe, fully-

automated aircraft flight. Current Flight Management Systems (FMS) incorporate many concepts from the real-time and

control systems fields, but have very limited capabilities with regard to building plans and

reacting appropriately, particularly when anomalies (e.g., failed actuators, environmentally-

induced emergencies) arise. I believe my proposed work toward the complete and accurate

consideration of temporal aspects while planning will be crucial for the full automation of

aircraft, and in turn, I believe the complexity of the aircraft flight domain will help me better

identify aspects that need to be considered during the development of the time-sensitive

planner I propose in this document.

1.1 Problem Statement

When asked what I do for research, I typically respond with “I’m working to take the pilots

out of commercial aircraft”. This answer gets the attention of most everyone, and all are a

bit skeptical about the safety associated with fully automated aircraft, particularly scientists

and engineers. After all, who would want to fly in an aircraft with no pilot? The answer:

no one -- yet! I first came upon the fully-automated aircraft goal while training for my

private pilot license. I read, in shock, that the vast majority of aviation accidents, even in

commercial air carrier operations, are caused at least in part by pilot error. I asked myself

why pilot error would dominate the list of factors causing accidents. The answer contains

many aspects, ranging from pilot physical incapacitation to inadequate coordination among

the cockpit crew to a pilot’s lack of understanding of the aircraft’s systems. The FAA sets

stringent standards, but they cannot screen out all pilots who might possibly commit some

erroneous act.

To date, the technical approach has been to improve cockpit Flight Management Systems

(FMS) [17] to minimize pilot error in tasks which can be easily handled with available

technology. As a start, such systems were built so that a pilot need not worry about

mistakes in mundane tasks such as fuel calculations and holding an altitude during cruise.

As controls and computing technologies have improved, FMS have evolved into today’s

systems that are able to completely operate an aircraft from takeoff through landing. Given

this current capability, why is the pilot still around? In summary, airline pilots are around

to enhance safety and to coordinate with air traffic controllers (a task which is in the

process of being partially automated by others [4], [32]). Current FMS work fine under

normal flight circumstances, but the pilots still must manually reprogram or override the

FMS and fly manually when any of a great number of anomalies occur during flight.

Motivated by my desire to produce a safe FMS that does not require pilot intervention as a

“fallback position”, one of my major research goals is to identify classes of anomalies that

may be present during flight, as well as all classes of actions that may be selected to

adequately handle such anomalies, and to appropriately use all technologies required to

allow an autonomous aircraft to accurately invoke the proper responses for all anomalies as

well as or better than a good pilot. These goals will carry me well beyond the Ph.D.

process, but during my research at Michigan I hope to build a good technical foundation for

carefully building safety-oriented FMS flight plans that will eventually help produce such

an FMS or equivalent applicable for any pilot-operated vehicle, ranging from aircraft to

automobiles to robots exploring deep space or the ocean floor.

To successfully control a fully-automated complex system situated in a dynamic

environment, technical approaches from the fields of artificial intelligence (AI), real-time

systems, and control systems can be combined to form a highly robust and capable system.

As discussed earlier, real-time and control experts have based much of their best work on

the careful consideration of system behavior as a function of time; thus, I propose that time

is a basic key to the development of a robust system to build plans of action for any fully-

autonomous system. To attack this problem, I propose that much of my Ph.D. research

focus on the development of the necessary tools to help a planner consider all aspects of

time during deliberation. These temporal aspects include the following: 1) Meta-level

consideration of planning deliberation time, 2) Construction of a world model that

represents how all modeled domain features change in the world without loss of accuracy

due to discretization, and 3) Explicit scheduling of each plan to allow execution timing

guarantees for critical planned actions. Currently, architectures tend to focus on only one

or two of these aspects, but not all three. CIRCA already performs scheduling of critical

actions. I plan to augment CIRCA so it reasons about and effectively uses available

deliberation time, while accurately modeling domain feature changes over time. Of course,

the more accurate the domain model, the more deliberation time required, so appropriate

tradeoffs must be made when a planner strives for both accuracy and quick execution.

When CIRCA’s planner is augmented as described above, it will have the capability to

represent the temporal aspects required for effective deliberation, scheduling, and domain

modeling. However, to test its capabilities fully, I will need to interface to a dynamic

system with sufficient complexity. I wish to model the dynamic system in a feasible way,

such that I can maximize the computational capabilities of the overall system using the

CIRCA planning and scheduling algorithms along with the appropriate control systems

technology. Thus, when developing a CIRCA knowledge base, I must carefully specify

timings for each feature test and action, as was the case for previous versions of CIRCA,

which had a basic philosophy of “correctly” interfacing an AI planner with a real-time

scheduler and plan execution system. Now, to correctly interface the control system with

the planner and real-time system, I need to identify data to be shared between the real-time

system and controllers and also between the AI planner and controllers (e.g., knowledge

base features which allow reasoning about reference trajectories). Although my model of an

aircraft will not become very complex during my Ph.D. research, I will concentrate on

modeling select quantities “correctly” so that the aspects of temporal reasoning and

combination of planning, real-time, and control systems technology are effectively

demonstrated.

1.2 Technical Approach

Typically, specialists from the AI planning, real-time systems, and control systems

research fields have looked at problems from their specific area of expertise, treating

the other fields as “black boxes”. Because specialists in each field do not design universal

interfaces to their systems, researchers in a different field find it very difficult to interface

with their systems. For example, many control engineers assume high-level reference

trajectories are available, but they don’t provide a representation of their system which

might easily allow a planning system to reason about the controller’s capabilities and

limitations in a way to promote generation of exclusively feasible reference trajectories. AI

researchers often build a planner’s knowledge base using discrete features that have little

natural relation to the dynamics or time requirements associated with the real-world system

to be controlled.

For my research, I will consider interfaces among these three fields in terms of safely

controlling a piloted dynamical system, and will demonstrate research results on an aircraft

simulator. I will define what is meant by system “state” in the languages of a planner, real-

time scheduler, and classical controller. Briefly, a planner considers state as a certain set of

features that are true in the world, usually discrete in nature, but possibly using temporal

and probabilistic models. A real-time scheduler models the world as a set of computational

resources and a group of periodic and aperiodic tasks that must be allocated resources to

meet their deadlines. Finally, a typical control system models the world as a continuously

changing state vector together with sets of inputs and outputs. States include system quantities

such as continuous-valued position and velocity; inputs include the desired (or reference) state

and the current state, either directly measured or estimated from sensor measurements; and

outputs include commands to the system’s actuators.

I will assume that a group of state estimators and controllers exist that can successfully

observe and control the system within certain regions of the controller’s state space, and

that the designer of each controller has specified its capabilities and limitations. My work

begins by abstracting this information to the planner, maintaining the continuous nature of

each controller’s state-space only so much as is necessary for the planner to accurately

develop a new plan which considers the complete set of controller capabilities. I also

assume that each controller and state estimator will consume a known set of computational

resources and that these resources have already been allocated and scheduled. Under these

assumptions, the online scheduler need only be concerned with the resource requirements

associated with the execution of planned actions.

As discussed in Section 1.1, I will be performing my work in the context of CIRCA. In

CIRCA, reasoning about inherently continuous quantities has not been done extensively

to-date, although there is an implemented scheduler and a first-cut notion in the planner of

time-dependent transitions. I believe the key to incorporating these quantities and

addressing associated temporal issues will be found in the CIRCA planning module that

performs “action scoring” -- deciding which action to perform in a certain state. Basically,

the newly selected action must be applicable from that state, meaning that it must result in a

state which falls within the controllable state space region specified for that action’s

controller. This introduces a new challenge that is addressed later in this proposal:

Creating a representation of planner state that will be sufficiently expressive to provide the

values needed for action scoring as well as to maintain an accurate probability model.
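The applicability test described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the controllable region is taken to be an axis-aligned box, and the class names, the two-element state (altitude, airspeed), and the numeric bounds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned box in a controller's state space (assumed shape)."""
    lower: tuple
    upper: tuple

    def contains(self, state) -> bool:
        return all(lo <= x <= hi
                   for lo, x, hi in zip(self.lower, state, self.upper))


@dataclass(frozen=True)
class ControllerAction:
    name: str
    controllable: Region  # region declared by the controller's designer


def applicable(action: ControllerAction, resulting_state) -> bool:
    """An action is acceptable only if the state it leads to stays
    inside the region its controller is known to handle."""
    return action.controllable.contains(resulting_state)


# Hypothetical climb controller: state = (altitude_ft, airspeed_kt).
climb = ControllerAction("climb", Region((0.0, 60.0), (10000.0, 120.0)))
ok = applicable(climb, (3000.0, 90.0))
too_slow = applicable(climb, (3000.0, 40.0))
```

A real controllable region need not be a box; the point is only that the planner needs some abstracted description of the region that it can test cheaply during action scoring.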

A basic goal of this research is to implement a system which simultaneously addresses the

three temporal issues discussed in Section 1.1, including planner deliberation time, world

state changes over time, and scheduling of actions to meet deadlines. By starting with

CIRCA, I have in place a system which reasons about the computational resources during

plan execution -- the CIRCA planner computes action deadlines, then a schedule is built to

guarantee that critical actions happen fast enough. In the original CIRCA, the planner’s

deliberation time was reduced by clever expansion of only states that were reachable from

the initial state, but there were no algorithms implemented to reason about the planner’s

deliberation time. Also, state transition times were modeled using only worst-case constant

limits, but this assumption often resulted in overconservative scheduling, and there was

never any notion of absolute time, which, as I discuss in Section 3, is important for certain

classes of state feature sets. I propose a combination of MDP and the existing fast but

potentially inaccurate CIRCA state expansion algorithms to help achieve a good balance

between planner deliberation time and level of temporal modeling accuracy present during

planning. In this manner, if a large amount of time is available for planning, an MDP-

based state model will be developed. Conversely, if the planner’s deliberation time is

short, the world model will be developed with minimal temporal modeling, as is done in

the current version of CIRCA.
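The intended tradeoff can be sketched as a simple selection rule. The function name, the linear cost model, and the numbers below are assumptions made for illustration; the actual policy for balancing deliberation time against modeling accuracy is part of the proposed research.

```python
def choose_world_model(time_available_s: float,
                       num_states_estimate: int,
                       mdp_cost_per_state_s: float) -> str:
    """Pick the MDP-based model only when its estimated build time fits
    within the available deliberation time; otherwise fall back to the
    fast state expansion with minimal temporal modeling."""
    mdp_time_s = num_states_estimate * mdp_cost_per_state_s
    return "mdp" if mdp_time_s <= time_available_s else "fast_expansion"


plenty_of_time = choose_world_model(60.0, 1000, 0.01)
rushed = choose_world_model(1.0, 1000, 0.01)
```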

For my thesis research, I will work with fully-automated aircraft only in simulation, since a

safe fully-automated system is still far from perfected. The simulator I have been using is

cheap (free), and uses a reasonably complex model of nonlinear aircraft dynamics. Also,

the source code is available, so additions to aircraft capabilities (e.g., to model anomalies)

and building an interface to CIRCA and any low-level controller modules are relatively

easy. Unfortunately, much of the work to completely identify the set of anomalies that may

occur during flight will remain after I graduate, but my goal is to identify a small and

realistic set of anomalies, and model those correctly to test the CIRCA capabilities and

the rather primitive control law set I will be using.

1.3 Proposal Outline

The purpose of this document is to summarize my research to-date, and to describe in detail

all research I hope to accomplish for my dissertation. To start, I describe the CIRCA

system and its proposed evolution (Section 2), starting with CIRCA as it existed when I

first came to Michigan, followed by a discussion of how I have augmented CIRCA to-date

and my vision for CIRCA upon completion of my dissertation work. I describe my

rationale for proposed changes and additions to CIRCA only briefly in Section 2, with

further clarification presented as I address more detailed issues regarding my research

goals.

I devote a section of this proposal to each of my basic research goals. In each of these

sections, I first describe relevant background and work completed, and how it may help me

achieve the specific goal at hand. Next, I describe work to be done during my dissertation

research. Since I cannot solve all the world’s problems in a few scant years, I finally

discuss major issues that will still remain when I’m done with my Ph.D. research. In

Section 3, I describe a method by which CIRCA may reason about the crucial aspects of

time while planning, including planning deliberation time, system state changes as a

function of time, and real-time issues associated with plan execution in a dynamic

environment. Section 4 concentrates on the integration of AI planning, real-time

scheduling, and control systems technologies. I include an outline for the methodology by

which important quantities may be transferred between modules, and describe how these

quantities may be used in the context of CIRCA.

Section 5 addresses my goal of helping achieve fully-automated aircraft flight. For my

thesis work, I will use an aircraft simulator to illustrate situations that must be modeled and

handled in CIRCA and to provide an interesting testbed for the implemented CIRCA

algorithms. Although I plan to eventually use my CIRCA research to help achieve

automated flight, I will consider the aircraft primarily as a testbed for my dissertation

research, since I will need many years to develop a model that includes all documented

anomalous situations.

Finally, Section 6 provides a summary of my proposed research, in terms of goals and

methods to achieve these goals, as well as items that have been completed versus still to be

completed. I also present an ordered list of small steps I will take to accomplish my

research goals, although these steps will remain without explicit deadlines since I have

traditionally been very bad at predicting research task resource requirements.

=====================================================

CHAPTER 2

CIRCA

=====================================================

I plan to conduct all thesis research within the context of the Cooperative Intelligent Real-

time Control Architecture (CIRCA). In this chapter, I describe the evolution of CIRCA,

including the system as I inherited it (Section 2.1), CIRCA as it operates today (Section

2.2), and the proposed architecture which will be operational before my Ph.D. work is

complete (Section 2.3). The first version of CIRCA had one basic purpose -- to combine a

traditional AI planner with a separate real-time system such that the planner could develop a

plan, schedule it, then execute it in guaranteed real time. To-date, I have modified CIRCA

to handle a variety of new circumstances, including multiple subgoals, “unhandled” states

(to be described), and probabilistic state transition models. Concurrent work [19] has

allowed parallel execution of the scheduler and planner. For my thesis work, I propose a

future version of CIRCA that will allow a more realistic treatment of temporal issues

associated with limited planning deliberation time and reasoning accurately about the world.

2.1 Background

The CIRCA system [20], [22] was designed to provide guarantees about system

performance even with limited sensing, actuating, and processing power. When

controlling a complex system in a dynamic environment, a real-time plan execution system

may not have sufficient resources to be able to react in all situations. Based on a user-

specified domain knowledge base, CIRCA’s main goal was to build a plan to keep the

system "safe" (i.e., avoid catastrophic failure) while working to achieve its goals if

possible. CIRCA then used its knowledge about system resources to build a schedule that

guaranteed meeting critical deadlines. This schedule was then executed on a separate real-

time processor. Figure 2-1 shows the general architecture of the CIRCA system. The AI

subsystem (AIS) contained the planner and scheduler. The "shell" around all AIS

operations consisted of meta-rules controlling a set of knowledge areas, similar to the PRS

architecture [11]. Working memory contained tasks to be executed, including planning and

downloading plans from the AIS to the real-time subsystem (RTS), which subsequently

executed the scheduled plan.

[Figure 2-1. CIRCA -- Original Version. Diagram labels: AI Subsystem (Planner, Scheduler, Meta-level Control Knowledge, Knowledge Base: initial state / goals, state transitions); Real-Time Subsystem (TAP Schedule, Environment Interface Functions); Environment (Sensors, Actuators); links: TAP schedules, handshake, data, control commands.]

The CIRCA domain knowledge base included a set of transitions which modeled how the

world can change over time, specification of the initial (or startup) state, and one goal state.

To minimize domain knowledge complexity, the world model (i.e., reachable set of states)

was created incrementally based on the initial state set and available transitions. Figure 2-2

shows a typical state set expanded during a planning cycle. CIRCA began planning by

selecting one of the initial states and building a list of descendants resulting from state

transitions. This original planner used lookahead search to select actions and backtracked if

the action did not ultimately help achieve a subgoal or avoid catastrophic failure (e.g.,

aircraft crash). Based on the assumption that it is infeasible to either build or schedule

Universal Plans [30] to handle all states (as discussed in [10]), CIRCA minimized planner

memory and time usage by expanding only states produced by transitions from initial states

or their descendants. State expansion terminated whenever all features of the specified goal

state were reached in at least one reachable state while avoiding all failure states.

The original CIRCA knowledge base contained three possible transition types: action,

event, and temporal. All CIRCA transitions had a set of preconditions, discrete feature-

value pairs that must be matched before that transition can occur, and a set of

postconditions, feature-value pairs that will be true after that transition occurs. Action

transitions corresponded with commands that CIRCA may explicitly execute (e.g., put

aircraft landing gear down), while temporal and event transitions corresponded to state

changes not initiated by CIRCA (e.g., collision-course air traffic appears). Each event

transition created a nondeterministic branch in state space -- at any time after the state

becomes true, the event may or may not happen. CIRCA always had to expand both the

branch where the event occurred and that where the event did not occur, since it had no

knowledge about likelihood of that event actually happening before some other transition

occurs. Temporal transitions were similar to the event transitions, except that there was a

nonzero constant minimum delay between the time a state was entered and the time that

transition could occur.
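The transition structure and the incremental expansion of reachable states can be sketched in Python as follows. This is a simplified illustration, not CIRCA's planner: it applies every matching transition breadth-first and omits action selection, backtracking, and goal or failure termination. The class and function names and the example features are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    kind: str               # "action", "event", or "temporal"
    preconditions: tuple    # ((feature, value), ...) that must all match
    postconditions: tuple   # ((feature, value), ...) true afterwards
    min_delay: float = 0.0  # only meaningful for temporal transitions

    def matches(self, state: dict) -> bool:
        return all(state.get(f) == v for f, v in self.preconditions)

    def apply(self, state: dict) -> dict:
        new_state = dict(state)
        new_state.update(self.postconditions)
        return new_state


def freeze(state: dict) -> tuple:
    """Hashable form of a state so visited states can be stored in a set."""
    return tuple(sorted(state.items()))


def expand_reachable(initial_states, transitions):
    """Expand only states reachable from the initial states."""
    seen = {freeze(s) for s in initial_states}
    queue = deque(initial_states)
    while queue:
        state = queue.popleft()
        for t in transitions:
            if t.matches(state):
                child = t.apply(state)
                key = freeze(child)
                if key not in seen:
                    seen.add(key)
                    queue.append(child)
    return seen


lower_gear = Transition("lower_gear", "action",
                        (("gear", "up"),), (("gear", "down"),))
traffic = Transition("traffic_appears", "event",
                     (("traffic", "none"),),
                     (("traffic", "collision_course"),))
reachable = expand_reachable([{"gear": "up", "traffic": "none"}],
                             [lower_gear, traffic])
```

States that no chain of transitions can produce from the initial state are never generated, which is the source of the memory and time savings described above.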

[Figure 2-2. States Expanded during Planning. Legend: I = Initial States, G = Goal State, D = Deadend States, F = Failure States; tt = temporal transition, ttf = temporal to failure, ac = action transition.]

Minimum delay until a transition may occur is particularly important when a temporal

transition to failure (TTF) is involved. In this case, CIRCA must schedule an action that

will be guaranteed to execute before that temporal transition has a chance of occurring,

effectively preempting the TTF (thus avoiding the catastrophic situation). CIRCA "plays it

safe" by assuming the action must be guaranteed to occur before the delay has passed. Note

that the notion of absolute preemption of any transition was only possible when an action

could be guaranteed to complete execution before the temporal transition’s minimum delay

passed, so event transitions could never be preempted and there was no notion of “event

transition to failure”.
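The preemption condition can be written as a simple timing check. The sketch below assumes a periodic test-and-act cycle in which the dangerous state may be entered just after the test has run, so the worst-case response is one full gap between invocations plus the action's worst-case execution time. That response model, the function names, and the numbers are assumptions for illustration.

```python
def worst_case_response_s(invocation_gap_s: float, wcet_s: float) -> float:
    """Upper bound on the time from entering a state to finishing the
    preempting action, assuming the state is entered just after a test."""
    return invocation_gap_s + wcet_s


def preempts_ttf(invocation_gap_s: float, wcet_s: float,
                 ttf_min_delay_s: float) -> bool:
    """True if the action is guaranteed to complete before the temporal
    transition to failure can possibly occur."""
    return worst_case_response_s(invocation_gap_s, wcet_s) < ttf_min_delay_s


safe = preempts_ttf(invocation_gap_s=2.0, wcet_s=0.5, ttf_min_delay_s=3.0)
unsafe = preempts_ttf(invocation_gap_s=3.0, wcet_s=0.5, ttf_min_delay_s=3.0)
```

An event transition has no minimum delay to compare against, which is why no such guarantee could be given for events.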

Once CIRCA finished expanding the set of reachable states, a list of planned actions and

states in which to execute each of these actions was compiled. This list was called a control

plan and was represented as a set of test-action pairs (TAPs). Typical tests to determine

system state involved reading sensors and comparing the sensed values with certain preset

thresholds, while actions involved sending actuator commands or transferring data between

CIRCA modules. CIRCA minimized the set of TAP tests using ID3 [24], using a list of all

states in which that TAP action should be executed as positive examples and all other

expanded states as negative examples. When the AIS planner created a TAP, it stored an

associated execution deadline, which was used by a distance-constrained scheduler [21] to

create a periodic TAP schedule that guaranteed system safety when TTFs were present. If the

scheduler was unable to create a schedule to support all deadlines, the AIS backtracked to

the planner, whose only option was to search through the other possible action sets that

could preempt all TTFs until either the scheduler was successful or else all possibilities

were exhausted, in which case the planner failed.
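A control plan built from test-action pairs can be sketched in Python as follows. The data structure, the sensor names, and the thresholds are hypothetical, and the sketch simply runs every TAP once in order; it does not model the distance-constrained scheduler or ID3 test minimization.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TAP:
    name: str
    test: Callable[[Dict[str, float]], bool]  # compares sensed values to thresholds
    action: Callable[[], str]                 # stands in for an actuator command
    deadline_s: float                         # execution deadline stored by the planner


def run_cycle(taps: List[TAP], sensed: Dict[str, float]) -> List[str]:
    """Execute one pass over the plan: fire each action whose test holds."""
    issued = []
    for tap in taps:
        if tap.test(sensed):
            issued.append(tap.action())
    return issued


plan = [
    TAP("lower_gear",
        test=lambda s: s["altitude_ft"] < 1000.0 and s["gear_down"] == 0,
        action=lambda: "gear_down",
        deadline_s=5.0),
    TAP("avoid_traffic",
        test=lambda s: s["traffic_range_nm"] < 5.0,
        action=lambda: "turn_right",
        deadline_s=2.0),
]

commands = run_cycle(plan, {"altitude_ft": 800.0, "gear_down": 0,
                            "traffic_range_nm": 12.0})
```

In the real system the schedule determines how often each TAP runs so that its deadline is met; here the deadline is carried only as data.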

Presuming the planner and scheduler were successful, the AIS downloaded the TAP plan to

the RTS, which acknowledged receipt of the plan, began executing the plan as specified in

the schedule, and notified the AIS when/if the goal state was reached, although in the

original CIRCA this message did not have great significance, since the one goal was

always the final system goal to be achieved, and there was little feedback describing what

went wrong if the goal was never reached.

2.2 Present-Day System

Figure 2-3 shows CIRCA as it exists today. Several differences from Figure 2-1 may be

noted, including the splitting of the AIS into separate modules, new feedback from the RTS

to AIS, and the presence of a new module called “Abstraction Subsystem (ABS)”.

Previously, the AIS was structured so that a meta-level set of Knowledge Areas (KAs)

were used to control a potentially complex set of tasks. However, we have seen over the

years that there just aren’t that many KAs (or individual posted tasks) to sort through,

particularly since many tasks must be executed in a specific sequence, even with complex

problem domains. Also, the KA system required a substantial amount of storage overhead,

slowing AIS execution as well as making the CIRCA code unnecessarily confusing for the

new user. This meta-level KA structure has now been removed from the code, and the AIS

has been split into two components: the “Planning Subsystem” and the “Scheduling

Subsystem”, as shown in Figure 2-3. As discussed in [19], the scheduler was split from

the planner so the two could execute in parallel, and the scheduler code was enhanced to

provide helpful numerical feedback to the AIS regarding plan schedulability, as opposed to

the “yes/no” answer given in the past. The planner has also been substantially modified, as

discussed below.

[Figure 2-3 (block diagram). Modules shown: Knowledge Base; Planning Subsystem (Planner); Scheduling Subsystem (Schedule Manager, Scheduling routines); Real-Time Subsystem (TAP plan executor, Environment Interface); Abstraction Subsystem (Abstractor, De-Abstractor); "Environment" (Sensors, Actuators, State Estimators, Controllers). Labels: initial state / goals; temporal/action transitions; TAP list w/ timings; plan downloading; status or TAP schedule; detected anomalies and execution status; TAP plans; feature value data; action commands; controller & actuator commands; sensor & state data.]

Figure 2-3. CIRCA -- Current Version.

During my initial work with CIRCA, I identified several key items that needed

improvement before CIRCA could be expected to successfully control a system in which

domain knowledge was either incomplete or even slightly incorrect. First, in the original

CIRCA, the AIS might as well have developed and scheduled its plan offline, then just

stopped executing, because there was no informative feedback from RTS to AIS that would

help the AIS react should the executing plan need modification. This problem raised an

interesting research question: Given that the RTS only executes the TAP plan explicitly

created by the AIS, how can we make the AIS include TAPs that will detect world states

not sufficiently handled by the executing plan? As I describe in Section 3.2, we first

developed a classification of all possible world states using the planner’s available

feature/value set, then identified subclasses of these states that are important for the RTS to

detect. As shown in Figure 2-3, we have added the appropriate software to first detect

these “anomalous” or “unhandled” states via planned TAPs, then we feed back this

information to the Planning Subsystem (formerly the AIS), which reacts by replanning

based on this state feedback.

In studying the problems that arise due to incomplete or incorrect models, we also decided

that CIRCA needed an accurate representation of probability in its state models. The

original CIRCA had action, event, and temporal transitions to build its model of the world,

but no notion of the relative likelihood when multiple transitions matched a certain state,

except when an action was guaranteed to preempt one or more temporal transitions. I have

implemented software which preserves the basic forward chaining planner model while

also maintaining approximate state probabilities, as I discuss later in Section 3.2. Using

the current model of probability, event and temporal transitions have been collapsed into a

single transition type -- “temporal transitions”, which can completely mimic event and

temporal transitions from the original CIRCA.

To help CIRCA’s planner with complex planning problems, I have implemented the

capability to handle multiple subgoals. So, instead of relying on just one plan to move the

system from initial state all the way to its goal, CIRCA can now build a group of smaller

plans to reach the goal, so the planner actually works in parallel with the RTS (i.e., the

RTS runs the first subgoal’s plan while the AIS builds the second subgoal’s plan, etc.).

We have plans to automate the subgoaling process, discussed below in Section 2.3.

However, to-date, the user has specified the sequence of subgoals to be achieved, and

plans are simply built from this sequence, using all goal states from the previous plan as

initial states in the new planning cycle.

Finally, Figure 2-3 shows a new software module called the “Abstraction Subsystem

(ABS)”. In the past, CIRCA was exclusively used to control mechanisms with a relatively

simple set of sensors/actuators, so details of the conversion from numerical

sensor/actuator/controller values to the discrete CIRCA feature values remained hidden

from the basic architecture. I believe this abstraction process is one of the key issues

associated with fully-automating a complex piloted-vehicle such as an aircraft, so I have

promoted this abstraction software to a separate CIRCA module, and in the simplest case,

this module will simply pass values to and from the environment with minimal processing.

With the new ABS, we allow separation of domain-specific numerical calculations from the

RTS, so we can specify TAP execution more simply in terms of the discrete feature values

present in the actual TAP instructions. This is particularly important because, as discussed

in future sections, I expect feature abstraction to become more computationally complex in

two ways: 1) Environment and controller “state-space” may not correspond in a simple

uncoupled way to CIRCA feature space, and 2) We may seek two CIRCA representations

of environment state: CIRCA feature values, and parameters to allow the planner’s action

scoring algorithm to perform better based on feature value relationships (see Section 4.4).

Finally, maintaining an ABS module may help the system with predictive sufficiency issues

[22] by giving the ABS scheduled autonomy to sample the environment with sufficient

frequency such that current feature values are always available, thus optimizing the number

of actual sensor reads performed.1 Currently, the ABS still reads feature values from the

environment each time the RTS requests a value, but I hope to implement a more

appropriate model for the automated aircraft flight domain in the future, as I discuss in

more detail in Section 5.
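To make the abstraction step concrete, the following Python sketch maps a numerical sensor value to a discrete feature value and expands a discrete action command into numerical controller settings. The feature name, thresholds, labels, and setpoints are all hypothetical and chosen only for illustration; they are not taken from CIRCA.

```python
from bisect import bisect_right

# Hypothetical altitude bands (feet above ground); values are illustrative.
ALTITUDE_THRESHOLDS = [50.0, 1000.0, 10000.0]
ALTITUDE_LABELS = ["on-ground", "low", "medium", "high"]


def abstract_altitude(altitude_ft: float) -> str:
    """Abstractor: map a continuous sensor value to a discrete feature value."""
    return ALTITUDE_LABELS[bisect_right(ALTITUDE_THRESHOLDS, altitude_ft)]


# Hypothetical controller settings for each discrete action command.
ACTION_TO_SETPOINT = {
    "climb": {"pitch_deg": 8.0, "throttle": 1.0},
    "hold": {"pitch_deg": 0.0, "throttle": 0.6},
}


def deabstract_action(action: str) -> dict:
    """De-abstractor: expand a discrete action into numeric controller settings."""
    return dict(ACTION_TO_SETPOINT[action])
```

With this separation, a TAP test only ever compares discrete values such as "low", while the numerical details stay inside the abstraction layer.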

2.3 Proposed CIRCA

Several other issues will require architectural changes before CIRCA can be considered

ready to safely control a fully-automated aircraft. The final version of CIRCA I plan to

adopt during the remainder of my thesis research is shown in Figure 2-4. Differences

between this version and the current CIRCA (Figure 2-3) include a new “Dispatching

Subsystem” module and additional calculation components within the RTS and Planning

Subsystem. In summary, these additions will allow CIRCA to automatically break an

overall goal into subgoals, and also will allow storage of and quick access to contingency

plans to achieve guaranteed response in select situations even when the original executing

plan does not contain a planned response. In this section, I describe the proposed CIRCA

in the context of a generic CIRCA planning problem, leading from the user-specified

overall goal through the final set of CIRCA interactions with the environment. In this

manner, I hope to convey an accurate vision of how CIRCA will function, then later refer

to elements of this overall procedure when describing more specific research issues.

Figure 2-5 illustrates the technique by which the proposed CIRCA will solve a problem,

with a generic example shown in Figure 2-5a and specific flight domain example shown in

Figure 2-5b. CIRCA solves the problem hierarchically, starting with the overall

“objective” (e.g., “fly-from-x-to-y”), and working down to the eventual product of

individual commands (scheduled TAPs). For simplicity, I expand only one node at every

level; all other nodes in each level would be expanded analogously. In the following

paragraphs, I describe how CIRCA builds the Figure 2-5 structure at each level in terms of

the Figure 2-4 modules and inter-module feedback/feedforward data.

1 The ABS must sample world features and output controller commands at a certain minimum frequency determined by system dynamics. CIRCA may eventually dynamically schedule the ABS functions along with planned TAPs to maximize resource usage efficiency. However, for my research, I will assume the user has allocated sufficient resources for the periodic ABS tasks, in the same fashion as I will assume the controller and state estimator set have been allocated sufficient resources.

[Figure 2-4 (block diagram). Modules shown: Knowledge Base; Planning Subsystem (Planner, Subgoal creation/storage); Scheduling Subsystem (Schedule Manager, Scheduling routines); Dispatching Subsystem (Plan message building, Scheduled plan storage, Plan downloading); Real-Time Subsystem (TAP plan executor, Contingency plans, Environment Interface); Abstraction Subsystem (Abstractor, De-Abstractor); "Environment" (Sensors, Actuators, State Estimators, Controllers). Other labels: Feedback handler; initial state / goals; temporal/action transitions; TAP list w/ timings; TAP schedules; TAP plans; plan handling directives; status-1, status-2, status-3; handshake; feature value data; action commands; controller & actuator commands; sensor & state data.]

Figure 2-4. CIRCA -- Proposed Version.

[Figure 2-5 (hierarchy diagrams).

a) Generic Example: Objective; Task 1, Task 2, ..., Task i; startup plan, cached plan 1, ..., cached plan j; TAP 1, TAP 2, ..., TAP k; Processed Sensor Info 1 ... m and Controller 1 ... n.

b) Flight Example: Fly-from-x-to-y; takeoff/climb, fly-to-fix1, ..., approach/land; plans "goto fix1, all normal", "react: engine out", ..., "react: error x"; TAPs for the engine-out plan:

if ((airport nearby) and (sufficient altitude)) set course for nearby airport -- no engine

if ((approaching ground) and (at airport)) switch to autoland -- no engine

if (T) try engine restart; report emergency to ATC

if ((approaching ground) and (not at airport)) switch to land -- off field, no engine

if (deadend state, etc) start cached plan or replan to handle

Lowest level: processed sensor info (ground proximity warning); processed sensor info (state estimator); flight controller mode parameters / reference trajectory settings (autoland, no engine).]

Figure 2-5. Illustration of CIRCA Problem Solving Technique.

Layer 1: Objective --> Tasks

The purpose of this layer is to decompose a high-level objective (or goal) into a set of tasks

(or subgoals) to consider during CIRCA’s development of TAP plans. Currently, this

procedure must be done manually, with the user explicitly specifying each subgoal to be

achieved in the CIRCA knowledge base prior to execution. I seek to automate this process

for a variety of reasons, ranging from easing the burden on the user to adding flexibility so

that the Planning Subsystem can dynamically modify its set of subgoals if necessary based

on RTS feedback from the environment (e.g., unhandled states).

This goal decomposition layer corresponds to the new “Subgoal creation/storage”

submodule within the Planning Subsystem (Figure 2-4). Ideally, this submodule could

operate both offline and online, and compute subgoals based on the desirable planning

problem size, where the “desirable” problem size is sufficiently small to allow successful

scheduling of the associated plan. However, planning problem size will be limited by

problem decomposability (i.e., the necessity to have plans that guarantee safety for

extended periods of time).

Perfecting subgoaling algorithms is not the emphasis of my research, so I will implement a

rather simple procedure (similar to ABSTRIPS [28]) that will most likely require future

enhancements not proposed here. I plan to use the existing CIRCA planner in a special

mode to plan from the abstracted initial state to end goal, where “abstracted” means using

only a subset of the available domain features specifically marked as “critical for

subgoaling”, then selecting actions to lead to the final goal without regard for exact timings

or transition probabilities. For example, as shown in Figure 2-5b, suppose the overall

objective is to fly an aircraft from x to y. The subgoaling module may set up a planner

iteration that performs a high-level connect-the-dots for locations along the route (a

primitive form of guidance), but ignores features that do not directly affect the computation

of subgoal (waypoint) calculations. Note that this subgoaling procedure may be

appropriate only for piloted vehicle domains, so may need to be extended in future research

should CIRCA be used in other systems.
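The "connect-the-dots" idea above can be illustrated with a small Python sketch: states are projected onto the features marked critical for subgoaling, and a plain breadth-first search over the abstracted locations yields the waypoint subgoals. The route graph, feature names, and function names are hypothetical, and breadth-first search is used here only as a simple substitute for running the CIRCA planner in its special mode.

```python
from collections import deque

# Hypothetical route graph over the single "critical" feature: location.
ROUTE = {"x": ["fix1"], "fix1": ["fix2"], "fix2": ["y"], "y": []}


def project(state: dict, critical: set) -> tuple:
    """Keep only the features marked critical for subgoaling."""
    return tuple(sorted((k, v) for k, v in state.items() if k in critical))


def waypoint_subgoals(start: str, goal: str, route: dict) -> list:
    """Breadth-first search over abstracted locations. Timings and transition
    probabilities are ignored. Returns the waypoints after `start`, or an
    empty list when the goal is unreachable or already satisfied."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        if here == goal:
            path = []
            while here is not None:
                path.append(here)
                here = parents[here]
            return path[::-1][1:]
        for nxt in route.get(here, []):
            if nxt not in parents:
                parents[nxt] = here
                queue.append(nxt)
    return []
```

Each returned waypoint would then serve as one task (subgoal) for the detailed planning in the next layer.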

Layer 2: Task --> Scheduled Plan Set

The purpose of this layer is to build and schedule plans to achieve each task (or subgoal)

computed in layer 1. Currently, CIRCA’s planner builds and schedules a single plan for

each subgoal, then sequentially downloads them to the RTS as it is ready to execute them.

If RTS feedback indicates an “unhandled” state, one new plan to handle this state is

developed and immediately downloaded, then the planner continues along the subgoal path

until the final goal is reached.

As shown in Figure 2-4, I have proposed the addition of a new “Dispatcher Subsystem”

and also a “Contingency plan” storage area within the RTS. Note that Dave Musliner (at

Honeywell) has extended the RTS software to store a group of contingency plans, but to

date has only written these plans by hand, not interfacing with the CIRCA planning system

at all. By having both a Dispatcher Subsystem and small contingency plan storage area on

the RTS, CIRCA can effectively store and retrieve immediate contingencies associated with

the currently executing plan and also plans to achieve future subgoals. The RTS storage

area will be used exclusively for those contingency plans required for reacting to time-

critical situations, thus real-time guarantees will be associated with switching to these

plans. The Dispatcher storage area will contain all other plans, including contingencies for

the current subgoal that do not require a small guaranteed plan switch time2 as well as all

plans associated with future subgoals. By minimizing the RTS plan storage area, we can

impose tighter plan downloading and plan switching time constraints, but will still be able

to keep a large plan cache via the Dispatcher subsystem.

Plans will be built and stored using the following algorithm:

For each task (i) or subgoal,

- Build a “startup plan” using the specified initial state (or last plan’s goal state set

as the initial state). This plan is equivalent to the single executed CIRCA plan now

created for each subgoal.

- Build a contingency plan to handle unhandled (anomalous) states the planner

decides are either too important or too close to failure to allow time for CIRCA

replanning. I discuss issues involved with developing contingency plans in Section

3.3.
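A rough Python sketch of this plan-building and storage policy follows. The class names, the `contingencies_for` callback, and the boolean `time_critical` flag are assumptions made for illustration; how the planner actually decides which states deserve contingency plans is a research question discussed in Section 3.3.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    name: str
    time_critical: bool = False  # needs a short, guaranteed switch time


@dataclass
class PlanStores:
    rts_cache: list = field(default_factory=list)   # small, guaranteed switch
    dispatcher: list = field(default_factory=list)  # large, no hard guarantee


def build_plan_set(tasks, contingencies_for, current_task):
    """For each task, build a startup plan plus its contingency plans, then
    place each plan in RTS storage or in the Dispatcher.

    `contingencies_for` is a hypothetical callback that returns
    (name, time_critical) pairs for a task."""
    stores = PlanStores()
    for task in tasks:
        plans = [Plan(f"{task}:startup")]
        plans += [Plan(f"{task}:{n}", tc) for n, tc in contingencies_for(task)]
        for plan in plans:
            # Only time-critical contingencies of the current task go on the RTS.
            on_rts = task == current_task and plan.time_critical
            (stores.rts_cache if on_rts else stores.dispatcher).append(plan)
    return stores
```

Keeping the RTS cache this small is what allows tight plan-switching bounds, while everything else waits in the larger Dispatcher store.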

2 Ideally, the Dispatcher plans would also have a larger guaranteed switch time. However, this would be difficult because the executing RTS plan would need to contain guaranteed actions to download this next plan. The time required to download this plan is a function of plan size, communication link availability, etc. Thus, for my thesis research, I will simply state that switching to a Dispatcher plan is just “much faster” than building and then downloading a new plan online.

Upon startup, CIRCA will build a complete set of plans for all identified tasks offline (i.e.,

before RTS execution of the first plan begins), and will store these plans in the “Dispatcher

Subsystem”, which will also receive sequencing information from the planner to download

plans to the RTS as appropriate. At this point, the RTS begins executing the first TAP plan

(layer 3). If everything goes as expected (including the set of contingencies for which

CIRCA has built plans), the planner is done, and layers 3 and 4 can progress without

further planner intervention. However, if a situation arises which either has no

contingency or otherwise requires further planning (e.g., if a contingency plan cannot

provide goal achievement, only safety), the planner will receive RTS feedback to this

effect. Then, depending on available time before system safety is jeopardized, layer 1 will

develop a new set of tasks if necessary, then execute the basic planning algorithm for each

new task and required contingency. The Dispatcher will immediately download the first

scheduled plan to allow prompt response to the unhandled state, and subsequent plans will be

scheduled and stored in the Dispatcher as they become available. More details on this

procedure and associated timeliness issues are discussed in Section 3.3.

Layer 3: Scheduled Plan --> TAPs

As discussed above, layer 2 has been responsible for creating all plans, and the dispatcher

has ensured that the correct plan will be sent to the RTS for execution. Layer 3 of

execution is performed on the RTS, which takes a set of downloaded TAPs and executes

them as specified in the schedule, with all guaranteed TAPs executing cyclically before the

deadlines and all “if-time” or best-effort TAPs executing whenever time is available.

Currently, each schedule contains a special TAP that checks whether the subgoal has been

satisfied and, if so, switches to a new "startup" plan (as depicted in Figure 2-5). Even if

an "unhandled" state is detected, the RTS

continues executing the current plan and switches whenever the new plan becomes

available. The proposed version of CIRCA will continue this basic procedure as before,

except that each RTS plan must now contain special TAPs responsible for managing the

receipt and storage of new contingency plans, as well as TAPs for identifying and

switching to the appropriate contingency plan when that particular situation arises.
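The execution policy in this layer can be sketched in Python as a single simplified schedule cycle. This is a loose illustration under stated assumptions: real CIRCA schedules give each guaranteed TAP its own period, whereas this sketch runs every guaranteed TAP once per cycle and uses a hypothetical per-TAP `cost` and cycle `budget` to decide whether "if-time" TAPs fit.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TAP:
    name: str
    test: Callable[[Dict], bool]    # reads discrete feature values
    action: Callable[[Dict], None]  # issues an action command
    cost: float                     # assumed worst-case execution time
    guaranteed: bool = True


def run_cycle(taps: List[TAP], features: Dict, budget: float) -> List[str]:
    """Run one cycle: every guaranteed TAP executes, then if-time TAPs execute
    only while the remaining budget covers their cost. Returns the names of
    the TAPs whose test passed and whose action was issued."""
    fired = []
    remaining = budget
    # Stable sort: guaranteed TAPs first, original order otherwise preserved.
    for tap in sorted(taps, key=lambda t: not t.guaranteed):
        if not tap.guaranteed and tap.cost > remaining:
            continue
        remaining -= tap.cost
        if tap.test(features):
            tap.action(features)
            fired.append(tap.name)
    return fired
```

A plan-switching TAP fits this shape directly: its test checks a "subgoal satisfied" feature and its action selects the next plan.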

Layer 4: TAP --> Environment I/O

This layer describes the procedures used to execute the instructions within each RTS TAP,

specifically those TAPs that deal with the environment. The “test” part of each TAP

typically requires measurement of a set of environment states, such as “aircraft position” in

an abstract form (as described in Section 5), while the “action” part of each TAP typically

corresponds to acting upon some environmental parameter, ranging from directly

controlling an actuator to setting a parameter of some controller. To execute these TAPs,

the RTS must send its feature value request or command to the ABS (Figure 2-4), which

then either uplinks the appropriate abstracted feature value to the RTS or de-abstracts and

sends the action command. I assume the ABS will always have current feature values via

careful (offline) ABS scheduling, as I discuss in the context of the piloted vehicle domain

in Section 5.

=====================================================

CHAPTER 3

MODELING AND REASONING ABOUT TIME

=====================================================

3.1 Overview

In this section, I propose algorithms that will yield better temporal modeling and reasoning

in CIRCA. This section is ordered by expected time for completing the proposed research,

where Section 3.2 describes temporal modeling research completed to-date, Section 3.3

includes research I propose for dissertation work, and Section 3.4 describes future work

that is beyond the scope of my thesis, but that needs to be considered before a “complete”

temporal reasoning architecture can exist.

In this section, I focus on three important aspects of time: 1) Meta-level reasoning about

planner deliberation time, 2) Construction of a world model that accurately represents

domain feature changes over time, and 3) Explicit scheduling of plans to allow execution

timing guarantees for critical planned actions. For temporal aspect 1), I have no intentions

of inventing a revolutionary algorithm to reason about deliberation time, particularly since

many others are concentrating their research efforts in this area [3], [7], [11], [12], [15],

[33], and [35]. Instead, I plan to use a combination of design-to-time [9] and anytime [7]

strategies, modifying the planner such that it can dynamically alter planner parameters to

control expanded state-space size and halt search if time expires. Limiting online planner

deliberation time in conjunction with extensive offline creation of a set of cached reactive

plans will allow CIRCA to combine the best of exclusively reactive and exclusively

planner-based systems.
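The "halt search if time expires" behavior can be illustrated with a generic anytime loop in Python. This is a minimal sketch, not the proposed planner modification: the candidate stream, the scoring function, and the injectable `clock` parameter are hypothetical names introduced for the example.

```python
import time
from typing import Callable, Iterable, Optional, TypeVar

S = TypeVar("S")


def anytime_best(
    candidates: Iterable[S],
    score: Callable[[S], float],
    time_limit_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[S]:
    """Anytime loop: keep the best candidate found so far and stop as soon as
    the deliberation deadline passes, returning whatever is in hand."""
    deadline = clock() + time_limit_s
    best, best_score = None, float("-inf")
    for cand in candidates:
        if clock() >= deadline:
            break
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best
```

Passing the clock as a parameter keeps the loop testable; a design-to-time variant would additionally shrink the candidate set up front based on the time available.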

I have begun work toward limiting planning time by incorporating an approximate but

relatively fast computational model of probability within the planner. The existing CIRCA

probability model is discussed in Section 3.2. Unfortunately, there are key approximations

in STRIPS-based planners [8] that carry through to our existing probability model (which

has been placed within a STRIPS-like planner). In Section 3.3, I describe an approach by

which one can combine CIRCA’s current probability model with the other extreme: a

Markov Decision Process (MDP) based model [6] in which all states have an explicit time

stamp. The MDP model contains a more accurate model of state changes over time, but

such a model significantly increases planning complexity over our current model.

Initial work to better address temporal aspect 2) has also begun. In domains where

knowledge may be incomplete or incorrect, it is important for an automated system to react

even when some “unplanned-for” situation occurs. I describe our work to classify, detect,

and react to unplanned-for states in Section 3.2. Also, when aspects of domain knowledge

are statistically based (e.g., exogenous event occurrence), it is important to have a model

that can accurately represent experts’ probabilistic beliefs, so we allow the specification of

temporal transition probabilities in the CIRCA knowledge base, as described in Section

3.2. Unfortunately, there are still challenges associated with domain feature discretization

as well as the planner’s model of time associated with world state changes. I save

extensive discussion of accurately modeling the set of specific domain features for Sections

4 and 5, but in Section 3.3, I describe how CIRCA can generally improve model accuracy

by enhanced temporal modeling while planning (based on MDP models) combined with

special-purpose domain-specific functions to compute accurate feature values as a function

of time.

CIRCA currently best addresses temporal aspect 3): scheduling to allow real-time

execution of critical actions. In the original version of CIRCA, if the planner could not

successfully find some combination of actions that could be scheduled, it would simply

fail. In Section 3.2, I describe work to relax this hard restriction via the removal of actions

associated with low-probability states. The contingency plan storage capabilities will

facilitate scheduling, since we now will be able to have multiple plans with plan switch

guarantees instead of a single plan only. However, I will be looking at contingency plans

from the completeness perspective, leaving work on scheduling issues for others.

Reference [19] describes ongoing modifications to the scheduler-planner interface that will

further improve CIRCA’s ability to build schedulable plans.
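The idea of relaxing the scheduling restriction by removing actions associated with low-probability states can be sketched as follows. The tuple layout and the utilization bound are assumptions for illustration only; the bound stands in for the real scheduler, and the actual removal policy is the one described in Section 3.2.

```python
def prune_until_schedulable(taps, capacity=1.0):
    """Drop the TAPs that guard the least probable states until the total
    utilization fits within `capacity`.

    Each TAP is a hypothetical (name, utilization, state_probability) tuple.
    Returns (kept, dropped)."""
    kept = sorted(taps, key=lambda t: t[2], reverse=True)
    dropped = []
    while kept and sum(u for _, u, _ in kept) > capacity:
        dropped.append(kept.pop())  # the least probable entry is last
    return kept, dropped
```

The dropped entries are exactly the states for which detection and a cached or replanned response become important later.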

3.2 Research Completed

My initial research efforts have involved improving CIRCA’s ability to schedule plans and

select viable goal paths, even with imprecise knowledge. First, we implemented a model

of probability in which individual state probabilities are computed using temporal transition

probability functions and expected action execution delays. This model and its uses to-date

are described below (Section 3.2.1) and in [2].

To minimize the planner’s set of expanded states as well as improve plan schedulability,

CIRCA selects only those states it considers reachable, and is satisfied so long as at least

one path reaches a goal state. In Section 3.2.2, I describe a generic classification of all

world states, and describe algorithms we have implemented in CIRCA that allow it to detect

and react when important “unplanned-for” states are reached.

3.2.1 Incorporation of Probabilistic Model for Planner States

To control a complex system, an agent must build and execute a plan that is capable of

recognizing state changes due to its own actions or external world events, even when these

changes are not completely predictable. Consider an agent capable of safe, fully-automated

aircraft flight control from takeoff through landing. To execute a successful flight, the

agent must have a set of goals, such as destination airport and intermediate positions, and

an accurate model of actions and possible world events. Each event has some chance of

occurring as a function of time and state feature values. A valiant goal for any agent is to

build and execute plans that yield a high probability of successfully reaching the specified

goals. My objective in this research was to calculate approximate state probabilities and use

them to guide CIRCA’s planner along highly-probable goal paths while ignoring low-

probability occurrences when necessary.

The CIRCA planner builds the reachable state set based on action and temporal transitions

specified in the domain knowledge base. In CIRCA’s current model, state probability is

computed locally based on the probability of its parent state and applicable state transition

probability functions3. Probabilities are propagated from initial states throughout the

expanded world state set in this fashion.

I use a simple model for action transition probabilities, by assuming action transitions will

affect state features following a constant delay after being executed on the RTS. Figure 3-1

shows the cumulative probability function used for actions as a function of time.4 To

construct this function, the user specifies a delay (tdelay) between the time the action is

initiated (time=0) and the time at which the action changes state features (time=tdelay).

Then, the total delay between reaching the state prompting action execution and the time

3 A temporal transition is “applicable” from a state if all temporal transition preconditions are matched.

4 We assume all state features are observable and that if an action is initiated, it will be executed properly. Otherwise, we could not specify action probabilities that reach 1.0.

that action affects state features is: t (total delay) = tdelay + (delay between when the

state is reached and when this action begins executing in the schedule).

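[Figure 3-1 (plot): cumulative probability p(t) versus time, with axis marks at 0 and tdelay on the time axis and 1.0 on the probability axis.]

Figure 3-1. Action Transition Cumulative Probability.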
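Under the constant-delay assumption stated above, the action transition model reduces to a step function. The short Python sketch below expresses that step and the total-delay sum; the function and parameter names are chosen for the example.

```python
def action_cum_prob(t: float, t_delay: float) -> float:
    """Cumulative probability that an action has taken effect t time units
    after it was initiated: 0 before t_delay, 1.0 from t_delay onward."""
    return 1.0 if t >= t_delay else 0.0


def total_delay(t_delay: float, wait_in_schedule: float) -> float:
    """Delay from reaching the state to the action changing state features:
    the action's own delay plus the wait until it starts in the schedule."""
    return t_delay + wait_in_schedule
```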

Because they are not explicitly commanded, CIRCA cannot assume such tight control over

temporal transitions, some of which may not be precisely modeled. For this reason,

CIRCA allows the user to define a cumulative probability function for each temporal

transition, where time=0 is defined as the time at which that transition's preconditions were

first satisfied. Figure 3-2 shows two examples of temporal transition probability functions

and their associated cumulative probability functions. In Figure 3-2a, the transition has a

high probability of occurring immediately. This probability decays over time, leaving a

cumulative probability asymptote of Pmax < 1.0. The value (1 - Pmax) corresponds to

the probability that this event will never occur. As an example, consider the state in which

an aircraft collision avoidance alarm (indicating nearby traffic) has just sounded. The

probability p(t) that the transition “collide with other aircraft” will occur immediately jumps

to its maximum value, but decays in time, since either the other aircraft will pass or else the

collision will have already happened.

Figure 3-2b shows an example for which a delay occurs between the time the state is

reached and the time this transition may happen (i.e., p(t) > 0). The asymptote of the

cumulative probability function is 1.0, indicating this transition will occur if given

sufficient time. A simple example of this transition type is travel between distinct locations.

Suppose an aircraft flies along the coast from Los Angeles to Portland, Oregon. At a point

along the flight, the aircraft state changes to “Location: San Francisco”, at which time the

aircraft heads directly for Portland. The probability that the temporal transition “Arrive in

Portland” will occur is near zero for a certain amount of time, even with a tremendous

tailwind and maximum thrust. The peak of p(t) occurs at the expected arrival time based

on average calculations, while the width of p(t) increases as the uncertainty in wind,

aircraft performance, and/or course deviations increases.

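[Figure 3-2 (plots). Panel a) "Event" Transition: p(t) versus time and cum_prob(t) versus time, with asymptote Pmax below 1.0. Panel b) Delayed Transition: p(t) versus time and cum_prob(t) versus time, with tdelay marked on the time axis and asymptote 1.0.]

Figure 3-2. Temporal Transition Probability Functions.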
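The two shapes in Figure 3-2 can be mimicked with simple closed-form curves. The functional forms below (a saturating exponential and a normal CDF) are assumptions chosen only for illustration; CIRCA lets the user define each cumulative function, and nothing here is meant to suggest which forms were actually used.

```python
import math


def event_cum_prob(t: float, p_max: float, rate: float) -> float:
    """'Event' style curve: rises immediately and levels off at p_max < 1,
    so (1 - p_max) is the probability that the event never occurs.
    The exponential form is an illustrative assumption."""
    return p_max * (1.0 - math.exp(-rate * t)) if t > 0 else 0.0


def delayed_cum_prob(t: float, mean: float, spread: float) -> float:
    """'Delayed' style curve: close to zero at first, approaching 1.0 after
    the expected time `mean`. A normal CDF is assumed; it is only
    approximately zero at t = 0. `spread` models the uncertainty."""
    return 0.5 * (1.0 + math.erf((t - mean) / (spread * math.sqrt(2.0))))
```

In the Portland example, `mean` would correspond to the expected arrival time and `spread` would grow with uncertainty in wind, aircraft performance, or course deviations.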

Because CIRCA allows multiple temporal transitions from any state, probabilistic

dependencies between these transitions must be considered. These dependencies exist

because the occurrence of one temporal transition changes the current state, thus no other

temporal transition may later occur from that state. In previous CIRCA versions, the

number of temporal transitions modeled in the knowledge base was minimized by making

preconditions minimal so that temporal transitions could occur in different combinations

from many states. However, CIRCA now must accurately capture transition probability

dependencies, so the user must make preconditions more specific, increasing the number of

temporal transitions in the CIRCA knowledge base. The following procedure defines how

the user specifies temporal transitions and their probabilities:

(1) Define temporal transition sets, where a “set” contains all temporal transitions with a

certain set of preconditions. Each transition set's preconditions must be sufficiently

specific such that no state can match the preconditions of two different transition sets.

(2) For each transition set, specify the probability function for each transition. The sum of

all cumulative probability function asymptotes in each set must be ≤ 1.0 (100%). We

assume the user has sufficiently restricted preconditions to explicitly define any features

on which transition probabilities depend.
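The two rules in this procedure lend themselves to an automatic check. The Python sketch below flags transition sets whose preconditions could match the same state and sets whose asymptotes sum to more than 1.0. The dictionary layout is hypothetical, and the overlap test assumes that features are independent, so two precondition sets are mutually exclusive only when they require different values of a shared feature.

```python
def preconditions_overlap(a: dict, b: dict) -> bool:
    """Two precondition sets can match the same state unless some shared
    feature is required to take different values."""
    return all(a[f] == b[f] for f in a.keys() & b.keys())


def validate_transition_sets(sets: list) -> list:
    """Each set is {'pre': {feature: value}, 'asymptotes': [floats]}.
    Returns a list of human-readable problems (empty when valid)."""
    problems = []
    for i, s in enumerate(sets):
        if sum(s["asymptotes"]) > 1.0 + 1e-9:
            problems.append(f"set {i}: asymptotes sum above 1.0")
        for j in range(i + 1, len(sets)):
            if preconditions_overlap(s["pre"], sets[j]["pre"]):
                problems.append(f"sets {i} and {j}: preconditions overlap")
    return problems
```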

State probabilities are computed recursively during state expansion, with a "parent" state

and applicable outgoing transitions used to determine "offspring" probabilities. The

planner begins with an initial state set and no knowledge of relative probabilities within this

set, so we assume a non-informative prior distribution. The planner expands each initial

state, using the available transitions to develop the set of offspring states and initialize their

probabilities. Offspring states eventually become parent states to be expanded, continuing

until all states that are reachable with (probability > ε) have been expanded.
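The expansion loop described in this paragraph can be sketched in Python as a best-first search with a probability cutoff. Several simplifications are assumed: each transition carries a single fixed probability instead of a time-dependent function, an offspring's probability is the parent's probability times the transition probability, and a state's probability is set only the first time it is reached. The `transitions` mapping and all names are hypothetical; the actual local computation is the one detailed in [2].

```python
import heapq
from itertools import count


def expand_reachable(initial_states, transitions, epsilon):
    """Expand states in order of decreasing probability, starting from a
    uniform (non-informative) prior over the initial states, and do not
    follow paths whose probability is at or below epsilon.

    `transitions[state]` is a list of (offspring, probability) pairs.
    Returns (expansion order, state probabilities)."""
    prob = {s: 1.0 / len(initial_states) for s in initial_states}
    tie = count()  # tie-breaker so the heap never compares states directly
    heap = [(-p, next(tie), s) for s, p in prob.items()]
    heapq.heapify(heap)
    expanded = []
    while heap:
        neg_p, _, state = heapq.heappop(heap)
        expanded.append(state)
        for child, p_trans in transitions.get(state, []):
            p_child = -neg_p * p_trans
            if child not in prob and p_child > epsilon:
                prob[child] = p_child
                heapq.heappush(heap, (-p_child, next(tie), child))
    return expanded, prob
```

States cut off by the epsilon test never enter the plan, which is why detecting them at run time matters.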

A set of zero or more action and temporal transitions match the preconditions of any parent

state. The states resulting from all matching temporal transitions and any planner-selected

action are the offspring set. Figure 3-3 illustrates the three possible situations. In Figure

3-3a, a TTF exists, so the planner has chosen a preemptive (guaranteed) action. Offspring

states, P1-Pn, result from that action and all applicable temporal transitions. State Pn is a

failure state that must have probability less than a small value ε. Figure 3-3b illustrates the

case when a non-preemptive action is selected. Offspring states result from that action and

the (n-1) temporal transitions. Finally, Figure 3-3c illustrates the case when no action is

selected, a possibility if no TTF exists and the planner selects no action along a goal path.

In this case, all n offspring states result from temporal transitions.

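[Figure 3-3 (state transition diagrams). In each panel, a parent state with probability Pinit leads to offspring states P1 ... Pn.

a) Action Preempts TTF: P1 is reached by an action (ac), intermediate offspring by temporal transitions (tt), and Pn by a ttf. Constraints: $P_n < \varepsilon$ and $\sum_{i=1}^{n-1} P_i = P_{init}$.

b) Non-preemptive Action Planned: P1 is reached by an action (ac) and P2 ... Pn by temporal transitions (tt). Constraint: $\sum_{i=1}^{n} P_i \le P_{init}$.

c) No Action Planned: P1 ... Pn are all reached by temporal transitions (tt). Constraint: $\sum_{i=1}^{n} P_i \le P_{init}$.]

Figure 3-3. Possible Transitions from Parent to Offspring.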

The algorithm in Table 3-1 is used to locally compute probabilities for each reachable state.

A more detailed description of the algorithm is provided in [2]. The two major

approximations in the local probability computation algorithm -- estimating constant state

probability values at “critical” times, and propagating offspring probability only when the

offspring has not yet been expanded -- allow state probabilities to be computed quickly, but

not as accurately as other models of probability (such as MDP-based approaches).

Proposed improvements for both approximations are discussed in Section 3.3.

CIRCA uses state probabilities in two ways: finding highly-probable goal paths and

removing improbable states. In the previous version of CIRCA, the planner expanded

states in depth-first order. The planner selected actions primarily to avoid TTFs and

secondarily to achieve goals. In the worst cases, the only goal-reaching states would have

probability near 0, and in some situations, no schedule may be possible that avoids all

failure transitions, since the original CIRCA had to include even those that were highly


improbable. Although the current CIRCA model is not perfect, it still has a better chance of

reaching its goals because probability considerations prevent this worst case. Quantifying

how much better the new CIRCA performs involves evaluating the probabilistic model for

a given domain, as well as estimating the presence of cycles, etc., that degrade calculated

state probability accuracy.

Table 3-1. Original State Probability Calculation Algorithm.

1. Select the most probable state for expansion and let Pinit be this state's probability. (O(m))
2. Select an action by scoring all potential action candidates. (O(nf*na))
3. Create a list of offspring states for temporal and action transitions. (O(nt))
4. Compute critical time (t) for transition probabilities. Critical time is defined as follows for the possible transition sets shown in Figure 3-3:
   Case a): t = preemptive action execution deadline;
   Case b): t = non-preemptive action average delay time;
   Case c): t = temporal transition asymptotic probability. (O(1))
5. Create a list of cumulative probabilities for the offspring states. (O(nt))
6. Scale each probability by Pinit (Pi = Pi * Pinit). (O(nt))
7. For each unexpanded offspring state, add any previously existing probability due to other parent states to the new value (Pi = Pi + Piold). (O(nt))

Overall complexity: O(m + nf*na + nt), where m = number of unexpanded, reachable states that could become parent states, nf = number of features, na = number of action transitions, and nt = number of temporal transitions.
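The local probability propagation in Table 3-1 can be sketched in a few lines of Python. This is only an illustrative sketch, not CIRCA's code: the State class, the function name, and the three callables (offspring_of, cumulative_prob, critical_time) are hypothetical stand-ins for the knowledge base, and action scoring (step 2) is omitted.

```python
from dataclasses import dataclass


@dataclass
class State:
    name: str
    prob: float = 0.0
    expanded: bool = False


def expand_most_probable(states, offspring_of, cumulative_prob, critical_time, eps=1e-3):
    """Run steps 1 and 3-7 of Table 3-1 once; return the expanded parent or None.

    states:          dict mapping state name -> State (hypothetical container)
    offspring_of:    callable(parent) -> iterable of offspring state names
    cumulative_prob: callable(parent, child_name, t) -> cumulative probability at time t
    critical_time:   callable(parent) -> critical time t for the applicable case
    """
    candidates = [s for s in states.values() if not s.expanded and s.prob > eps]
    if not candidates:
        return None
    parent = max(candidates, key=lambda s: s.prob)      # step 1: most probable state
    parent.expanded = True
    t = critical_time(parent)                           # step 4: critical time
    for child_name in offspring_of(parent):             # step 3: offspring list
        p = cumulative_prob(parent, child_name, t)      # step 5: cumulative probability
        p *= parent.prob                                # step 6: scale by Pinit
        child = states.setdefault(child_name, State(child_name))
        if not child.expanded:                          # step 7: add to unexpanded offspring only
            child.prob += p
    return parent
```

Calling the function repeatedly until it returns None expands every state whose probability exceeds eps. Because already-expanded offspring are skipped in step 7, probability flowing around a cycle is dropped, which is one of the two approximations noted in the text.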

3.2.2 Detection and Handling of Unplanned-for States

Autonomous control systems for real-world applications require extensive domain

knowledge and efficient information processing to build and execute situationally-relevant

plans. To enable guarantees about safe system operation, domain knowledge must be

complete and correct, plans must contain actions accounting for all possible world states,

and response times to critical states must have real-time guarantees. Practically speaking,

these conditions cannot be met in complex domains, where it is infeasible to preplan for all

configurations of the world, if indeed they could even be enumerated. Realistic planners

use heuristics to bound the expanded world state set, coupled with reactive mechanisms to

compensate when unexpected situations occur. For this research, I focused on the question

of how an autonomous system can know when it is no longer prepared for the world in

which it finds itself, and how it can respond. In this section, I first identify the different

classes of “unhandled” states the planner may identify, then describe methods by which

CIRCA can detect and respond to these states appropriately.


Figure 3-4 shows the relationship between subclasses of possible world states. Modeled

states have distinguishing features and values represented in the planner’s knowledge base.

Because the planner cannot consider unmodeled states without a feature discovery

algorithm, unmodeled states are beyond the scope of this paper. “Planned-for" states

include those the planner has expanded. This set is divided into two parts: "handled" states

which avoid failure and can reach the goal, and "deadend" states which avoid failure but

cannot reach the goal with the current plan.

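[Figure 3-4 diagram: nested regions labeled All World States; Modeled; Planned-for (containing "Handled" -- can reach goal, and Deadend); Removed; Imminent Failure; and a boldly outlined region, World States Actually Reached.]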

Figure 3-4. World State Classification Diagram.

A variety of other states are modelable by the planner. Such states include those identified

as reachable, but “removed” because attending to them along with the “planned-for” states

exceeds system capabilities. Other modeled states include those that indicate “imminent

failure;” if the system enters these states, it is likely to fail shortly thereafter. Note that

some states might be both “removed” and “imminent-failure”, as illustrated in Figure 3-4.

Finally, some modeled states might not fall into any of these categories, such as the states

the planner considered unreachable but that are not necessarily dangerous. We are working

to find other important classes or else show no other modelable state classes are critical to

detect. As illustrated by the boldly outlined region in Figure 3-4, states actually reached

may include any subclass. To assure safety, the set should only have elements in the

“planned-for” region. When the set has elements outside this region, safety and

performance depend on classifying the new state and responding appropriately. For this

reason, we provide more detailed definitions of the most important classes.

A "deadend" state results when a transition path leads from an initial state to a state that

cannot reach the goal, as shown in Figure 3-5. The deadend state is safe because there is

no transition to failure. However, the planner has not selected an action that leads from this


state via any path to the goal. Deadend states produced because no action can lead to a goal

are called "by-necessity", while those produced because the planner simply did not choose

an action leading to the goal are called "by-choice”.

[Figure 3-5 diagram: states labeled Initial State, Deadend State, and Goal State; transitions labeled "temporal" and "temporal or action".]

Figure 3-5. "Deadend state" illustration.

A planner that generates real-time control plans needs to backtrack whenever scheduling

fails. If backtracking is unsuccessful, another option is to prune the most improbable states

from consideration and then replan. A "removed" state set is created when the planner has

purposefully removed the set of lowest probability states during backtracking, as illustrated

in Figure 3-6. In the first planner iteration, all states with nonzero probability are

considered, as depicted by the "Before Pruning" illustration. A low probability transition

leads to a state which transitions to failure. This failure transition is preempted by a

guaranteed action.

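[Figure 3-6 diagram: two panels, "Before Pruning" and "After Pruning". Before Pruning: Initial State ... Goal State via temporal or action transitions (ε < prob < 1); a low probability temporal transition (ε < prob << 1) leads to the Removed State, whose temporal transition to the Failure State is preempted by a preemptive action. After Pruning: Initial State ... Goal State via temporal or action transitions (ε < prob < 1), with the removed downstream states no longer present.]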

Figure 3-6. "Removed state" illustration.

Suppose the scheduler fails. The planner will backtrack and build a new plan without low-

probability states. The resulting state diagram -- "After Pruning" -- is shown in Figure 3-6.

Due to the low probability transition, all downstream states are removed from

consideration. The preemptive action is no longer required, giving the scheduler a better

chance of success.


During plan development, all temporal transitions to failure (TTF) from reachable states are

preempted by guaranteed actions. If preemption is not possible, the planner fails.

However, the planner does not worry about TTF from any states it considers unreachable

from the initial state set. The set of all modelable states considered unreachable that also

lead via one modeled temporal transition to failure are labeled "imminent-failure" (see footnote 5). Actually

reaching one of the recognizable imminent-failure states indicates either that the planner’s

knowledge base is incomplete or incorrect (i.e., it failed to model a possible sequence of

states), or that the planner chose to ignore this state in order to make other guarantees.

Figure 3-7 shows a diagram of a reachable state set along with an isolated state (labeled

“Imminent-failure”) leading via one temporal transition to failure. This state has no

incoming transitions from a reachable state, so the planner will not consider it during state

expansion. However, if this state is reached, the system may soon fail. The imminent-

failure unhandled states are important to detect because avoiding system failure is

considered CIRCA’s primary goal.

[Figure 3-7 diagram: Initial State ... Goal State via temporal or action transitions (ε < prob < 1); an isolated Imminent Failure State leads via a temporal transition to the Failure State.]

Figure 3-7. "Imminent-failure state" illustration.
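The state classes defined in this section can be summarized in a short Python sketch. It is illustrative only and is not the list-building procedure from [1]: the function names and the dictionary encoding of transitions are hypothetical, and the removed set is assumed to be supplied by the planner's pruning step.

```python
def reachable_from(starts, edges):
    """Forward closure over edges, a dict mapping state -> iterable of successor states."""
    seen, stack = set(starts), list(starts)
    while stack:
        state = stack.pop()
        for nxt in edges.get(state, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def classify(modeled, initial, plan_edges, temporal_edges, goal, failure, removed):
    """Return a dict mapping each modeled state to a set of class labels.

    plan_edges:     transitions kept in the current plan (selected actions and
                    temporal transitions that are not preempted)
    temporal_edges: all modeled temporal transitions, including those to failure
    removed:        states pruned by the planner during backtracking (assumed given)
    """
    planned_for = reachable_from(initial, plan_edges) - {failure}
    labels = {}
    for state in modeled:
        tags = set()
        if state in planned_for:
            can_reach_goal = goal in reachable_from([state], plan_edges)
            tags.add("handled" if can_reach_goal else "deadend")
        else:
            if state in removed:
                tags.add("removed")
            # One modeled temporal transition away from failure.
            if failure in temporal_edges.get(state, ()):
                tags.add("imminent-failure")
        labels[state] = tags
    return labels
```

A state outside the planned-for region may carry both the "removed" and "imminent-failure" labels, matching the overlap shown in Figure 3-4, and a modeled state with an empty label set corresponds to the unhandled states that CIRCA does not detect.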

A critical premise in our work is that a planner cannot be expected to somehow just “know”

when the system has deviated from plans---it must explicitly plan actions and allocate

resources to detect such deviations. For example, to make real-time guarantees, CIRCA's

planner must specify all TAPs to be executed, including any to detect and react to

unhandled states. In our implementation, after the planner builds its normal plan, it builds

TAPs to detect deadend, removed, and imminent-failure states. Other unhandled states,

such as those “modeled” but outside “planned-for”, “removed”, and “imminent-failure”

regions in Figure 3-4, are not detected by CIRCA. On reaching one of these unhandled

states that is not detected by CIRCA, the system may eventually transition back to a

planned-for state (where the original plan executes properly), transition to an imminent-

Footnote 5: Note that it is also possible that modelable states could lead directly to failure with a known transition, or that modelable states could lead directly to failure with transitions that are not known to the planner, or that unmodelable states could lead directly to failure with an unknown transition. We exclude these cases from the “imminent-failure” set because the planner is incapable of classifying them in this way.


failure state (where CIRCA will detect the state and react), or simply remain safe forever

without reaching the goal. The algorithms to build lists for deadend, removed, and

imminent-failure states are described in detail in [1]. To summarize, CIRCA builds a list of

each class of unhandled state, then uses ID3 [24] with that unhandled state list as the set of

positive examples and a subset of the reachable states (depending on unhandled state type)

as the set of negative examples. ID3 returns what it considers a minimal test set, which is

then used as the TAP test to detect that unhandled state class.

When any of the three unhandled state detection TAP tests (for deadend, removed, and

imminent-failure states) returns “true”, the RTS feeds back current state features to the

planner along with a message stating the type of unhandled state detected. The planner then

builds and schedules a new TAP plan which will handle this state, and subsequently

downloads this new scheduled plan to the RTS. By detecting all unhandled states which

may reach failure, presuming we have accurately modeled all TTFs, the system will always

be able to initiate a reaction to avoid impending doom. However, this is predicated on a

new plan being developed to avert disaster faster than disaster could strike. To date, I have

assumed that the planner could simply develop a plan fast enough, so tests have worked

because CIRCA’s planner responded “coincidentally” in real time. “Coincidental” real time

responses are insufficient for critical reactions, particularly when failure is involved. A

major part of my proposed thesis research is devoted to addressing this timeliness problem

when unhandled states arise, as discussed below.

3.3 Proposed Research

CIRCA must consider many aspects of time simultaneously when exercising control in any

time-critical domain. In Section 3.3.1, I briefly recap CIRCA’s current algorithms for

modeling the passage of time in the problem domain. I then propose a more accurate MDP-

based modeling procedure, which will unfortunately require more planning resources for

each problem. To address the accuracy vs. complexity tradeoff which arises, I propose a

hybrid model that contains elements of both approaches.

Originally, CIRCA’s RTS was split from the AIS to allow careful scheduling of reactive

actions to keep the system safe indefinitely. But, the only way the system remained safe

indefinitely was to assume all state transitions and reactions were completely and accurately

modeled, and that all necessary reactive actions could be successfully scheduled. Unlike

traditional robot systems where either a simple movement sequence or a “STOP” action will


allow maintenance of a sort of “safe haven”, there is no such thing as an “indefinitely safe

state set” for dynamic systems like an aircraft in flight. So, we cannot reasonably just

assume our system will be safe forever. Whenever any state is detected (handled or

“unhandled”, as discussed above), we must either have preplanned a reaction set or we

must build a new plan within the time bounds associated with how long the system can

remain safe while executing the current RTS plan.

In Section 3.3.2, I discuss current technology for placing bounds on planning time, and

describe a procedure by which CIRCA’s planner may reason about planning time bounds

and modify planner parameters to actually achieve these bounds. By allowing CIRCA to

explicitly bound deliberation time, I may make more concrete statements regarding

replanning response time, moving past the current claim of strictly “coincidental” real-time

response. However, given that the planner must still trade off planning speed for accuracy,

I have proposed a CIRCA system which promotes the building and storing of contingency

plans so that a quick switch to a contingency plan (produced under less time pressure

offline) will be possible instead of always forcing the planner to build a plan online. At

system startup, CIRCA will perform a significant amount of planning offline (i.e., before

the RTS begins execution of its first plan, such as when the aircraft sits motionless at the

gate). Section 3.3.3 describes the procedure by which the planner may determine which

plans to build offline, and also discusses how these plans will be stored and accessed.

Figure 3-8 illustrates the different scenarios that may arise if CIRCA leaves the “planned-

for” state set. Each oval in the figure represents a set of states, with “planned-for” ovals

representing all states with goal-seeking reactions (i.e., “handled” in Figure 3-4), while the

blank ovals represent unhandled state sets. As shown in the figure, I will presume that

CIRCA can detect some state in each unhandled state set (using the algorithm from Section

3.2.2); otherwise it would not know it had left the planned-for set. Upon detecting the

unplanned-for state, in cases where a TTF occurs quickly (or “fast”), CIRCA will switch to

an RTS cache plan in guaranteed real-time. If the TTF occurs more slowly (i.e., so that the

dispatcher will have time to download the plan, but the planner would not necessarily have

time to build a new plan), CIRCA will download and begin executing a plan stored in the

Dispatcher. If either no TTF exists (i.e., states in the blank oval are deadend only) or else

the TTF is “very slow”, CIRCA will allow online replanning, with time limit

corresponding to the time before the “very slow” TTF may occur.


[Figure 3-8 diagram: three cases for a Detected Unplanned-for State set. “Fast” ttf: Action: Execute RTS Cache Plan. “Slow” ttf: Action: Download & Execute Dispatcher Plan. No ttf or “Very Slow” ttf: Action: Replan. Other labels: Planned-for States, Failure, tt.]

Figure 3-8. Proposed Plan Switching Logic in CIRCA.
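The switching logic in Figure 3-8 amounts to comparing the estimated time before a TTF can occur against two thresholds. The Python sketch below is a hypothetical illustration under that reading: the names Response, choose_response, download_time, and min_planning_time are assumptions, not CIRCA parameters, and the proposal does not specify how the "fast", "slow", and "very slow" boundaries are set.

```python
from enum import Enum


class Response(Enum):
    RTS_CACHE = "execute RTS cache plan"
    DISPATCHER = "download and execute Dispatcher plan"
    REPLAN = "replan online"


def choose_response(time_to_ttf, download_time, min_planning_time):
    """Pick a response for a detected unplanned-for state.

    time_to_ttf:       estimated time before a TTF can occur, or None when the
                       detected state has no TTF (deadend only)
    download_time:     assumed time the Dispatcher needs to download a stored plan
    min_planning_time: assumed minimum time the planner needs to build a new plan
    Returns (response, replanning_time_limit); the limit is None when unbounded
    or when no replanning takes place.
    """
    if time_to_ttf is None or time_to_ttf >= min_planning_time:
        # No TTF or a "very slow" TTF: replan online within the time before the TTF.
        return Response.REPLAN, time_to_ttf
    if time_to_ttf >= download_time:
        # "Slow" TTF: enough time to download a stored plan, not to build one.
        return Response.DISPATCHER, None
    # "Fast" TTF: only a plan already cached in the RTS can react in time.
    return Response.RTS_CACHE, None
```

The sketch assumes min_planning_time is at least download_time, so the three cases partition the possible TTF delays.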

3.3.1 Accurately Representing Time in the Planner’s World Model

In the original version of CIRCA, the planner modeled states without explicit time stamps

in each, capturing the world’s changes over time using feature changes specified by

nondeterministic event transitions, temporal transitions with a known minimum delay, and

actions. I have modified CIRCA to include a model of probability, so that nondeterminism

in the original model has been replaced by likelihood estimates. However, in keeping with

the original CIRCA philosophy, the current model with probability estimates also does not

explicitly model time in states, allowing cycles to minimize state-space size. In both

versions, actions are given explicit timing requirements to guarantee safety based on the

planner’s model of the world.

To illustrate a key deficiency in the current CIRCA modeling process, consider the simple

task of flying an aircraft in a holding pattern, as illustrated in Figure 3-9. Assume we pick

up the planner’s state expansion process when the aircraft is at Location 1, with exactly 1/2

tank of fuel. Note that in this example, I ignore other model features for simplicity.

Assume the CIRCA knowledge base contains only actions to travel between holding pattern

locations, a temporal transition for fuel usage, and temporal transitions to failure,


corresponding with the simplified consequences of the aircraft failing to actively control its

trajectory, possibly resulting in a crash (failure). CIRCA would build the state diagram

shown in Figure 3-10. Define min-∆ as in [22]: the minimum delay before which a

temporal transition can occur, corresponding with the maximum action response time

allowable for preempting the transition. Note that in our current probabilistic model, min-∆

is the time at which a temporal transition’s probability exceeds some small value ε. As

shown in Figure 3-10, the “fly-to-fix-x” actions must be guaranteed to preempt the TTF

from each state (labeled ttf1-ttf4 in the figure). The min-∆ of each TTF is smaller than the

min-∆ of the fuel usage transition, corresponding to the realistic model in which the

minimum time before the aircraft may crash from Location x given no action is smaller than

the time required for the aircraft to deplete its fuel from 1/2 tank to 1/4 tank. As a side

effect of preempting the TTFs, these actions will also preempt the fuel usage transitions

with larger min-∆ (as shown in Figure 3-10, where bold lines depict guaranteed actions and

thin lines represent preempted temporal transitions).
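As a minimal sketch of the probabilistic min-∆ definition, the Python code below finds the time at which a transition's cumulative probability first exceeds ε and checks whether an action preempts the transition. The function names, the bisection search, and the assumption that the cumulative probability function is nondecreasing with a value at time 0 no greater than ε are all illustrative choices, not part of CIRCA.

```python
import math


def min_delta(cdf, eps, t_max, tol=1e-6):
    """Approximate the earliest time at which cdf(t) exceeds eps.

    cdf is assumed nondecreasing on [0, t_max] with cdf(0) <= eps.
    Returns math.inf if the probability never exceeds eps by t_max.
    """
    if cdf(t_max) <= eps:
        return math.inf
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) > eps:
            hi = mid
        else:
            lo = mid
    return hi


def can_preempt(action_response_time, transition_min_delta):
    """An action preempts a temporal transition if it is guaranteed to
    complete no later than the transition's min-delta."""
    return action_response_time <= transition_min_delta
```

In the holding pattern example, a fly-to-fix action whose response time satisfies can_preempt for a TTF also satisfies it for the fuel-use transition, because the fuel-use min-∆ is larger. That is exactly the side effect the text describes.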

[Figure 3-9 diagram: holding pattern with four fixes, LOCATION 1 through LOCATION 4.]

Figure 3-9. Illustration of Aircraft Holding Pattern.

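[Figure 3-10 diagram: four states (FUEL 1/2 at LOCATION 1, 2, 3, and 4) linked in a cycle by the actions fly-to-2, fly-to-3, fly-to-4, and fly-to-1; each state has a temporal transition to failure (ttf-1 through ttf-4) leading to FAILURE and a fuel-use transition leading to the state FUEL 1/4, LOCATION x.]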

Figure 3-10. State Diagram for Aircraft Holding Pattern -- Current CIRCA model.


This side effect introduces a crucial inaccuracy into the model when a state cycle is present

along with preempted temporal transitions: the possibility of completely ignoring a

temporal transition that may eventually happen. In the Figure 3-10 example, the TTFs will

continue to be properly reset and preempted as the aircraft successfully flies around the

pattern. However, fuel continues to be used, and CIRCA has no concept of this fact since

it contains no model of how many cycles will be completed (see footnote 6). The end result is that CIRCA

never believes the fuel quantity decreases the entire time the plane is in the holding pattern,

so CIRCA would believe the system could be safe indefinitely, when it really will

eventually run out of fuel.

Perhaps the easiest fix for this particular example is to model the fuel quantity with many

more discrete values. With sufficient discretization, the fuel-use transition would have a

min-∆ less than all TTFs, avoiding preemption and the problem illustrated in Figure 3-10.

However, it may take, say, 6 minutes (four 1 1/2 minute legs) to completely fly around a

holding pattern once. On a Boeing 747, it might take 3 hours for the fuel to decrease by

1/4 tank, so it would require more than 30 discrete values of fuel per 1/4 tank (or >120

values overall) just to achieve a min-∆ for the “fuel-use” transition slightly less than that of

the TTF that must be preempted. And, if there were an even quicker TTF elsewhere in

state-space, the discretization of fuel (as well as all other slowly-changing quantities) must

increase even further.

So, increasing the level of feature discretization isn’t an optimal solution because it may

expand the state-space size quite a bit, requires the user to be careful that situations such as

that in Figure 3-10 cannot occur, and requires the user to specify many more transitions in

the knowledge base (at least one per new discrete feature value). I propose that adding a

time stamp to each state is a better way to solve this problem. Markov Decision Processes

(MDP) [6] employ a model which attaches a time stamp to each state. In this manner, there

are never cycles in state-space, since any one value of time can occur only once. Figure 3-

11 shows how the aircraft holding pattern problem would map to an MDP model. The

state-space would be quite large, as depicted by the “...”, because instead of allowing a

cycle, the MDP must continue to expand all the states necessary until reaching the goal.

For this example, the MDP would create new states that would look like exact copies of

these states (except for the time stamps), complete with preemption, until the time stamp

Footnote 6: Reference [22] addresses the problem of a persistent temporal transition, but the author only considers the case where there is a clear path along which CIRCA can backtrack. Due to the existence of a cycle, such linear backtracking is impossible, so his approach does not solve the problem illustrated by this example.


was sufficiently large that the fuel-use transition is no longer preempted by the proposed

action. Assuming the holding pattern continued until the fuel quantity actually became 1/4

tank, given the above numbers (e.g., 6 minute holding pattern cycle time; 3 hours per 1/4

tank of fuel), the MDP model would expand 30 copies of the complete holding pattern (or

120 states), then the planner would notice that the fuel-use transition was no longer

preempted and would react accordingly.
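The counts quoted for this example follow from simple arithmetic on the stated numbers (1 1/2 minute legs, four legs per pattern, 3 hours per 1/4 tank). The short Python check below only restates that arithmetic; the variable names are illustrative.

```python
LEG_MINUTES = 1.5                  # one leg of the holding pattern
LEGS_PER_CYCLE = 4                 # LOCATION 1 through LOCATION 4
MINUTES_PER_QUARTER_TANK = 3 * 60  # 3 hours to burn 1/4 tank

# One trip around the pattern.
cycle_minutes = LEG_MINUTES * LEGS_PER_CYCLE

# Pattern cycles flown while the fuel drops by 1/4 tank.
cycles_per_quarter_tank = MINUTES_PER_QUARTER_TANK / cycle_minutes

# Discretization needed for one fuel step to last one pattern cycle:
# this many values per 1/4 tank, hence four times as many per full tank.
fuel_values_per_tank = 4 * cycles_per_quarter_tank

# States an MDP-style expansion creates before the fuel-use transition
# fires: one copy of each of the four locations per cycle.
mdp_states_expanded = cycles_per_quarter_tank * LEGS_PER_CYCLE

print(cycle_minutes, cycles_per_quarter_tank, fuel_values_per_tank, mdp_states_expanded)
```

The two figures of 120 are different quantities: one counts discrete fuel values over a full tank, the other counts time-stamped states expanded during a single 1/4 tank of holding.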

[Figure 3-11 diagram: time-stamped states T = 10 (FUEL 1/2, LOCATION 1), T = 11.5 (FUEL 1/2, LOCATION 2), T = 13.0 (FUEL 1/2, LOCATION 3), T = 14.5 (FUEL 1/2, LOCATION 4), T = 16 (FUEL 1/2, LOCATION 1), ..., linked by the actions fly-to-2, fly-to-3, fly-to-4, and fly-to-1; each state has a ttf leading to FAILURE and a fuel-use transition leading to the state T = y, FUEL 1/4, LOCATION x.]

Figure 3-11. State Diagram for Aircraft Holding Pattern -- MDP model.

The current CIRCA model is intractable in the worst case (i.e., exponential search

required). However, the MDP model is even more intractable, if such a comparison can be

made, because of CIRCA’s current ability to model many sequences of world changes with

cycles. Presuming the planner was never time-limited, I would propose using an MDP-

based model in CIRCA to achieve more accurate plans. Unfortunately, CIRCA’s planner

often must operate reactively, and, as discussed in Section 3.3.2 below, I am working to

impose planning time limits when necessary. To achieve a compromise between the

precise MDP representation and the inadequate model currently in CIRCA, I propose a

model depicted by the example in Figure 3-12. In this proposed CIRCA model, each state

contains a representation of time (T) as in the MDP model, but, instead of creating a new

state for each time step, the planner minimizes the state set by expressing the state time

stamp as a range of times. When the planner detects that all features in a potential new state

match a previously expanded state (except time), the planner decides whether a separate

new state is necessary based on outgoing transition probabilities (as described in the Table

3-2 algorithm). The planner builds up a range of times for which each state of a cycle will

be valid, and branches out of the old state set whenever the relative probability between

outgoing transitions changes significantly. In the Figure 3-12 example, this branch occurs

at the state where T=190 (LOCATION 1), which is the critical time where the fuel-use

temporal transition is no longer preempted by the fly-to-x action.


[Figure 3-12 diagram: states with time ranges T = 10,16,...,184 (FUEL 1/2, LOCATION 1), T = 11.5,...,185.5 (FUEL 1/2, LOCATION 2), T = 13,...,187 (FUEL 1/2, LOCATION 3), and T = 14.5,...,188.5 (FUEL 1/2, LOCATION 4), linked by the actions fly-to-2, fly-to-3, fly-to-4, and fly-to-1 (t < 188.5). The action fly-to-1 (t >= 188.5) branches to a new state T = 190,... (FUEL 1/2, LOCATION 1). Other states shown: T = 191.5,... (FUEL 1/4, LOCATION 1), T = y (FUEL 1/4, LOCATION x), T = y (FUEL Empty, LOCATION x), and FAILURE, reached via fuel-use and ttf transitions.]

Figure 3-12. State Diagram for Aircraft Holding Pattern -- Proposed CIRCA model.

The algorithm in Table 3-2 shows how, at a specific time t, the planner can decide whether

to branch to a new state or return in a cycle to an existing state. This approach clearly saves

time over the MDP approach when many instances of Case 2 exist, since in this situation,

state k is expanded only once after multiple times t have been incorporated into the range

Tk . However, this algorithm does not necessarily save time over the MDP approach,

particularly when many instances of Case 3 exist, where the system must propagate the

new time range through the descendants of the already-expanded state, effectively

performing a computation for each time step as is done in MDP models. Fortunately, even

with Case 3, this algorithm may still save time over the MDP model if a descendant

requires no branch to a new state, because the action selection process and associated

timing have previously been computed.

As shown in Table 3-2, the planner’s decision of whether to branch or not in Case 3 is

based on whether a state’s outgoing probabilities change “significantly”, a rather

ambiguous term. Consider the extreme cases. If CIRCA classifies any minute change in

probability as “significant”, all Case 3 instances will most likely cause branching, and the

state-space will begin to resemble MDP (except for savings possible from Case 2).

Conversely, if CIRCA classifies all probability changes as “insignificant”, one achieves a

model similar to that in the current CIRCA, in which all states with matching features

(excluding T) are considered identical. By varying the probability variation tolerance

(perr ), (i.e., my definition of “significant”), CIRCA can move along the spectrum from

the existing “fast but inaccurate” model to the MDP “slow but accurate” model.


Table 3-2. Overview of Algorithm to handle State Time Stamp Ranges.

Let min(Ti) = the minimum of the time range T for state i;
max(Ti) = the maximum of the time range T for state i;
t = the current time associated with the potential new state j.

Case 1: Features of a new state j do not match those of any existing (old) state:
1) Create new state j with current features.
2) Set min(Tj) = max(Tj) = t.
3) Place state j on the stack to be expanded.

Case 2: Features of a new state j match those of old state k (except T, of course), and state k has not yet been expanded (so no descendants exist yet):
Note: Since CIRCA can later choose an action that considers the entire time range T in state k, no new state j need be created.
1) Set min(Tk) = minimum(min(Tk), t). Note that CIRCA cannot necessarily be assured of expanding states in strictly chronological order, since it will perform best-first search based on a notion of utility discussed further in Section 3.3.2.
2) Set max(Tk) = maximum(max(Tk), t).
3) Leave state k on the stack to be expanded.

Case 3: Features of a new state j match those of old state k (except T), and state k has already been expanded (so descendants may exist):
1) If no temporal transitions match state k (so either no descendants or only one action possible), set min(Tk) and max(Tk) as prescribed in Case 2, since there will be no change in probabilities.
2) Otherwise, consider any action (and any existing deadline) previously selected for state k. Compute all outgoing transition probabilities with respect to the new time t using this old action.
3) If no new probabilities change “significantly” from the old values:
   a) Augment the range T as specified in Case 2 above.
   b) Put k’s descendants with modified time ranges back on the state stack for subsequent consideration (as if the state k were a new state), since this new time could affect descendant action choices/probabilities. Note: this does not mean new states will be created for the descendants; it just means the planner will need to modify the range T and check downstream probabilities.
4) If new probabilities do “significantly” change relative to each other:
   a) Create a branch (as is done at t=190 in Figure 3-12), so that state j becomes a new state.
   b) Set min(Tj) = max(Tj) = t.
   c) Place state j on the stack to be expanded.
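The three cases in Table 3-2 can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the table does not fix: a candidate state is matched against the most recently created state with the same features, an empty old_probs dictionary stands for "no temporal transitions match", and a change is "significant" when any single transition probability moves by more than the tolerance p_err. The names TimedState, visit, and probs_at are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TimedState:
    features: tuple
    t_min: float
    t_max: float
    expanded: bool = False
    old_probs: dict = field(default_factory=dict)  # transition name -> probability at expansion
    children: list = field(default_factory=list)   # descendant TimedState objects


def widen(state, t):
    """Fold time t into the state's time range (Case 2, steps 1 and 2)."""
    state.t_min = min(state.t_min, t)
    state.t_max = max(state.t_max, t)


def visit(features, t, latest, stack, probs_at, p_err):
    """Merge a candidate state (features, t) into an existing state or branch.

    latest:   dict mapping features -> most recently created TimedState (assumption)
    stack:    list of states awaiting expansion or re-checking
    probs_at: callable(state, t) -> dict of outgoing probabilities at time t,
              computed with the action already selected for that state
    Returns (case_label, state_representing_the_candidate).
    """
    k = latest.get(features)
    if k is None:                                  # Case 1: no matching state
        j = TimedState(features, t, t)
        latest[features] = j
        stack.append(j)
        return "case1", j
    if not k.expanded:                             # Case 2: match, not yet expanded
        widen(k, t)                                # k is already on the stack
        return "case2", k
    if not k.old_probs:                            # Case 3, step 1: no temporal transitions
        widen(k, t)
        return "case3-no-temporal", k
    new_probs = probs_at(k, t)                     # Case 3, step 2
    significant = any(abs(new_probs.get(name, 0.0) - p) > p_err
                      for name, p in k.old_probs.items())
    if not significant:                            # Case 3, step 3: merge and re-check descendants
        widen(k, t)
        stack.extend(k.children)
        return "case3-merge", k
    j = TimedState(features, t, t)                 # Case 3, step 4: branch to a new state
    latest[features] = j
    stack.append(j)
    return "case3-branch", j
```

With p_err near 0 almost every Case 3 visit branches and the state space approaches the MDP expansion, while a large p_err merges every matching state as in the current CIRCA model.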

I will work to improve the algorithm in Table 3-2 during future dissertation work. I plan to

concentrate on improving planning efficiency when Case 3 is present, since Cases 1 and 2

have very simple handling procedures already. By considering methods for detecting state

cycles and precomputing time range bounds for each state, I hope to make the planner

better minimize its computations when propagating a new time range (thus potentially new

set of outgoing probabilities) through descendant states.


3.3.2 Limiting Planner Deliberation Time

CIRCA will be performing much of its planning offline, as discussed in Section 3.3.3 below.

However, since CIRCA cannot be expected to compute responses to all states offline,

CIRCA must occasionally plan dynamically (online). I shall assume CIRCA’s planner will

require no deliberation time limiting unless it is operating online, in which case the planner

will know its initial state from RTS state feedback, and it will be constrained by how long

the system will remain safe given the currently executing RTS plan.

As I stated in the introduction, numerous researchers are working on the problem of

limiting planner deliberation time. Because it is such a complex problem, I plan to make

several severe assumptions throughout my dissertation work. Briefly, my assumptions

include the following:

-- Computation of deliberation time and setting of planning parameters takes insignificant time compared to the total time available for the planning process.

-- The scheduling process has known average execution time, given an average number of tasks submitted to it (see footnote 7). So, we can subtract this time and include this cost in the initial value of deliberation time propagated through the planning process.

-- If the planner’s first-cut plan is not schedulable, time required for planner-scheduler negotiations to produce a schedulable plan is insignificant.

With these assumptions, the deliberation time computed only applies to a single planning

cycle, and can be directly used to set planner parameters and guide the planner’s best-first

search. Of course, these assumptions need to be addressed in detail before CIRCA can be

relied upon to always produce plans in a timely fashion, so I discuss future work on each

in Section 3.4.

I propose an approach to limiting planner deliberation time that combines elements from

design-to-time [9] and anytime [7] algorithms. As shown in Figure 3-13, upon receipt of a

state for which plans must be developed online, the planner first computes available

deliberation time. This quantity is used in a design-to-time fashion to set up CIRCA

planning parameters. Finally, the planner executes using a best-first search until

deliberation time (tdelib) expires. Each of these procedures is described in more detail

below.

7 I use average rather than worst-case execution time because the scheduling problem is NP-complete in general (see Section 3.4.2).


[Figure 3-13 (flowchart): initial state(s) -> Compute available deliberation time (tdelib) -> Design-to-time: Set planner parameters (p(tdelib)) -> Anytime: Plan using best-first search until t = tdelib -> planned actions. The arrows between blocks are labeled tdelib and (tdelib, p).]

Figure 3-13. Proposed Algorithm for Limiting Planner Deliberation Time.

Computing Available Deliberation Time

I plan to use the planner’s initial state (fed back from the RTS) to quickly compute an initial

estimate of a planning time limit, then potentially modify this estimate based on

environment changes during planning. Since CIRCA’s main goal is always maintaining

system safety, the limiting factor for deliberation time is how long the system will be

guaranteed to remain safe executing the current RTS plan.

To compute available deliberation time (tdelib), I propose that CIRCA perform an

approximate lookahead projection, using the fed-back initial state and the currently executing

RTS plan to specify action transition choices and timings. The nearest TTF (with

probability above ε) will correspond to the deliberation time limit. This is a very

approximate computation, but will serve the basic purpose of obtaining an approximate

tdelib value for my research.
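The lookahead projection described above might be sketched as follows. This is only an illustration under stated assumptions, not CIRCA's implementation: the transition table, the state names, and the `EPSILON` value are hypothetical, each transition is assumed to carry a minimum delay and a probability under the currently executing RTS plan, and path probabilities are treated independently rather than summed over alternative paths.

```python
import heapq

EPSILON = 0.01  # assumed probability cutoff (the epsilon in the text)


def nearest_ttf(transitions, initial_state, failure_state, epsilon=EPSILON):
    """Return the earliest time at which failure is reachable along a path
    whose probability exceeds epsilon, or None if no such path exists.

    transitions: dict mapping state -> list of (next_state, min_delay, prob),
                 assumed to reflect the currently executing plan.
    """
    # Earliest-time-first search; entries are (elapsed_time, path_prob, state).
    frontier = [(0.0, 1.0, initial_state)]
    best_prob = {}  # highest path probability with which a state was expanded
    while frontier:
        t, p, state = heapq.heappop(frontier)
        if state == failure_state:
            return t  # first failure popped is the nearest in time
        if best_prob.get(state, 0.0) >= p:
            continue  # already expanded earlier with at least this probability
        best_prob[state] = p
        for nxt, delay, prob in transitions.get(state, []):
            q = p * prob
            if q > epsilon:  # ignore paths that are too improbable
                heapq.heappush(frontier, (t + delay, q, nxt))
    return None


# Hypothetical example: the fast route to failure is too improbable to count.
example = {
    "cruise": [("stall", 5.0, 0.2), ("ok", 1.0, 0.8)],
    "stall": [("failure", 3.0, 0.5)],
    "ok": [("failure", 1.0, 0.001)],
}
t_delib = nearest_ttf(example, "cruise", "failure")
```

In this hypothetical table the two-second path through "ok" falls below the cutoff, so the returned limit comes from the slower but more probable path through "stall".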

Setting Planner Parameters as a Function of Available Deliberation Time

In Section 3.3.1, I have described a procedure by which CIRCA can vary its model from

the current approximate but simple state-space to the more accurate but resource-intensive

MDP-based model. A single numeric parameter (called perr in Section 3.3.1) may be

roughly used to control CIRCA’s state-space size, so I propose modifying this parameter

before planning begins based on available deliberation time (tdelib).

To compute the value of perr , CIRCA’s planner should have at its disposal an average

state branching factor based on the available set of temporal transitions, and at least an

approximate function relating perr to the available deliberation time (perr = f(tdelib) ). I

have not yet constructed this function, and doubt that an exact function of this nature exists.

However, an approximate function f(tdelib) may be sufficient because this design-to-time

approach will be used in conjunction with the anytime approach applied during planning.


Again, I will work to improve this function, but do not propose to perfect it during my

dissertation work.

Anytime Planning using Best-First Search

In the original version of CIRCA, search proceeded depth-first, so there was no guarantee

that the resulting goal path was any more desirable than other possible goal paths.

Reference [2] discusses the basic conversion to best-first search. In that work, however,

“best” is based completely on state probability, with state expansion occurring in

decreasing order of state probability.

The planner may combine its knowledge about probabilities, temporal delays, and

proximity to failure to achieve a better mechanism for controlling the best-first search.

State expansion may be ordered by decreasing utility u(s), as shown in Equation (3-1),

where p(s) = probability of reaching state s, t(s) = minimum time before the system can

reach state s, pf(s,n) = probability of reaching failure in n (or fewer) time steps from state

s. The constants a, b, c, and n (if constant) are as yet undetermined. By expanding states

in this order, I hope to plan for the most “important” states, achieving a balance between

state probability, system safety (i.e., prioritizing expansion to handle states that can reach

failure), and the time horizon considered by the planner (i.e., near-term states are handled;

far-term states will be handled by subsequent plans).

u(s) = a * p(s) + (b / t(s)) + c * pf(s, n)          (3-1)
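Equation (3-1) and the anytime expansion loop can be sketched together. The code below is a minimal illustration, not CIRCA's planner: the default weights `a`, `b`, `c` are placeholders (the text leaves them undetermined), t(s) is assumed to be strictly positive, and the `expand` and `score` callables are hypothetical stand-ins for the planner's state expansion and feature lookup.

```python
import heapq
import time


def utility(p, t, pf, a=1.0, b=1.0, c=1.0):
    """Equation (3-1): u(s) = a*p(s) + b/t(s) + c*pf(s, n). Assumes t > 0."""
    return a * p + b / t + c * pf


def anytime_best_first(initial, expand, score, t_delib, clock=time.monotonic):
    """Expand states in decreasing order of score until t_delib seconds
    have elapsed or the frontier is empty. Returns the states expanded,
    in expansion order, so a partial result is always available."""
    deadline = clock() + t_delib
    counter = 0  # tie-breaker so the heap never compares state objects
    frontier = [(-score(initial), counter, initial)]  # max-heap via negation
    seen = {initial}
    expanded = []
    while frontier and clock() < deadline:
        _, _, state = heapq.heappop(frontier)
        expanded.append(state)
        for child in expand(state):
            if child not in seen:
                seen.add(child)
                counter += 1
                heapq.heappush(frontier, (-score(child), counter, child))
    return expanded


# Hypothetical features per state: (p(s), t(s), pf(s, n)).
features = {
    "s0": (1.0, 1.0, 0.0),
    "s1": (0.6, 2.0, 0.0),
    "s2": (0.3, 2.0, 0.9),   # less probable, but close to failure
    "s3": (0.1, 10.0, 0.0),
}
graph = {"s0": ["s1", "s2"], "s1": ["s3"], "s2": [], "s3": []}
order = anytime_best_first(
    "s0", lambda s: graph[s], lambda s: utility(*features[s]), t_delib=5.0
)
```

With these made-up numbers, the state near failure (s2) is expanded before the more probable s1, which is the kind of balance the utility is meant to strike.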

3.3.3 Achieving Timely Reactions via Plan Caching

Ideally, CIRCA could build all required actions for maintaining system safety indefinitely

into a single scheduled control plan. However, as discussed earlier, this is infeasible for

domains with a large set of actions to schedule. Currently, CIRCA builds its control plans

to handle “expected” states, with online planning required both to achieve future subgoals

and to react when “unhandled” states are reached. By building and storing a set of plans in

advance, CIRCA has a better chance of responding quickly to environmental events, thus

improving overall performance in complex domains. In this section, I address specific

questions associated with having plan storage areas in addition to available online planning.

First, I discuss a method by which CIRCA can decide which plans to build offline vs.

online (or reactively). Next, I discuss how CIRCA will split plan storage between the RTS


and Dispatching Subsystem caches. Finally, I discuss issues associated with reorganizing

or rebuilding the plan cache when a world state prompts online reactive planning.

Planning Offline vs. Online?

Ideally, a planner could build and store all necessary plans to achieve its goals offline, so

that it could deliberate as long as necessary to ensure development of all required reactive

plans. I argue that, in a complex domain, CIRCA cannot hope to schedule a complete set

of reactions in a single scheduled plan executing with limited resources. To address this

problem, I propose that CIRCA plans be built extensively offline, caching scheduled plans

for execution should the appropriate situations arise.

CIRCA builds plans for a set of sequential subgoals (determined by the user now;

proposed in Section 2.3 to be created automatically in the future). In some domains, these

subgoals may be structured so that the system will indefinitely remain safe while the

planner builds its next subgoal plan [20]. However, in dynamic domains such as aircraft

flight, CIRCA would be limited to one subgoal for the entire flight if CIRCA required

indefinite safety within each subgoal plan. CIRCA may not be able to schedule such a

comprehensive plan, in which case multiple plans without indefinite safety within each

must be created. Since CIRCA will not have an indefinite amount of time to plan online, I

propose that basic plans for the entire sequence of schedulable subgoals be developed

offline and stored in the Dispatcher plan cache. In this manner, if all goes as expected, the

planner will do all its work before plan execution begins, so no approximate planning will

be necessary.

Unfortunately, if the domain is sufficiently complex, CIRCA also cannot presume to build

and store contingency plans even offline for all modelable states (as illustrated in Figure 3-4).

I propose that CIRCA build its offline plans based on the “reachable” state concept it

uses now, and then use the world state classification discussed above in Section 3.2.2 to

identify states for which contingency plans should exist, versus classes of states for which

it will be acceptable to reactively plan online.

Recall that, in Figure 3-4, I divided the modelable world states into “handled”, “deadend”,

“removed”, “imminent-failure”, and all others (which are considered neither reachable nor

close to failure). I imposed mechanisms to detect the deadend, removed, and imminent-

failure state classes, and replan should one be reached. I propose to build a set of


contingency plans offline to handle all states which are likely or lead quickly to failure (i.e.,

a subset of the removed and imminent-failure states), since these are the states for which a

guaranteed response may be required. Conversely, the deadend states have planned

reactions that keep them from quickly leading to failure. For these states, I propose to

allow the planner to reactively (online) build a new plan, using the algorithms discussed in

Section 3.3.2 to limit deliberation time as appropriate for each particular deadend state

reached.

So far, I have discussed building normal plans offline for all predetermined subgoals, as

well as contingency plans for a subset of the removed and imminent-failure states. It may appear there

will never be much online planning, but this is not necessarily the case. The “goal” in

CIRCA’s contingency plans will be primarily to postpone failure when a removed or

imminent-failure state is reached, so deadend states requiring replanning may frequently

result. When any “unhandled” state results in the current subgoal becoming unattainable

using existing cached plans, a substantial amount of online replanning may be required

either using a different set of subgoals or at least to specifically guide the system back to the

original subgoal path. In these cases, the planning deliberation time bounding procedures

become quite crucial, and we also have to reorganize the Dispatcher cache.

Plan storage in Dispatching Subsystem vs. RTS

Two key features will distinguish the RTS cache from the Dispatcher cache: available

storage space and plan access time. The RTS cache will be relatively small, so that a

minimum number of plans are downloaded for each subgoal and so that the RTS can

switch to a plan within its cache within a guaranteed time bound. Conversely, the

Dispatcher cache will be able to store a much larger set of plans, but these plans will need

to be downloaded to the RTS just before execution, so more time will be required for the

RTS to switch to a Dispatcher plan.

Above, I proposed that CIRCA plan offline for all subgoals, and that for each of these

subgoals, a “startup” plan will be created to handle normal situations, and contingency

plans will be created to handle some subset of the removed and imminent-failure state sets.

Figure 3-14 shows the proposed storage scheme for all cached plans. For each subgoal, a

“startup” plan plus all contingencies will be cached. As shown in Figure 3-14, for the

current subgoal (Subgoal 1 in the figure), the “startup” plan begins execution on the RTS,

and the RTS cache contains plans that specifically handle “unplanned-for” states that lead


quickly to failure.8 All other contingency plans for that subgoal (i.e., states that will lead to

failure, but not so quickly) as well as plans for future subgoals are stored in the Dispatcher.

[Figure 3-14 (diagram): for each of Subgoal 1, Subgoal 2, ..., Subgoal n, states are grouped as “Fast” ttf, “Slow” ttf, or No ttf / “Very Slow” ttf, with tt arrows leading to Failure. Legend: Executing Plan, RTS Cache Plan, Dispatcher Plan.]

Figure 3-14. CIRCA Plan Storage -- RTS ready to begin execution of Subgoal 1.

Because plan switching should occur seamlessly when a subgoal has been achieved, the

RTS plan cache will contain two partitions: one for the current set of critical contingency

plans, and another for the “startup” and contingency plans for the next subgoal in the

sequence. The new subgoal’s set of plans will be sent from Dispatcher to RTS as part of

the current RTS plan. Outdated plans (i.e., either old or unattainable subgoal plan sets)

stored on the RTS will simply be overwritten as the Dispatcher downloads new plan sets.
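The storage scheme of Figure 3-14 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `CachedPlan` record, the threshold constants, and the plan names are hypothetical, and the code models only the snapshot in the figure (the second RTS partition holding the next subgoal's plans is not modeled).

```python
from dataclasses import dataclass
from typing import Optional

FAST_TTF = 2.0    # assumed user-specified threshold (seconds) for a "fast" TTF
SLOW_TTF = 30.0   # assumed threshold (seconds) between "slow" and "very slow"


@dataclass(frozen=True)
class CachedPlan:
    name: str
    subgoal: int
    ttf: Optional[float]  # TTF of the state handled; None for a "startup" plan


def classify_ttf(ttf, fast=FAST_TTF, slow=SLOW_TTF):
    """Classify a time-to-failure using constant, user-specified thresholds."""
    if ttf is None:
        return "none"
    if ttf <= fast:
        return "fast"
    if ttf <= slow:
        return "slow"
    return "very slow"


def place_plans(plans, current_subgoal):
    """Assign each plan to the executing slot, the RTS cache, or the Dispatcher."""
    placement = {"executing": [], "rts_cache": [], "dispatcher": []}
    for plan in plans:
        current = plan.subgoal == current_subgoal
        if current and plan.ttf is None:
            placement["executing"].append(plan.name)    # startup plan runs now
        elif current and classify_ttf(plan.ttf) == "fast":
            placement["rts_cache"].append(plan.name)    # must switch within a bound
        else:
            placement["dispatcher"].append(plan.name)   # downloaded when needed
    return placement


plans = [
    CachedPlan("startup-1", 1, None),
    CachedPlan("contingency-1-fast", 1, 0.5),
    CachedPlan("contingency-1-slow", 1, 10.0),
    CachedPlan("startup-2", 2, None),
    CachedPlan("contingency-2-fast", 2, 1.0),
]
placement = place_plans(plans, current_subgoal=1)
```

Only the current subgoal's fast-TTF contingency lands in the RTS cache here; all plans for Subgoal 2 stay in the Dispatcher until they are downloaded.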

Rebuilding the Cache when “Unplanned-for” states occur

Figure 3-14 shows how the CIRCA plan caches are organized during “normal” plan

execution. However, the planner may need to be invoked online in situations where the

current subgoal set is no longer achievable (e.g., deadend states). In these cases, the

planner will be responsible for selecting a new set of subgoals if necessary, then building

online a new set of plans to be executed. As new plans are built and scheduled online, they

will be downloaded to the Dispatcher along with indexing information regarding the new

sequence of subgoals to be achieved. If time limits prevent the planner from building

comprehensive sets of startup and contingency plans prior to beginning the execution of

these new plans on the RTS, the planner must notify the Dispatcher, which will then send

at least the next “startup” plan to the RTS for execution.

The goal of the planner will be to stay ahead of the system’s progression through subgoals,

so that it will be able to eventually rebuild both startup and contingency plans for all new

8 For my thesis research, I propose allowing user-specified constant values to classify TTFs as “quick”, “slow”, or “very slow”, although CIRCA should eventually calculate these automatically. As a start, I propose classifying only removed and imminent-failure states by TTF delay, and assume all deadend states will not require a contingency plan.


subgoals. To best achieve this objective, when creating the new subgoal set based on

unplanned-for state feedback, the planner will have a significant bias toward directing the

system back to its original goal path. If the planner is successful in this endeavor, online

replanning will be held to a minimum, since once the system is back on the original subgoal

path, the cached plan set will be able to continue execution as normal. Details of an

algorithm to implement this subgoaling/planning heuristic have not yet been developed, but

such an algorithm will exist before my thesis work is complete.

3.4 Future Work

In Section 3.3.2, I discussed limiting planner deliberation time in the context of several

major assumptions associated with limiting CIRCA process execution time. Because

achieving time-limited planning alone is a complex problem, I feel these assumptions are

necessary, but each must be addressed before CIRCA can be used to control complex real-

world systems. In the following sections, I discuss potential methods for relaxing each

assumption. First, I address my assumption that the meta-level computation of deliberation

time and planning parameters is insignificant. Next, I address the problem of predicting

scheduler execution time, which I will assume to be constant or linear in the number of

TAPs to be scheduled. Finally, I address the problem of allotting time to scheduler-planner

negotiations when the scheduler fails with the first set of planned TAPs.

3.4.1 Timely Computation of Planner Parameters and Time Constraints

The CIRCA planning subsystem will contain a meta-level module to compute the available

planning deliberation time and parameters based on the fed-back world state for which a

plan is being developed, as discussed in Section 3.3.2. The algorithm will involve a

lookahead search to identify the closest path to failure possible given the currently

executing plan. Unfortunately, the lookahead search process itself may take a significant

amount of time.

Lookahead search is necessary for identifying the time from any state to failure because

CIRCA bases its knowledge of the world on state transitions. I believe the most efficient

way to find the nearest-term failure is a best-first search where “best” is based primarily on

the time horizon of each state. In fact, time horizon will be the sole criterion I will use for

lookahead during my thesis research. Using time horizon, the more involved the

lookahead search becomes, the longer the path (temporally speaking) will be from the initial


state. However, CIRCA will need to account for time associated with this lookahead

procedure, then truncate the search if it becomes too costly, where “costly” is not known a

priori (since CIRCA is still in the process of computing available deliberation time).

To address this problem, I would begin by identifying a utility function that trades off

additional lookahead search against the disutility of ending the search and beginning the planning

process without a specific deliberation time limit. This utility will most likely involve some

ratio r = (time already spent performing lookahead) / (current time horizon of lookahead).

To compute r, CIRCA would need to include at least an average time to expand one new

state, branching factor, and time step size between states. The ratio r may be used to

identify an approximate amount of lookahead desired in advance, then if the actual r is

different from that predicted using averages, lookahead can terminate earlier or later than

originally estimated.
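One way the ratio r could be predicted and monitored is sketched below. This is a rough illustration under strong assumptions that the text does not make: the lookahead is modeled as a uniform tree, and the stopping rule simply compares r against a hypothetical threshold `r_max`. The actual utility function is left unspecified in this proposal.

```python
import math


def predicted_ratio(t_expand, branching, step, horizon):
    """Predict r = (time spent on lookahead) / (time horizon reached),
    assuming a uniform tree: every state has `branching` successors,
    each `step` time units later, and each expansion costs `t_expand`."""
    depth = math.ceil(horizon / step)
    nodes = sum(branching ** k for k in range(depth + 1))
    return (t_expand * nodes) / horizon


def should_stop(time_spent, horizon, r_max):
    """Truncate lookahead once the measured ratio exceeds r_max
    (r_max is an assumed, user-chosen threshold)."""
    return horizon > 0 and (time_spent / horizon) > r_max


# Hypothetical numbers: 1 ms per expansion, branching factor 2, 1 s steps.
r_pred = predicted_ratio(t_expand=0.001, branching=2, step=1.0, horizon=3.0)
```

Comparing the measured ratio against `r_pred` during the search gives the "terminate earlier or later than originally estimated" behavior described above, at least for this simplified tree model.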

Once the planning deliberation time is computed, CIRCA will compute the planning

parameters then monitor time passage during anytime planning. These processes will cause

no timeliness difficulties, since planning parameter computation will take constant time and

monitoring time during planner state expansion will be accounted for by the anytime

process used to truncate planning.

3.4.2 Reasoning about Scheduler Execution Time

Except in a few special cases, real-time scheduling problems are NP-complete. CIRCA

will not necessarily build a set of TAPs which fall into one of these special cases, so

CIRCA’s scheduling problem is NP-complete in general. My assumption of a constant or linear

scheduling time is particularly bad given this complexity. In the future, if the scheduling

problem remains NP-complete, the anytime approach proposed for planning may be extended to

include both the planner and scheduler, with appropriate tradeoffs used to assess the utility

of continued planning versus starting the scheduler with the existing action set.

Others [19], [21] have worked to optimize the CIRCA scheduler so that it is relatively fast

given TAP maximum periods and worst-case execution times. Heuristics include reducing

TAP maximum periods to shorten the required schedule length (based on the least common

multiple of all assigned TAP periods), and performing utilization and conflict checks prior

to scheduling so that failures may be identified early. Ongoing research efforts [19] are

beginning to allow relaxation of worst-case requirements for low-priority (or low-utility) TAPs


to help the scheduler succeed. This procedure to relax TAP execution time and period

requirements may allow the scheduler to execute within imposed time limits.
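Two of the heuristics mentioned above, schedule length as the least common multiple of TAP periods and a utilization check before scheduling, can be illustrated briefly. This is a sketch, not the CIRCA scheduler: TAPs are represented as hypothetical (worst-case execution time, period) pairs in integer time units, and a utilization of at most 1 is treated only as a necessary condition on a single processor.

```python
from math import lcm  # multi-argument lcm requires Python 3.9+


def schedule_length(periods):
    """Length of the repeating schedule: least common multiple of all periods."""
    return lcm(*periods)


def utilization(taps):
    """Total processor utilization for TAPs given as (wcet, period) pairs."""
    return sum(wcet / period for wcet, period in taps)


def passes_utilization_check(taps):
    """Necessary (not sufficient) uniprocessor condition: utilization <= 1."""
    return utilization(taps) <= 1.0


# Hypothetical TAP set: (worst-case execution time, maximum period).
taps = [(2, 10), (3, 15), (5, 20)]
before = schedule_length([period for _, period in taps])

# Reducing the second TAP's period from 15 to 10 shortens the schedule,
# at the cost of higher utilization.
reduced = [(2, 10), (3, 10), (5, 20)]
after = schedule_length([period for _, period in reduced])
```

In this example the schedule length drops from 60 to 20 time units while utilization rises from 0.65 to 0.75, which shows why period reduction has to be weighed against the utilization check.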

3.4.3 Achieving Efficient, Timely Planner-Scheduler Negotiations

In the original version of CIRCA, whenever the scheduler failed, it returned an

uninformative “fail” message, at which time the planner backtracked through its set of

guaranteed actions in hopes of finding actions that were easier to schedule. I improved

CIRCA’s ability to find a schedulable plan by allowing the removal of improbable states

(and any associated guaranteed actions), while others [19] have enhanced the scheduling

procedures and scheduler-planner feedback to allow the planner to better reason about how

a plan needs to change before attempting to schedule again. Both these additions have

improved the planner-scheduler negotiation process, both in terms of planning speed and in

terms of the final plans produced. However, these additions do not make negotiation time

negligible, contrary to the rather strict assumption I made (Section 3.3.2) to limit planning

deliberation time.

Negotiations between planning and scheduling may involve both replanning and

rescheduling for each iteration. Achieving bounds on planner-scheduler negotiations

involves several issues, including: 1) bounding planning time, 2) bounding scheduling

time, and 3) bounding the number of planning and scheduling iterations required during a

negotiation. I will be addressing 1) during my dissertation work, as described in Section

3.3.2. I discuss how others are beginning to address issue 2) above (Section 3.4.2). To

address issue 3), CIRCA would have to be able to reason about the convergence properties

associated with the negotiation process (i.e., limits on how plan schedulability improves as

a function of number of planner-scheduler iterations). I have not yet carefully considered

how such properties may be established for CIRCA.


=====================================================

CHAPTER 4

INTERFACING PLANNING, REAL-TIME,

AND CONTROL SYSTEMS TECHNOLOGIES

=====================================================

In this section, I describe methods to improve CIRCA performance by incorporating

methods from planning, real-time, and control systems research areas. Section 4.1 gives a

brief overview of the strengths of each field along with a generic model each system uses

for its world. Section 4.2 describes how I believe all three of these technologies may be

generically combined, and describes how the proposed CIRCA maps to this generic model.

Then, for the remainder of this section (Sections 4.3-4.5), I describe model component

interfaces in the context of CIRCA.

To address specific issues in the interfaces, I consider the inter-module links in

CIRCA in the context of pairwise combinations of planning, real-time algorithms, and

control systems. To date, CIRCA has focused primarily on the interface between planning

and real-time scheduling technologies. In Section 4.3, I describe both existing and

proposed methods to combine AI planning and plan execution under the constraints present

in systems requiring real-time response guarantees. Section 4.4 proposes a method for

combining planning and control technology. The planner must contain sufficient

knowledge describing how the controller functions, then must build a sufficiently accurate

world model to allow proper selection of actions to be executed. I present my ideas which,

if implemented properly, will allow the planner to efficiently build its world model from

symbolic feature values while maintaining the precision required to construct a valid plan.

The other pair to interface, control and real-time systems, has no section devoted to it in

this proposal because control engineers already incorporate real-time constraints when

implementing their systems. Typically, the controller and state estimator functions will

each have a constant predetermined execution period based on system dynamics and

controller convergence properties. Also, the worst-case execution time is relatively easy to

compute and will most likely be close to the average execution time because typical control

loops involve only limited branching (e.g., based on controller mode). Because of these

properties, a valid execution schedule for controllers and state estimators can be built

offline (even by hand). CIRCA models the controllers and state estimators as part of its


environment, so it effectively assumes the offline scheduling will allow the controllers and

state estimators to operate as described in its knowledge base and in the Abstraction

Subsystem interface to the environment.

For my thesis work, I will be attempting to develop a generic interfacing strategy that can

easily accommodate modifications to any of the basic planning, real-time scheduling, and

control algorithms placed into the system. I plan to address basic interfacing issues in the

context of a complex task: automated aircraft flight. However, due to time and

accessibility constraints, I will be limited to a fairly simple set of controllers and knowledge

base transitions. In Section 4.5, I describe research and testing that will be required to

fully verify that this interface is sufficient to handle the spectrum of available planners,

schedulers, and controllers.

4.1 Background

In this section, I provide a basic description of how typical planning, real-time, and control

systems view the world, as well as what each system can compute and what each assumes

is available or true in the world. By showing that each system has concentrated on

different but equally important aspects of the autonomous control problem, I hope to

convincingly illustrate exactly why one would want to combine the three technologies.

These system descriptions are used in later sections to show how the three systems can be

usefully combined.

4.1.1 AI Planning Systems

Figure 4-1 illustrates the components and interconnections for a typical AI planning system

[27]. System input is some sort of user-specified domain knowledge, which may be

represented in the form of rules or transitions, preferences, fitness functions, etc. The

planner then typically performs some sort of search process to develop one or more actions

it deems appropriate based on domain knowledge and possibly the current system state.

When the plan of one or more actions is complete, it is executed by the plan executor,

which then will cause the system to act in its environment, where “environment” is loosely

defined, ranging from internal computing tasks (e.g., database management) to directly

operating an actuator that causes physical movement in the environment. The “state”,

specified in the language of the planner, may be fed back to the executor and planner to

help decide which action or plan to build or execute next.


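[Figure 4-1 (block diagram): Domain Knowledge -> Planner -> (Plans) -> Plan Executor -> (Actions) -> Environment, with State fed back from the Environment to the Plan Executor and Planner.]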

Figure 4-1. Traditional AI Planning System.

The main strength of most AI planning systems is their ability to take high-level knowledge

that might be written by some domain expert, and efficiently search through the space of

possible actions and states to determine appropriate high-level reactions as a function of

state. One key to planning efficiency and accuracy is effective discretization of the

environmental properties, often using a symbolic state feature representation. To apply a

planning system to any domain, the system assumes that its relatively simple set of actions

will be sufficient for controlling the system, so that each “action” may require significant

processing (hidden in the “Environment” in Figure 4-1) before any actuator commands are

developed. Also, the state fed back to planner and executor is not a set of sensor values,

but instead processed sensor data that has been abstracted to the format used to represent

state in the planner and knowledge base.

4.1.2 Real-Time Systems

Real-time algorithms [13] focus their efforts on allocating computational resources to

provide guarantees regarding system performance. As shown in Figure 4-2, typical input

includes a set of tasks to be executed, along with a set of execution constraints. Tasks

correspond with sets of functions that will require system resources. Constraints include

minimum parameters to be achieved after scheduling that task, including features such as

task periods, deadlines, and backups (replications/versions) for reliability. For a system

with distributed resources (e.g., multiple CPUs, I/O channels, etc.), the real-time algorithm

develops an initial task allocation based on available resources and task constraints, then

attempts to schedule these tasks, negotiating between scheduling and allocation as needed.

Because not all task constraints may be possible to satisfy, feedback regarding which task

constraints were impossible may be made available, in case the task list developer wishes to

modify the input task list. Once the schedule has been developed, execution begins, and

online algorithms feed back any changes in available resources (e.g., a CPU fails) to the

allocation and scheduling modules, which may then recompute the task allocation and

schedule(s) if necessary.


[Figure 4-2 (block diagram): Task List w/ Constraints -> Task Allocator -> (Task Sets) -> Scheduler -> (Task Schedule) -> Execution Platform. Other labels: Available Resources, Status, Negotiation.]

Figure 4-2. Traditional Real-Time System.

Task allocation and scheduling are crucial to systems in which constraints must be

guaranteed (e.g., an airplane controller must operate at a certain frequency to ensure

stability). Many systems have been carefully designed by hand to meet real-time

constraints, but allocating/scheduling manually is very expensive and is virtually

impossible to do online under real-time constraints. Thus, the strength of working in a real-time

paradigm is the ability to efficiently manage resources dynamically for any task set.

However, as shown in Figure 4-2, the real-time system assumes task specification in the

form of a specific set of constraints such as deadlines, etc., as well as assuming that

execution platform computational resources are predictable and easily measured.

4.1.3 Control Systems

Figure 4-3 shows the components and their interconnections for a traditional feedback

control system [14]. The input to the system includes a reference trajectory (r(t)) to be

tracked and sensor feedback (y(t)) from the plant (or environment). The output from the

system is a set of actuator commands (u(t)) which operate on the plant. To compensate for

imprecision in sensor measurements, a state estimator is invoked to compute the best

estimate of state (ŷ(t)) from measured state (y(t)), applied actuator forces (u(t)), past state

estimates, and a dynamical model of the system. The trajectory offset or error (e(t) = r(t) -

ŷ(t)) is then used by the controller to compute the next actuator value set.

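[Figure 4-3 (block diagram): the reference r(t) (+) and the state estimate ŷ(t) (-) are differenced to form e(t), which drives the Controller; the Controller output u(t) drives the Plant; the Plant output y(t) feeds the State Estimator, which produces ŷ(t).]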

Figure 4-3. Traditional Control System.
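The loop in Figure 4-3 can be made concrete with a small discrete-time simulation. This is only a sketch under assumptions not found in the text: the plant is a hypothetical scalar first-order system dx/dt = -a*x + b*u, the estimator is a fixed-gain observer, the controller is purely proportional, and all gains and the step size are arbitrary illustrative values.

```python
def simulate(reference, steps, dt=0.01, a=1.0, b=1.0, k_p=9.0,
             l_gain=0.5, noise=None):
    """Simulate reference tracking for the assumed plant dx/dt = -a*x + b*u.

    reference: callable t -> r(t)
    noise:     optional callable t -> sensor noise added to the measurement
    Returns the final plant state and a history of (t, y, y_hat, e, u).
    """
    x = 0.0       # true plant state
    x_hat = 0.0   # state estimate (plays the role of y-hat in the figure)
    history = []
    for k in range(steps):
        t = k * dt
        y = x + (noise(t) if noise else 0.0)       # sensor feedback y(t)
        x_hat = x_hat + l_gain * (y - x_hat)       # estimator: correct with measurement
        e = reference(t) - x_hat                   # error e(t) = r(t) - y_hat(t)
        u = k_p * e                                # proportional actuator command u(t)
        x = x + dt * (-a * x + b * u)              # plant update (forward Euler)
        x_hat = x_hat + dt * (-a * x_hat + b * u)  # estimator: predict with the model
        history.append((t, y, x_hat, e, u))
    return x, history


final_state, trace = simulate(lambda t: 1.0, steps=2000)
```

With these illustrative gains the plant settles near 0.9 for a unit reference rather than at 1.0, because a purely proportional law leaves a steady-state offset; a practical controller design would address this.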


The control systems field is quite mature compared to both AI planning and real-time

systems. After the plant dynamics have been sufficiently described, well-defined

mathematical methods may often be employed to design a controller set (linear or nonlinear)

which will provide response guarantees in terms of stability and tracking. However, such

guarantees are possible only if a certain minimum set of sensors providing y(t) is available,

and if the reference trajectory r(t) is actually achievable (i.e., within the set of controllable

or at least stabilizable regions of the controller’s state-space). Input r(t) is usually

considered a continuous function, and is specified in the same form as the controller state,

including quantities that fully describe the system’s position and velocity in all dimensions

(translational and rotational).

4.2 Combined Planning, Real-time, and Control System

In this section, I address the question of how an autonomous system might interface the

planning, real-time, and control systems algorithms described above. I propose a generic

method of connecting these three systems to accomplish the overall task of autonomous

control, illustrating operation in the context of a piloted vehicle. The discussion in this

section occurs at a fairly high level, with more interface details described below in Sections

4.3 and 4.4.

In Section 4.1, I identified input quantities that are assumed to exist: domain knowledge

for the planner, task and constraint sets for the real-time system, and reference trajectories (r(t))

for the control system. In each case, the inputs must be “acceptable” before the system will

succeed (e.g., domain knowledge must be sufficiently complete and correct; task set must

be schedulable; r(t) must be achievable). I base my work on the assumption that the easiest

of these three types of inputs for a human user to provide is a comprehensive domain

knowledge base, and I further assume that a symbolic knowledge representation will assist

the user in this endeavor. Although providing comprehensive domain knowledge is

difficult, the more difficult alternative is to build by hand a comprehensive set of tasks,

negotiation functions, and constraint sets for the scheduler, and/or to construct by hand a

comprehensive set of acceptable reference trajectories for all possible objectives (or goals)

the system may be trying to achieve.9

9 Many systems currently exist where sufficient backups (e.g., human pilots) allow system input that may not be comprehensive. However, I am working to achieve safe, fully-automated aircraft flight, in which case any gaps in input completeness must be explicitly detected and handled by the system itself.

Consider a system in the form of Figure 4-4. Major components from planning, real-time,

and control systems are connected such that the only overall system inputs are the

knowledge to the planner and the sensor feedback from the plant (or environment). The

only system module missing from this diagram is the “allocation” module from real-time

systems. During my thesis research, I will be using uniprocessor scheduling algorithms to

simplify the overall problem, but task allocation may be straightforwardly added in future

research.

Working from left to right in Figure 4-4, the first module is the planner, which uses its

input knowledge and any feedback from the plan executor (explained below) to construct

one or more plans. These plans are scheduled then executed by the plan executor, which

corresponds to both the “plan executor” module in Figure 4-1 and the “execution

platform” module in Figure 4-2. The basic responsibility of the plan executor is to execute

its planned actions within real-time constraints. Action commands are translated (Section 4.4.1)

to the continuous language of reference trajectories, then the controller computes actuator

commands for the plant (i.e., environment). Sensor feedback is sent to the state estimator,

which provides the current state both to the controller and plan executor. The plan executor

uses this feedback to determine the planned action to execute next and provides any state

feature feedback to the planner.

[Figure 4-4 block diagram. Blocks: Planner, Scheduler, Plan Executor, Controller(s), State Estimator, Plant (Environment).]

Figure 4-4. Combined AI Planning / Real-Time / Control System.

The Figure 4-4 depiction is intended to show how basic planning, real-time, and control

strategies may be combined. Because I am working in the context of CIRCA, I will begin

by comparing the proposed architecture for CIRCA (recapped in Figure 4-5) with Figure 4-

4. Then, in Sections 4.3 - 4.5, I speak in the context of CIRCA modules. The Figure 4-4

planner and scheduler map directly to CIRCA’s Planning and Scheduling Subsystems.

The Figure 4-4 plan executor maps to a combination of the CIRCA Real-Time Subsystem

(RTS) and Abstraction Subsystem (ABS), where the RTS actually executes the plans, but

the ABS does all language conversion required to make the connection between

controller(s)/state estimator(s) and the plan executor. As shown in Figure 4-5, CIRCA

considers the controllers and state estimators to be part of its environment. This

representation was chosen to allow flexibility in controller and state estimator design for

each domain, and simply means that CIRCA researchers (such as myself) will concentrate

on the ABS / controller interface language, not the actual inner workings of controllers or

state estimators.

[Figure 4-5 block diagram. Blocks: Knowledge Base (initial state / goals, temporal/action transitions); Planning Subsystem (Planner, subgoal creation/storage, feedback handler); Scheduling Subsystem (Schedule Manager, scheduling routines, TAP schedules); Dispatching Subsystem (plan message building, scheduled plan storage, plan downloading, contingency plans); Real-Time Subsystem (TAP plan executor, environment interface); Abstraction Subsystem (Abstractor, De-Abstractor); "Environment" (sensors, actuators, state estimators, controllers). Labeled signals: TAP plans, TAP list w/ timings, plan handling directives, status-1, status-2, status-3, handshake, feature value data, action commands, controller & actuator commands, sensor & state data.]

Figure 4-5. CIRCA -- Proposed Version.

4.3 Interfacing Planning and Real-Time Systems

To date, CIRCA research has focused on the combination of an AI planner, real-time

scheduler, and plan execution module (RTS). Plans are built then scheduled such that

critical actions have associated real-time response guarantees. However, the planner

currently operates without time bounds, assuming the system will remain safe long enough

for any new plans to be built. As discussed in Section 3.3, I believe this is an

unreasonable assumption for complex domains, so I have proposed a method to impose

real-time constraints on the planning process (Section 3.3.2), assisted by the use of offline

planning and plan storage in the Dispatching Subsystem and RTS cache (Section 3.3.3). I

have classified the modelable states (Section 3.2.2) such that the most time-critical

responses are available either in the executing plan or a cached plan, leaving more (but not

indefinite) time for replanning should a less-time-critical “unhandled” state be reached.

Using the methods described in Section 3, I believe CIRCA will be much better prepared to

cope with the unruly combination of the intractable planning problem and a complex

domain requiring real-time response. Although issues in guaranteeing response times for

the planner deliberation time calculation process, scheduling, and planner-scheduler

negotiations [19] still need to be addressed (as discussed in Section 3.4), I believe the

combination of methods proposed in this document will provide the basic links between the

planning / plan execution processes and real-time scheduling / execution system.

4.4 Interfacing Planning and Control Systems

In this section, I describe how a planning and control system may be interfaced in the

context of CIRCA. For my thesis research, I will assume all controllers and state

estimators are well-understood and execute reliably. The basic interface between planner

and control system occurs in two places: 1) RTS action output is sent to the controllers via

the CIRCA Abstraction Subsystem (ABS), and 2) State estimator and discrete sensor

values are sent as feedback to the RTS and planner via the CIRCA ABS. Since the planner

is selecting actions that will issue commands to the controller, the planner must have a

model of controller behavior in its transitions as well as a sufficiently accurate world model

(i.e., expanded state set) to ensure that appropriate actions will be sent to the controller. In

this section, I first discuss the functionality required in the ABS to translate between control

system and planning languages. Next, I describe how the CIRCA planner “action scoring”

(or reward assignment) functions may help the planner select only actions that are feasible

given the computed state and modeled controller properties.

4.4.1 CIRCA Abstraction Subsystem

As shown in the CIRCA system diagram (Figure 4-5), the control system components

connect through the Abstraction Subsystem (ABS) to the remainder of CIRCA. As

discussed in Section 2.2, the ABS must execute with real-time guarantees; however, for

my thesis research, I will be assuming the static ABS tasks are pre-scheduled by the user.

The two main ABS tasks include maintaining a current abstract representation of state for

CIRCA’s RTS and planner, and translating RTS action commands to appropriate controller

or discrete actuator commands.

To maintain a current abstract state estimate, discrete sensor and state estimator values must

be converted by the ABS into feature-value pairs that can be used by the planner and RTS.

Currently, the ABS directly reads sensor values from the environment. Since each sensor

uniquely describes one CIRCA feature, CIRCA’s ABS performs a simple value

comparison to select the appropriate feature value “bin” containing the current sensor

reading. This 1:1 correspondence will exist throughout my thesis work, allowing the same

type of value comparison functions to be used, regardless of whether a sensor or state

estimator is supplying the data.
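
The value comparison described above can be sketched in Python as follows. This is only an illustration, not the CIRCA ABS code: the bin edges, the function names, and the rule of choosing the largest bin value that does not exceed the reading are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical bin values for an "altitude" feature; not taken from CIRCA.
ALTITUDE_BINS = [0, 1000, 2000, 3000, 4000, 5000]


def to_feature_value(reading, bins):
    """Return the largest bin value that does not exceed the reading.

    Readings below the first bin or above the last bin are clamped
    to the nearest end of the list (an assumption for this sketch).
    """
    index = bisect_right(bins, reading) - 1
    index = max(0, min(index, len(bins) - 1))
    return bins[index]


def abstract_state(readings, bin_table):
    """Convert raw readings into feature-value pairs, one feature per source."""
    return {
        feature: to_feature_value(value, bin_table[feature])
        for feature, value in readings.items()
    }
```

Because each sensor or state estimator output maps to exactly one feature, the same function serves both data sources; only the table of bin values differs from feature to feature.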

The planner knows about high-level actions, such as “climbing to a cruise altitude” in an

aircraft, but has no knowledge of the numerical values associated with that action. The

second ABS task is to take an abstract action command, translate it into the language of the

controller or discrete actuator, then output it to the appropriate place in the “environment”.

The ABS may be required to perform functions such as the “guidance” functions described

for aircraft in Section 5.1.1, in which case action translation would involve using a

dynamical model of the system being controlled to convert the high-level command to

appropriate controller reference commands. However, I cannot develop a complex

guidance system during my thesis research, and I would want to borrow such a system from

domain experts anyhow. Thus, I will be using a much simpler set of functions in the ABS,

moving “guidance” (see Sections 5.1 and 5.3) into the CIRCA environment. In the current

(and proposed) ABS models, a lookup table is used to convert the text RTS action

command (e.g., “descend to flight level 100”, “autoland”) to the appropriate set of numerical

values to be output to the environment/controllers (e.g., “reference altitude = 10000 ft.”,

“navigation frequency = 109.0; controller mode = autoland”).
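
A lookup table of this kind might be sketched in Python as shown below. The action keys, parameter names, and error handling are hypothetical; only the numerical values (10000 ft, 109.0, autoland mode) come from the example above.

```python
# Hypothetical ABS lookup table: abstract RTS action -> numeric outputs.
# Key and parameter names are illustrative; values follow the text's example.
ACTION_TABLE = {
    "descend": {"reference_altitude_ft": 10000.0},
    "autoland": {"navigation_frequency": 109.0, "controller_mode": "autoland"},
}


def translate_action(action):
    """Return the controller/actuator settings for an abstract action."""
    try:
        # Copy so that callers cannot modify the table itself.
        return dict(ACTION_TABLE[action])
    except KeyError:
        raise ValueError(f"no ABS translation for action {action!r}") from None
```

Raising an error for an unknown action is one possible design choice; it makes a gap in the table visible rather than silently sending nothing to the controllers.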

4.4.2 “Action Scoring” in CIRCA

For each state expanded, a planner must select an action (if any) based on the goals to be

achieved, including failure avoidance. This decision process should be made quickly

because the planner must run this algorithm once for each state it expands. In CIRCA, the

set of possible action choices is initially narrowed via action transition precondition

matching. However, many actions may remain in this set, including the choice to execute

no action (NO-OP) so long as the state does not transition directly to failure (i.e., no TTF is

present). CIRCA uses “action scoring” to determine the utility of each action (including

NO-OP if applicable), then selects the highest-utility action of the set. Once the best action

has been selected, the CIRCA planner computes the periodic timing requirements for that

action based on TTFs, then uses this value during computation of all descendant state

probabilities.

In previous versions of CIRCA [22], “action scoring” was based on lookahead search.

For each action whose preconditions matched the current state, the action scoring function

searched ahead a user-specified number of levels, expanding a small state tree of

descendant states based only on applicable temporal transitions (since actions had not yet

been selected for these states). The primary purpose of lookahead was to determine if the

proposed action would allow the system to come close to failure at a future time, with a

secondary purpose of seeing if goal features may be attained in the future. Unfortunately,

in an inherently dangerous domain such as aircraft flight, failure is almost always possible

in some near-term scenario, so NO-OP tended to gain a significant advantage over other

actions, often incorrectly (e.g., the aircraft preferred to never leave the ground, since it was

“safe” there). Also, since the lookahead used only temporal transitions, all of which might

be preempted with later selection of guaranteed actions, there was no way to predict

whether any past the first-level descendant state would actually be reached. Finally,

lookahead search was a very expensive algorithm to use, especially since action scoring is

performed for each state.

After incorporating the initial state probability model (described in Section 3.2.1), I

abandoned the lookahead search algorithm in favor of a fast action scoring procedure that

simply considered whether the direct descendant of the action achieved any new goal

feature and if that action allowed preemption of any TTF in the current state. This

procedure is much faster than the lookahead process, but works well only when goal

features can be achieved in one action, since CIRCA currently does not contain the ability

to assess a state feature’s “proximity” to the goal value.10
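
The limitation of the one-step procedure can be illustrated with a small Python sketch. The scoring rule below (one point per newly achieved goal feature, plus one point if a TTF is preempted) is an assumed stand-in, not the actual CIRCA scoring function, and the state representation is hypothetical.

```python
def one_step_score(state, successor, goals, preempts_ttf):
    """Score an action from its direct descendant state only.

    state, successor, goals: dicts mapping feature name -> value.
    preempts_ttf: True if the action preempts a TTF in the current state.
    The unit weights are assumptions made for this sketch.
    """
    newly_achieved = sum(
        1
        for feature, goal_value in goals.items()
        if successor.get(feature) == goal_value and state.get(feature) != goal_value
    )
    return newly_achieved + (1 if preempts_ttf else 0)


current = {"altitude": 0}
goals = {"altitude": 5000}
climb_score = one_step_score(current, {"altitude": 1000}, goals, False)
noop_score = one_step_score(current, {"altitude": 0}, goals, False)
```

Under these assumptions, a climb to 1000 and a NO-OP receive the same score, because neither reaches the goal value in a single step.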

I believe a key to better action scoring is to help CIRCA compute proximity relationships

among feature values.11 For example, suppose a feature “altitude” were modeled with a

10 Lookahead enabled the action scorer to notice goal achievement further downstream so long as, after this initial action, only temporal transitions were required to achieve the goal feature. However, if more than one action were required to achieve a goal feature, neither the lookahead nor the simple “one-step” scoring algorithm would be able to notice that this first action brought the system closer to its goals.

11 Proximity relationships are not new to planning, but are new to CIRCA. Quantities such as Manhattan Distance for the 8-puzzle problem have been used to “score” actions in many systems. However, in such systems, one metric is typically used to measure proximity for all features. I propose allowing a separate, dynamically-based metric for each feature or group of features.

symbolic value set of {0, 1000, 2000, 3000, 4000, 5000}. Also, suppose the current state

feature value is (altitude 0), and the goal value is (altitude 5000). Currently, the only way

CIRCA can determine that an “altitude” feature value of “1000” is between “0” and “5000”

is by stringing together a set of temporal transitions that describe altitude changes when

climbing. If, instead, CIRCA were able to employ simple mathematical comparisons, it

would certainly be able to quickly see that an action leading to an altitude of 1000 is

“closer” to the goal (difference 5000 - 1000 = 4000) than doing no action (difference

remains 5000 - 0 = 5000). With discrete transitions for climbing 1000 units of altitude,

lookahead search would have needed 5 levels to discover the goal, while my “newer”

algorithm that does not use lookahead will not even realize a “climb” action would help

achieve the goal. However, if the action scoring function knew relations between different

feature values, then my “newer” scoring algorithm could simply notice that, although the

“climb” action did not immediately reach the goal, it did bring the system “closer” to the

goal by a certain fraction which may be used for scoring purposes.

I propose adding mathematical functions to the CIRCA knowledge base that will allow

computation of proximity relations between the symbolic feature-value pairs. The

functions will mathematically compare an input feature value with the goal, returning a

“utility” between 0-1 describing how close the feature is to the goal.12 Revisiting the

altitude example described above, define the “utility” function for altitude as shown in

Equation 4-1. Using this equation, the initial feature utility is 0, but the initial “climb”

action that creates an altitude of 1000 will have utility 0.2, so “climb” will be selected over

“NO-OP”.

$$\text{altitude\_utility} = 1.0 - \frac{\left|\text{current\_altitude} - \text{goal\_altitude}\right|}{\max(\text{altitude}) - \min(\text{altitude})} \qquad (4\text{-}1)$$
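
A minimal Python sketch of the altitude utility in Equation 4-1 follows. It assumes the utility is one minus the absolute distance to the goal divided by the range of the feature's value set, which reproduces the utilities of 0 and 0.2 quoted above. The function names and the action-selection helper are hypothetical.

```python
def feature_utility(value, goal, lowest, highest):
    """Utility in [0, 1]: 1.0 at the goal, falling linearly with distance."""
    return 1.0 - abs(value - goal) / (highest - lowest)


def best_action(outcomes, goal, lowest, highest):
    """Pick the action whose resulting feature value has the highest utility.

    outcomes maps an action name to the feature value it produces.
    """
    return max(
        outcomes,
        key=lambda action: feature_utility(outcomes[action], goal, lowest, highest),
    )


# Altitude example from the text: value set {0, ..., 5000}, goal 5000.
outcomes = {"NO-OP": 0, "climb": 1000}
chosen = best_action(outcomes, goal=5000, lowest=0, highest=5000)
```

With these inputs the NO-OP outcome has utility 0, the climb outcome has utility of about 0.2, and `chosen` is "climb".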

So far, I have only addressed the issue of relating feature values to numerical values to help

select actions. What does this buy in terms of interfacing a controller with a symbolic

planner? In one word: flexibility. When CIRCA’s tasks include issuing commands to

controllers, state features will include values describing high-level controller parameters or

modes (e.g., my simple “takeoff”, “cruise”, “autoland” set for aircraft control). While

utility functions normally return values describing feature proximity to the goal, controller

12 For binary-valued features (e.g., values “True” and “Nil”), utility may be simply defined as 1 if the feature value matches its goal value and 0 otherwise.

utility functions may be designed to consider the entire state and incorporate items such as

proximity to the edge of the controllability envelope as well.13

4.5 Future Interface Work

I address remaining interface issues in this section, first by describing how the choice of

planning, real-time, and control algorithms may change the interface, then by describing

future work that must be done to fully validate the proposed CIRCA-based interfaces.

4.5.1 Effects of System Evolution

I have tried to present a fairly general picture of planning, real-time, and control systems

technology, so that the basic interface will not be invalidated by the evolution of techniques

available in any of these areas. However, certainly the choice of algorithms used for

planning, real-time, and control computations will have some effects on the interface

between the systems. I cannot predict how each system type will evolve in the future, so

instead I describe how changes in each system will affect the others, in the context of the

proposed CIRCA system.

Effects of Controller Modifications

A controller modification may cause two types of changes in the rest of the CIRCA system:

different state feature values (because different data is needed or available), and changes in

the planner’s knowledge regarding controller functionality or capabilities. So long as the

new controller properties can be effectively described in a knowledge base and reliably

monitored during execution, the interface between the CIRCA modules and the controller

will not need to change.14

Effects of Modifying Real-time Computations

Currently, CIRCA performs uniprocessor scheduling, and only considers CPU resources.

The algorithm used to schedule the CPU for CIRCA’s RTS can be easily modified without

13 I have not yet developed a more precise definition for these utility/proximity functions, so I am certainly looking for ideas.

14 This flexibility is in part due to the decision to place the controller in CIRCA’s “environment”, then express controller functionality in terms of planner features, state transitions, and associated “action scoring” functions.

affecting the rest of the system, since the main output is an ordered list specifying the

schedule [21].

Natural extensions to CIRCA’s real-time scheduling capabilities include the implementation

of algorithms to perform task allocation and to schedule additional resources (e.g., network

traffic, I/O) as well as CPU usage. Adding these new algorithms should not cause

significant modification to other CIRCA algorithms, except for allowing functions to be

split among different processors.

Effects of Modifying the Planner

Because it will be the most well-developed subsystem, CIRCA’s planning system is perhaps the most

inflexible to modifications. Any “new” planner put into CIRCA would still need to reason

about the real-time requirements of planned actions so the interface to the scheduler could

be nearly identical to the current interface [19].

Unfortunately, switching the planner may require different knowledge base structures, so

both environmental and controller properties would need to be modified to fit into the new

knowledge base. Additionally, the proposed algorithm to limit planning time (Section

3.3.2) relies on the planner’s use of a best-first search strategy to allow anytime bounding

of planning time, as well as the notion of using some variable parameters (e.g., state

probability accuracy) as a mechanism for approximate planning using the design-to-time

approach. Many planners do not have analogous heuristics, in which case the algorithm to

bound planning time would need to be modified appropriately.

As discussed in Section 4.4, a key to using symbolic state representation during planning

for continuous-valued variables is employing an “action scoring” function. Any planner

that could be used would require some set of actions, but, if the action scoring process is

an integral part of the system, it may be difficult to include a hybrid symbolic-numeric

calculation process like that I will be developing for the CIRCA planner.

4.5.2 Testing the Interfaces

During this research, I will be performing only limited tests of the interfaces, so I expect

much more would need to be done. Others [19] have described work that still needs to

be done with respect to testing the CIRCA planner - scheduler interface. I believe the key

to testing the other interfaces is to build a more complex knowledge base and controller set,

then run a very diverse set of tests. Given the numerous possibilities for information

feedback, plan switch procedures (e.g., RTS cache, Dispatcher, replanning), and types of

action scoring functions used, I expect more rigorous sequences of testing can be used to

both find bugs in the existing code, and possibly even point to algorithmic deficiencies that

will need to be addressed in the future.

The CIRCA-based interface between planning, real-time, and control systems is not

specific to the fully-automated aircraft control problem. In future tests, CIRCA should be

given the chance to control different domains, with their own ideas about state values,

controllers, and knowledge base transition properties. If CIRCA performs well in these

domains, then one could better make claims about the generality of CIRCA’s algorithms.

Conversely, if the new domain produces problems for CIRCA, then these tests may result

in improvements to the CIRCA algorithms that have not been foreseen during my research.

=====================================================

CHAPTER 5

ACHIEVING SAFE, FULLY-AUTOMATED

AIRCRAFT CONTROL

=====================================================

My primary long-term research goal is to help achieve safe, fully-automated flight. In this

section, I describe the fully-automated flight control problem and propose a simplified

model I plan to use for my thesis research. I believe the aircraft domain is a perfect choice

for testing CIRCA because it requires strict real-time response guarantees to maintain safety

(i.e., not crashing). Indefinite safety can never be achieved so long as the aircraft is aloft,

and considering the complete set of possible aircraft states is not feasible. I propose that

CIRCA will need to build and store multiple plans to capably handle all aspects of a flight.

Also, to allow response to all possible anomalies, CIRCA will need to be able to

replan dynamically within time limits imposed by the reachable set of aircraft states.

In Section 5.1, I describe the aircraft control problem in terms of existing Flight

Management Systems (FMS), tasks that still must be performed by the human cockpit

crew, and then present arguments for full cockpit automation. Understanding current FMS

capabilities and limitations is particularly important because I propose to use CIRCA

basically on top of existing FMS (without the “flight planner” module, of course). CIRCA

will perform many of the functions currently handled by pilots, and it will minimize

duplication of tasks already performed adequately by the FMS. I describe my current

rather primitive CIRCA aircraft model and simulation tests performed to date (Section 5.2),

followed by the (slightly less primitive) model and subset of possible emergency situations

I will consider during CIRCA testing (Section 5.3). Finally, in Section 5.4, I address

post-dissertation work that will still need to be tackled before safe, fully-automated aircraft

flight is possible.

5.1 Background

In this section, I describe common practices and available technology for commercial

aircraft flight. I begin with a discussion of capabilities and limitations of modern Flight

Management Systems (FMS). Next, to illustrate the broad range of functionality that

would be required of a fully-automated aircraft system, I describe the role of the human

cockpit crew. Finally, I motivate my push for fully-automated aircraft by describing the

most prevalent cause of aviation accidents today: pilot error.

5.1.1 Current Aircraft Control Technology

Today's most advanced commercial aircraft are capable of fully-automated flight from

takeoff roll through full-stop landing provided the original flight plan is not significantly

altered and no anomalous situations arise. In this section, I describe the capabilities and

limitations of state-of-the-art FMS. As described in [17], current FMS have two basic

components: the Flight Management Computer (FMC) and the Control and Display Unit

(CDU). The FMC is responsible for all aircraft computational and control tasks, while the

CDU serves as the main interface between cockpit crew and FMS. Typically, to increase

reliability, each aircraft will contain two independent copies of the entire FMS system, one

near the pilot and one near the co-pilot.

In this section, I focus on FMS tasks that are applicable to a fully-automated aircraft, since

no pilot interface would be required. For more details of FMS tasks related to user

interfacing, see [17]. Several basic functions are performed by the FMC: Flight planning,

Navigation, Performance Optimization, Performance Prediction, and Guidance. Figure 5-1

shows the computation modules of the FMS and how they are connected. I briefly

describe each below; more details are provided in [17] and [31].

[Figure 5-1 block diagram. Blocks: Flight Planning, Navigation, Performance Optimization, Performance Prediction, Guidance, Control, Nav Radio Tuning. External sources: Pilot, ATC, aircraft data, sensor data. Labeled signals: plan, descent profile, r(t), u(t), attitude and thrust sensor data, x, x-dot, x reference, wind.]

Figure 5-1. Flight Management Computer Tasks.

Flight Planning

Current FMS have the capability to follow flight plans in the format of waypoints,

altitudes, and takeoff and arrival procedures. The flight plan may be entered by the pilot,

uplinked from a ground station, or recalled from a preset database of flight plans. Pilot-

entered or uplinked weather data is used by the FMS to compute enroute speeds, fuel

consumption, and arrival times for the flight.

A fully-automated aircraft would need to always be able to build its own flight plan, never

assuming assistance would be available from a human pilot or ground station. Current

FMS rely on a large database of preset flight plans, but since the database includes flight

plans between major airports around the world, there are few different plans for any

particular departure/destination airport pair. As airspace becomes more crowded and

corridors are not so clearly defined (e.g., “free flight” using GPS [34]), it may become

prohibitive to store all possible flight plans in a preexisting database. Instead, it may

become a better policy to build a set of flight plans (primary and backup) using a more

general knowledge base, based on the specific departure and destination airports for the

upcoming flight.

Navigation

Navigation involves determining current aircraft state based on sensor input and output

from a variety of computational modules. Navigation module output includes aircraft

position, velocity, and wind parameters, and is used by several FMS modules (see Figure

5-1) to keep track of how well the aircraft is following its flight plan. A navigation module

does not fully replace the “state estimator” present in feedback controllers, because aircraft

controller state must include additional state values such as roll, pitch, and yaw angles and

rates.

Performance Optimization

This function of the FMS computes aircraft performance parameters that are subsequently

used by the guidance module (see below), such as altitude, airspeed, fuel, and thrust. The

current flight plan, aircraft configuration parameters (e.g., gross weight), and current state

(from the navigation module) are used during these computations.

Performance Prediction

This function performs a faster-than-real-time simulation of the flight using the current

flight plan to predict future attributes of the flight, including arrival time, fuel consumption,

etc. This simulation continues throughout the flight, and if the system predictions violate

constraints (e.g., not enough fuel to follow the flight plan as-is), a warning message will

be displayed for the pilot, who then is fully responsible for reacting to this warning.
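
The constraint check performed by this function can be illustrated with a greatly simplified Python sketch. It assumes the flight plan is reduced to a list of legs with a fixed duration and fuel burn rate, which is a stand-in for the far more detailed simulation an FMS performs. All names, units, and numbers are hypothetical.

```python
def predict_fuel_remaining(fuel_kg, legs):
    """Step through the remaining legs and return the predicted fuel at arrival.

    legs: list of (duration_hours, burn_rate_kg_per_hour) tuples.
    """
    remaining = fuel_kg
    for duration_h, burn_kg_per_h in legs:
        remaining -= duration_h * burn_kg_per_h
    return remaining


def fuel_warning(fuel_kg, legs, reserve_kg=0.0):
    """True if the prediction violates the fuel constraint for this plan."""
    return predict_fuel_remaining(fuel_kg, legs) < reserve_kg


# Illustrative numbers only.
legs = [(1.0, 300.0), (2.0, 250.0)]
arrival_fuel = predict_fuel_remaining(1000.0, legs)
```

In a fully-automated aircraft, a true result from such a check would need to trigger replanning within a guaranteed response time rather than a message to a pilot.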

Subsidiary functions are also provided in this module, primarily to provide information to

the pilot or support the FMS flight performance optimizations described above. These

subsidiary functions are time-consuming [17], and are strictly done on a best-effort basis as

background processes. Quantities computed from these background processes include

predictions of nearest alternate airports, descent path generation (to determine the inflight

location to begin the initial descent from cruise), etc. In a fully-automated aircraft, these

functions would need real-time response guarantees, because no pilot could be relied upon

as a backup for critical decisions (e.g., selecting and entering a course for an alternate

airport).

Guidance

Using the flight plan and the more detailed descent path altitude reference trajectory, the

guidance module is responsible for generating the continuous, time-dependent reference

trajectory in terms of low-level aircraft state. In an aircraft, linear position and velocity are

tightly coupled to aircraft attitude, thrust, and airspeed. In fact, the FMS controls only

these attributes to achieve the desired linear position and velocity. By using the input flight

plan, an approximate dynamic model of the aircraft, and current linear position error

estimates, the guidance module computes the desired roll, pitch, airspeed, and thrust to be

achieved. These values are then sent to the low-level controllers as reference inputs.

Controllers

As discussed above, the low-level controllers will receive reference commands for roll,

pitch, airspeed, and thrust. These controllers then use aircraft-specific feedback control

laws for achieving these commands. Since aircraft dynamics are highly nonlinear, these

controllers are difficult to specify for wide ranges of reference inputs. Techniques such as

gain scheduling [16] allow local linearization of the system, which facilitates the

computation of controller parameters.
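
The idea behind gain scheduling can be sketched in Python as below. The sketch assumes a single proportional gain tuned at a few operating points and linearly interpolated between them; the schedule values, the choice of airspeed as the scheduling variable, and the function names are all hypothetical and are not taken from [16] or from any FMS.

```python
from bisect import bisect_right

# Hypothetical schedule: (airspeed in knots, proportional gain tuned there).
GAIN_SCHEDULE = [(100.0, 2.0), (200.0, 1.2), (300.0, 0.8)]


def scheduled_gain(airspeed, schedule=GAIN_SCHEDULE):
    """Linearly interpolate the gain between neighboring operating points.

    Outside the scheduled range the nearest endpoint gain is held constant.
    """
    speeds = [speed for speed, _ in schedule]
    if airspeed <= speeds[0]:
        return schedule[0][1]
    if airspeed >= speeds[-1]:
        return schedule[-1][1]
    i = bisect_right(speeds, airspeed)
    (s0, k0), (s1, k1) = schedule[i - 1], schedule[i]
    return k0 + (k1 - k0) * (airspeed - s0) / (s1 - s0)


def pitch_command(reference, measured, airspeed):
    """Proportional control law whose gain depends on the operating point."""
    return scheduled_gain(airspeed) * (reference - measured)
```

Each schedule entry corresponds to a controller designed for a local linearization of the aircraft dynamics; interpolation is one common way to move between those local designs.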

Due to the nonlinear and tightly-coupled nature of the reference command attributes, only

certain combinations of state (r(t) = {roll, pitch, airspeed, thrust}) may be successfully

achieved. I have not yet encountered a careful description of how limitations on these

regions of “controllable” state space are propagated all the way from controller to flight

planner. I hypothesize that FMS designers have worked around this problem by storing

only preset flight plans that have been shown to behave acceptably during “near-normal”

flight conditions. However, I also hypothesize that the FMS will fail when “abnormal”

conditions (e.g., severe wind shear, actuator loss) result in guidance and/or controller

reference states that are not achievable.

5.1.2 The Role of the Human Cockpit Crew

The flight crew's primary tasks are to monitor instruments and aircraft performance,

communicate with ATC (Air Traffic Control), and make appropriate route changes based

on situations such as ATC directives, instrument indications (including failures), weather,

and other air traffic. In the most modern aircraft, control automation has progressed to the

extent that there are only two cockpit crew members, the pilot and the co-pilot. One person

(the pilot) supervises all flight operations, while the other (the co-pilot) typically handles

any manual flying, navigation, and communication with ATC. In this section, I describe

the tasks typically performed by each member of a two-person cockpit crew, then briefly

discuss procedure changes associated with handling emergencies.

Pilot

Prior to leaving the gate, an initial flight plan from departure point to destination is

approved by the pilot and transmitted to ATC. The FMS flight planner calculates and

displays an initial plan for a standard flight from one airport to another. However, this

program has limited capabilities (as discussed above) with respect to automatically

responding to changes in aircraft performance capabilities, unusual sensor readings (e.g.,

collision-course traffic), or even ATC commands. The pilot or co-pilot must manually

enter course changes whenever a situation warrants a major modification to the original

flight plan.

The pilot's major responsibility during all phases of flight is supervising cockpit functions

as well as "taking control" of the aircraft whenever he/she thinks it is necessary. The

theory is that if the pilot is freed from the time-consuming tasks of route calculation, aircraft

control, and communication, he/she will be better able to perform critical monitoring tasks,

thus discovering and reacting to problems as quickly as possible. This allows the pilot to

"get ahead of the airplane" -- to develop plans for potential conflicts or diversions from the

original flight plan based on dynamic changes in route and/or aircraft conditions.

Since the pilot assumes primarily management responsibilities, the co-pilot typically

performs the manual flight functions unless the pilot has taken control (e.g., during

emergencies). From takeoff roll to landing touchdown, the pilot constantly monitors the

flight instruments (e.g., altimeter, airspeed indicator, heading indicator, etc.), looking for

any anomalies in actuator inputs and responses. Additionally, the pilot compares the

aircraft flight behavior (determined visually and from instruments) with the

expected/planned behavior to make sure the cockpit crew has correctly entered the desired

commands to the FMS and/or manual controls.

The pilot is ultimately responsible for all important decisions made during flight, including

aborted takeoffs, missed approach (go-around) calls, and any emergency handling

procedures. Aborted takeoffs may occur so long as sufficient runway remains, when

problems such as a failed engine or critical instrument warning indicate that the plane

should not be flying. A missed approach occurs in many situations, including those in

which there is an obstacle on the runway, landing equipment such as the gear malfunctions, or

inclement weather prohibits landing. Pilots spend many hours explicitly training to handle

emergency situations when or if they arise, because experience is considered invaluable for

making the right decision during a high-pressure emergency requiring quick response.

Because of its importance, both pilot and co-pilot must perform collision avoidance tasks,

using data from ATC, automatic TCAS (Traffic Alert and Collision Avoidance System) warnings,

and visual identification of nearby aircraft and/or terrain. This task is particularly important

on approach to landing since the traffic is frequently close together and the airplane is not

too far above the ground. The pilot has the ultimate responsibility to initiate any course

changes required to avoid collisions, although he/she is often assisted by the co-pilot.

Finally, the pilot is responsible for interacting with the rest of the people in the airplane.

He/she supervises the cabin crew, advising them of times to prepare for takeoff or landing,

and any emergency procedures that may be required. The pilot also has the job of

informing and calming the passengers by telling them of various situations, landmarks, etc.

Perhaps the main difficulty with a fully-automated cockpit would be the job of calming the

passengers, especially those who fear computer systems.

Co-Pilot

Perhaps the most common view of a co-pilot is based on his/her role as a “backup system”

for the pilot. I, instead, view both the pilot and co-pilot as people to take over the airplane

if the situation is not handled by the flight computers. The primary responsibilities of the

co-pilot include communicating with ATC and manual flying of the aircraft, freeing the

pilot to adequately perform the supervisory tasks discussed above.

The co-pilot generally handles all communication with ATC. Communication begins while

the plane is still at the gate. The flight plan (trajectory) is transmitted to clearance delivery,

who transmits this plan to ATC computers and alters the plan if necessary. ATC then

automatically clears a corridor of airspace for the aircraft for its entire flight, significantly

reducing the chance of mid-air collision. When the plane is ready to leave the gate, ground

control is called, and the plane is guided along taxiways to the runway. Before takeoff, the

co-pilot calls the tower and receives a clearance for takeoff as well as instructions for

climbing to join the filed flight plan. After takeoff, the co-pilot switches to appropriate

ATC enroute control centers, maintaining constant communications with ATC. On

approach to landing, the co-pilot communicates with the destination airport tower until

landing, then ground control until reaching the destination gate. Any of the enroute ATC

centers may change the aircraft course, usually to avoid bad weather or traffic. The co-pilot

enters course changes into the flight management computer, which then updates the flight

parameters such as expected fuel usage.

The co-pilot is also responsible for calling out any warning lights or instrument anomalies

as they occur, as well as calling out checklist items. This provides a backup to both pilot

and flight management system should the co-pilot notice a problem first. The co-pilot

normally performs any manual aircraft flight or FMS setting required. Again, this allows

the pilot to spend more time supervising the actions instead of becoming involved with

actually performing the tasks.

Emergency Handling

If any anomaly during flight requires significant manual flight control, the workload of

both pilot and co-pilot will increase dramatically. Decisions must be made quickly (e.g.,

where to land given no engines or severe icing), and the aircraft may rapidly transition

between flight configurations (e.g., violently maneuvering to avoid traffic). The pilot is

responsible for making the final decisions regarding emergency handling, but the co-pilot

will often offer advice. The pilot may choose to fly the plane manually during

emergencies, with the co-pilot constantly reporting aircraft status to ATC, and working to

assist the pilot wherever possible.

5.1.3 Why Fully Automate?

There is one main reason to remove pilots from the cockpits of commercial aircraft: pilot

error. NTSB (National Transportation Safety Board) accident report statistics [23] show

that pilot error is at least a contributing factor in the vast majority of aviation accidents in the

United States. Pilot error is caused by a number of factors, including inadequate cockpit

communication, lack of training, pilot’s inability to make decisions quickly under pressure,

or work overload during critical phases of flight. These factors will be magnified during an

actual inflight emergency because pressure and workload increase dramatically.

Today’s complex aircraft introduce new difficulties in combining a capable flight

management system with a human cockpit crew. Two contributing factors to pilot error

result: lack of FMS understanding and decrease in pilot proficiency. Several major

aviation accidents have been caused by the pilot’s lack of understanding of the FMS.

Human factors researchers continue work to improve FMS user interfaces, but this is a

difficult task because pilots are rarely computer or control experts. Also, in modern

commercial aircraft, the only time the pilot must manually fly the aircraft is when the FMS

is incapable of safely controlling the plane. Typically, these situations will be the most

difficult to control for the human pilot also, and since the pilot does not get as much

practice as he/she did before modern FMS existed, he/she will likely not be able to respond

as quickly as if he/she were in constant manual control of the aircraft. To address this

problem, today’s commercial pilots train extensively in simulators and occasionally turn off

the FMS during flight. However, in order for today’s pilots to accumulate as much manual

flying experience as pilots in older aircraft, the FMS would always be turned off, in which

case one might debate the utility of having such a fancy FMS at all.

Given the problems with pilot error and difficulty of maintaining pilot proficiency in

today’s aircraft, I conclude the obvious: take the pilots out of the cockpit. Human pilot

error will certainly be eliminated; however, the replacement system must be designed so

that we don’t simply transfer the pilot error to the FMS. I propose that, by using a

“perfected” version of the CIRCA system, a more complete set of navigation, guidance,

and control modules, and many years of work specifying flight knowledge and testing the

system, an FMS may be developed that will produce far fewer errors than are produced in

human-piloted aircraft today, even in emergency situations.

5.2 Current CIRCA Aircraft Model

The Aerial Combat (ACM) [25] Flight Simulator has been used for all CIRCA aircraft

domain tests to-date. ACM simulates an F-16 aircraft, using a six degree-of-freedom

nonlinear dynamic model to compute aircraft motion parameters given the complement of

actuator inputs. I selected ACM for three reasons: 1) ACM runs on any UNIX

workstation, 2) ACM is free, and 3) its source code is available. I modified ACM to

communicate via a UNIX socket, and have created a knowledge base which allows CIRCA

to guide the aircraft during flight around an airport pattern, as illustrated in Figure 5-2. In

this section, I describe issues associated with the current aircraft model, including the

aircraft knowledge base and the low-level controller used by the CIRCA planner. I also

describe CIRCA’s performance for the small group of tests performed thus far.

[Figure 5-2 diagram: the airport pattern, showing the runway, a navigation aid, fixes FIX0 through FIX4, a compass rose, and the final approach leg.]

Figure 5-2. Aircraft Flight Pattern Flown during CIRCA Testing.

5.2.1 Knowledge Base Description

My initial goal for defining the CIRCA knowledge base was to define a simple set of

discrete-valued features that would allow CIRCA to guide the aircraft around the airport

pattern, and also demonstrate CIRCA’s ability to recognize and react to “anomalous”

situations. The use of a simple set of feature-value pairs illustrates the utility of combining

a low-level controller with CIRCA, because with no controller, CIRCA’s knowledge base

would have required much more feature and value detail (and still wouldn’t have flown the

aircraft even decently without a lot of work). Also, by including a simple set of features

used to model anomalies, I was able to demonstrate how the addition of CIRCA to a simple

flight controller allowed the system to better react to these problems.

Table 5-1 lists the feature types and their values present in the current CIRCA knowledge

base. With only these features, the planner is able to direct the aircraft around the pattern

(from takeoff through full-stop landing), assisted by the low-level controller, of course. In

summary, features for desired altitude (zero = ground level; positive = 5000 ft.) and

heading are used as references by the aircraft controller during flight.15 Because it is

discrete by nature, gear position is directly actuated (and subsequently sensed) by CIRCA

with no intermediate controller. The NAVAID (Navigational Aid) frequency and

Omnibearing Selector (OBS) are also controlled directly by CIRCA, using the preset

discrete values corresponding with the VOR/ILS (for frequency) and the location “corners”

illustrated in Figure 5-2.

Table 5-1. Current Aircraft Knowledge Base Features and Values.

Feature                        Values
Gear Position                  up, down
Altitude                       zero, positive
Heading                        North, South, East, West
Location                       Fix0, Fix1, Fix2, Fix3, Fix4, Fix5, Fix6
Omnibearing Selector           Fix0, Fix1, Fix2, Fix3, Fix4, Fix5, Fix6
NAVAID Frequency               VOR, ILS
Collision-Course Traffic       True, Nil
Swerving (to avoid traffic)    True, Nil
On Course                      True, Nil
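As a minimal sketch (not CIRCA's actual representation, which this proposal does not show), the features of Table 5-1 could be written as a mapping from each feature name to its allowed values. The names FEATURES, state_space_size, and is_valid_state are hypothetical, and Nil is written as Python's None. Taking the cross product of the value sets gives an upper bound on the number of distinct feature-value states the planner could encounter.

```python
from math import prod

# Hypothetical encoding of Table 5-1; Nil is represented as None.
FIXES = [f"Fix{i}" for i in range(7)]  # Fix0 .. Fix6

FEATURES = {
    "gear_position": ["up", "down"],
    "altitude": ["zero", "positive"],
    "heading": ["North", "South", "East", "West"],
    "location": FIXES,
    "omnibearing_selector": FIXES,
    "navaid_frequency": ["VOR", "ILS"],
    "collision_course_traffic": [True, None],
    "swerving": [True, None],
    "on_course": [True, None],
}


def state_space_size(features=FEATURES):
    """Upper bound on the number of distinct feature-value combinations."""
    return prod(len(values) for values in features.values())


def is_valid_state(state, features=FEATURES):
    """True if the state assigns every feature one of its allowed values."""
    return state.keys() == features.keys() and all(
        state[name] in allowed for name, allowed in features.items()
    )
```

Many of these combinations would never be reached in practice (for example, gear down at cruise altitude), so the bound mainly illustrates how small the symbolic model is compared to the continuous aircraft state.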

Two types of emergencies have been simulated in CIRCA: gear failure and collision-

course traffic. The gear failure is modeled simply by the inability to transition the gear to

the “down” position before landing. Collision-course traffic is modeled very simply by a

feature representing the detection of collision-course traffic, and by an action to execute a

standard “swerving” maneuver. Of course, the swerve maneuver will cause the aircraft to

deviate from its planned course, as modeled by the “on course” feature, so a correction

action must be taken to resume course after the offending traffic has passed. These models

are very simple, but they illustrate how CIRCA can be used to plan reactions to key

emergencies that would simply be ignored by a controller blindly following a preset

reference trajectory, as described below in Section 5.2.3.
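The two emergency models can be sketched as transitions over a feature-value state. This is only an illustration under stated assumptions: the function names are hypothetical, states are plain dictionaries with Nil written as None, and the rule that course cannot be resumed while traffic is still present is my reading of "after the offending traffic has passed".

```python
def traffic_detected(state):
    """Event outside the planner's control: collision-course traffic appears."""
    return {**state, "collision_course_traffic": True}


def swerve(state):
    """Action: standard avoidance maneuver, which takes the aircraft off course."""
    if state["collision_course_traffic"] is not True:
        return state  # nothing to avoid
    return {**state, "swerving": True, "on_course": None}


def traffic_passed(state):
    """Event: the offending traffic is no longer on a collision course."""
    return {**state, "collision_course_traffic": None}


def resume_course(state):
    """Action: correct back to the planned course once the traffic has passed."""
    if state["swerving"] is not True or state["collision_course_traffic"] is True:
        return state  # precondition not met
    return {**state, "swerving": None, "on_course": True}


def lower_gear(state, gear_failed=False):
    """Action: lower the gear; a gear failure means the transition never happens."""
    if gear_failed:
        return state
    return {**state, "gear_position": "down"}
```

A controller that only tracks a preset reference has no counterpart to swerve or to the failed lower_gear transition, which is why these anomalies must be handled at the planning level.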

5.2.2 Aircraft Controller

I interfaced the ACM F-16 flight simulator [25] to a set of linear Proportional-Derivative

(P-D) controllers [14] to calculate actuator values that achieve the commanded reference

altitude and heading. Of course, the nonlinear dynamics of the aircraft are not even closely

modeled with my primitive controller set, but so long as the primary actuators function and

the aircraft attitude doesn’t vary significantly from level flight (especially pitch), the current

controllers perform decently.

15 The continuous time reference r(t) is generated from the discrete altitude and heading by a very simple linear function connecting the two endpoints (e.g., “zero” and “positive”, or “North” and “West”).
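A minimal sketch of the two pieces described here: the linear reference r(t) of footnote 15 and a single Proportional-Derivative loop. The gains, the ramp duration, and the names linear_reference and PDController are hypothetical, and heading wraparound (e.g., passing through 360 degrees) is ignored for simplicity.

```python
def linear_reference(t, t_start, t_end, r_start, r_end):
    """Linear ramp from r_start to r_end over [t_start, t_end], held constant outside."""
    if t <= t_start:
        return r_start
    if t >= t_end:
        return r_end
    fraction = (t - t_start) / (t_end - t_start)
    return r_start + fraction * (r_end - r_start)


class PDController:
    """Textbook PD law: u = kp * e + kd * de/dt, with e = reference - measurement."""

    def __init__(self, kp, kd):
        self.kp = kp
        self.kd = kd
        self.prev_error = None

    def update(self, reference, measurement, dt):
        error = reference - measurement
        # No derivative term on the first sample, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.kd * derivative
```

For example, a climb from the "zero" to the "positive" (5000 ft) altitude value commanded over a hypothetical 10-minute ramp would give a reference of 2500 ft at the 5-minute mark, which the altitude loop then tracks.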

CIRCA currently has access to three basic controllers: takeoff/climb, cruise, and final

approach/landing. In addition to the continuous-valued actuators, the controller

automatically controls the aircraft afterburner, flaps, and brakes with discrete commands

based on current aircraft state. So, during the initial takeoff/climb phase, the controller

lowers the flaps 10 degrees, then turns on the afterburner (in addition to 100% normal

throttle) and initiates PD control using the “takeoff” set of controller gains. Then, after the

aircraft achieves a “safe” (1000 ft) altitude AGL (above ground level), the afterburner shuts

down and flaps retract. When nearing the “cruise” altitude, the controller switches to gains

for the cruise flight, allowing the aircraft to follow the specified heading and altitude

commands. Then, for final approach and landing, the controller mode switches to an

“autoland” controller, using the ILS heading and glide slope offsets as feedback for

controlling heading and altitude. When the aircraft comes within five miles of the airport,

the speed brake is deployed, then flaps are extended. After touchdown, the wheel brakes

are automatically set, stopping the aircraft at runway heading.
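The mode switching described above can be sketched as two small functions: one choosing the gain set and one issuing the discrete commands. The 1000 ft and five mile thresholds and the 10 degree takeoff flap setting come from the text; the width of the band that counts as "nearing" cruise altitude, the phase names, and the function names are assumptions made for illustration.

```python
SAFE_ALTITUDE_AGL_FT = 1000.0    # "safe" altitude from the text
SPEED_BRAKE_RANGE_MI = 5.0       # distance from the airport from the text
CRUISE_CAPTURE_BAND_FT = 500.0   # hypothetical: how close counts as "nearing" cruise


def select_gain_set(phase, altitude_agl_ft, cruise_altitude_ft):
    """Choose which set of PD gains the controller should use."""
    if phase == "final_approach":
        return "autoland"
    if phase == "cruise" or altitude_agl_ft >= cruise_altitude_ft - CRUISE_CAPTURE_BAND_FT:
        return "cruise"
    return "takeoff"


def discrete_commands(phase, altitude_agl_ft, distance_to_airport_mi, touched_down=False):
    """Discrete actuator commands issued automatically from the current state."""
    commands = {
        "flaps": "retracted",
        "afterburner": False,
        "speed_brake": False,
        "wheel_brakes": False,
    }
    if phase == "takeoff_climb" and altitude_agl_ft < SAFE_ALTITUDE_AGL_FT:
        # Below the safe altitude: takeoff flaps and afterburner.
        commands["flaps"] = "takeoff_10deg"
        commands["afterburner"] = True
    elif phase == "final_approach":
        if distance_to_airport_mi <= SPEED_BRAKE_RANGE_MI:
            commands["speed_brake"] = True
            commands["flaps"] = "extended"
        commands["wheel_brakes"] = touched_down
    return commands
```

Keeping these rules inside the controller is what lets the planner work with only the coarse altitude and heading features.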

5.2.3 CIRCA Performance during Flight

The CIRCA knowledge base and aircraft controller set were initially debugged and tested

during flight around the pattern with absolutely no anomalies. Once this task was

performed successfully, CIRCA’s ability to fly was further tested with two emergencies:

“gear fails on final approach”, and “collision-course traffic on final approach”. Using these

basic emergency situations, variations of the knowledge base allowed tests of each

algorithm to detect and handle the classes of “unplanned-for” states (as described in Section

3.2.2 and [1]), as well as tests of CIRCA’s model of probability (as described in Section

3.2.1 and [2]). In both emergency situations, CIRCA was able to notice the problem and

react appropriately, replanning for a go-around procedure when gear failed and extending

pattern legs to avoid collision-course traffic.

Recent tests [19] have used the CIRCA aircraft flight knowledge base to illustrate planner-

scheduler negotiations, using extended traffic avoidance maneuvers plus some additional

highly-improbable events (e.g., “flight into a tornado”) to overload the scheduler. Due to

the knowledge base extensions in [19], CIRCA can now avoid traffic via a standard

avoidance maneuver at any position in the pattern. I hope to continue extending CIRCA’s

capabilities in this direction, combining “standard” and “custom” maneuvers when

necessary to help CIRCA better react to inflight anomalies.

5.3 Proposed Aircraft Model and Capabilities

Current FMS are quite capable of flying aircraft in many situations. However, as

discussed in Section 5.1, “flight planning” is inflexible since it can only draw from a

limited database of plans, and the simulation (used in “performance prediction”) operates

strictly at best-effort speed. Both these limitations prevent fully-automated operation, so I

propose that those two modules are “weaknesses” of current FMS that may perform better

if replaced by a system such as CIRCA, as shown in Figure 5-3. In this illustration, the

Guidance, Control, and Navigation modules are independent of CIRCA (i.e., considered

part of CIRCA’s “environment”). The flight planning and performance prediction modules

have been replaced by CIRCA, which will ideally build and execute plans that can output

similar quantities, except with more flexibility and real-time guarantees.

[Figure 5-3 diagram: block diagram of the integrated system. Blocks: Guidance, Control, Navigation, Knowledge Base, Performance Optimization, Nav Radio Tuning, ATC, CIRCA data conversion, CIRCA ABS, CIRCA RTS, and the CIRCA Planner, Scheduler, and Dispatcher subsystems. Labeled signals: r(t), u(t), attitude, thrust, sensor data, state quantities x with their references, wind, actions, features, and controller status.]

Figure 5-3. Integration of CIRCA with a Flight Management System.

I will not be able to take advantage of the modules from an operational FMS, so, for my

thesis research, the functionality of my system will be very simple and approximate

compared to current FMS module functionality. I plan to continue using the ACM

simulator throughout my research, building on the simple PD controller set used for past

tests. Because my research concentrates on the basic algorithms used in CIRCA for

planning and plan execution, my near-term goal with respect to the aircraft model is to add

new features as needed for testing CIRCA algorithm functionality. However, I feel it is

important to model features realistically and work toward an integrated CIRCA-FMS

system, so that my thesis work will have a better chance of being applicable to a future

FMS-like system. I believe that by keeping the CIRCA model small but realistic, I will

have a better chance to reuse parts of the model in later research.

The proposed CIRCA aircraft knowledge base will be built upon the existing model

described above. I now have a very basic knowledge base model of altitude and heading,

as well as gear and simple “locations” for flight around the pattern. In future modeling, I

intend to keep the spirit of the FMS models, describing aircraft state to the planner in terms

of altitude, heading, longitude, and latitude, and leaving all attitude calculations to the very

primitive “guidance” module (built into the controllers currently). For my thesis work, the

planner (and ABS) will specify aircraft trajectory in terms of position and constant

velocities, also leaving acceleration computations to the guidance module (which will “catch

up” with the commanded positions and velocities after periods of acceleration or

deceleration).

Because “flight around the pattern” is very restrictive (too few locations to give the planner

many choices), I propose to extend the “location” model to include normal flight between

airports separated by quite a large distance, so that multiple “legs” of the flight will be

necessary (following paths along a system of “airways”). Then, I plan to enhance both the

knowledge base and controller set so that CIRCA can handle the following anomalous

situations, described below: low fuel, cabin depressurization, complete engine failure, and

rerouting due to bad weather. Since I have not yet incorporated associated features for

these tasks, I cannot yet provide an explicit list of the feature names and values. However,

each of these anomalous situations will not require too many new features (e.g., fuel

quantity, cabin pressure, oxygen tank level, etc.). As a start, I plan to develop the model

for each anomaly independently of the others. However, in the final tests of CIRCA, I

plan to combine anomalous situations to show how CIRCA can continue to function in the

best possible manner even though multiple problems have occurred.

5.3.1 Low Fuel

Typical flight plans are built to ensure plenty of fuel will be available during the flight.

However, if either a system failure occurs (e.g., fuel leak) or the aircraft is significantly

rerouted, the system will need to be able to select a course that will not let the fuel get too

low. Fuel quantity changes between its extreme limits (full/empty) much more slowly than

a quantity such as altitude, for example. A major test of the proposed algorithm to

efficiently attach time stamps to states (Section 3.3.1) will involve combining fuel temporal

transitions with the faster-acting transitions present in the current aircraft knowledge base.

Also, as described in Section 4.4, I wish to test a new algorithm to be used during action

scoring, even with a symbolic set of planner feature values.
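To make the low-fuel constraint concrete, the following sketch filters candidate routes by whether they leave a fuel reserve. It is not the proposed time stamp or action scoring algorithm; it only illustrates the kind of check a selected course must pass. The constant ground speed and burn rate, the reserve requirement, and all names are hypothetical simplifications.

```python
def fuel_required_lb(distance_nm, ground_speed_kt, burn_rate_lb_per_hr):
    """Fuel needed to fly a distance at constant ground speed and burn rate."""
    return distance_nm / ground_speed_kt * burn_rate_lb_per_hr


def feasible_routes(routes, fuel_on_board_lb, reserve_lb, ground_speed_kt, burn_rate_lb_per_hr):
    """Return the names of routes that can be flown while keeping the reserve.

    routes maps a route name to its length in nautical miles.
    """
    feasible = []
    for name, distance_nm in routes.items():
        needed = fuel_required_lb(distance_nm, ground_speed_kt, burn_rate_lb_per_hr)
        if fuel_on_board_lb - needed >= reserve_lb:
            feasible.append(name)
    return feasible
```

A fuel leak would appear here as a higher effective burn rate, and a reroute as a longer distance; either can empty the feasible set and force a diversion.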

5.3.2 Cabin Depressurization16

When an aircraft cabin depressurizes at a high cruise altitude, passengers must breathe from

oxygen stored in tanks. There will be enough oxygen to support the passengers for a

reasonable amount of time, so the “transition” from oxygen tanks “Full” to “Empty” will

certainly occur more slowly than many other modeled transitions. A typical reaction to

depressurization would be to recompute a trajectory to a lower altitude, then divert to a

nearby airport if it will be safe to do so. This feature of the aircraft model will

simultaneously test several CIRCA algorithms, including the time stamp and action scoring

algorithms in Sections 3.3.1 and 4.4, as well as the ability of CIRCA to build and switch to

a contingency plan (e.g., to lower altitude if flight continues until oxygen is nearly empty)

or to dynamically replan (e.g., if state is “safe”, but a new plan is needed to divert to a

nearer airport).

5.3.3 Engine Failure17

Complete engine failure is perhaps one of the most feared emergencies in aviation. Such

failures are rarely expected (or else the plane wouldn’t be flying), and reactions must be

very quick, because an aircraft that has lost all engine power will not be able to maintain altitude, even in best

glide configuration. As discussed in Section 5.1, current FMS continually calculate and

display the set of nearest airports, and may even compute whether the aircraft can stay aloft

long enough to reach that airport. However, the FMS stops there, neither automatically

diverting to the “best” airport nor selecting the best “off-field” site to crash-land.

I believe the engine failure emergency will clearly illustrate the utility of having an available

planning system in conjunction with a set of prebuilt plans. Although I have not completed

development of the CIRCA knowledge base model of engine failure, I would expect the

following scenario: 1) CIRCA will include in all plans a TAP to quickly18 detect engine

failure, 2) If engine failure occurs, CIRCA will quickly switch to a plan that will set up a

best glide configuration and point the aircraft toward the nearest airport, effectively buying

time for the planner, 3) The planner will take the current state data and replan in time to

execute the plan (e.g., before the aircraft has lost so much altitude that it cannot turn

elsewhere). If the aircraft has sufficient altitude to reach the airport, or if the “best” landing

spot is straight ahead, the new plan will be identical to the executing contingency.

However, if terrain or population centers are not uniform, step 3) will allow the system to

select a flight path that will lead to a relatively desirable off-field landing site.19

16 Although I will be simulating an F-16 aircraft, I will assume the aircraft is pressurized for passengers, since that would be the case in commercial aircraft.

17 The simulated F-16 has only one engine, so complete engine failure occurs when one engine fails.
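The reachability test behind the engine-failure scenario (can the aircraft stay aloft long enough to reach a given airport?) can be sketched with a still-air glide model. This is an assumption-heavy illustration: it uses a flat-earth coordinate frame, ignores wind, turns, and terrain, and the glide ratio is a free parameter rather than a value for the simulated F-16. All names are hypothetical.

```python
import math

FT_PER_NM = 6076.12  # feet per nautical mile


def glide_range_nm(altitude_agl_ft, glide_ratio):
    """Still-air glide distance: horizontal distance = altitude * glide ratio."""
    return altitude_agl_ft * glide_ratio / FT_PER_NM


def reachable_airports(position_nm, altitude_agl_ft, glide_ratio, airports):
    """Return names of airports within glide range, nearest first.

    position_nm is (x, y); airports maps a name to its (x, y), all in nautical miles.
    """
    max_range = glide_range_nm(altitude_agl_ft, glide_ratio)
    in_range = []
    for name, (x, y) in airports.items():
        distance = math.hypot(x - position_nm[0], y - position_nm[1])
        if distance <= max_range:
            in_range.append((distance, name))
    return [name for _, name in sorted(in_range)]
```

If the returned list is empty, the contingency plan (best glide toward the nearest airport) cannot end at a runway, and the replanning in step 3) must instead choose an off-field site.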

5.3.4 Rerouting due to Bad Weather

Bad weather can result in flight plan changes ranging from a simple “divert around an

isolated thunderstorm cell” to “destination and/or alternate airport closed due to

ice/snow/fog”. I will certainly be unable to add a complete weather model to the ACM

simulator, but I do plan to simulate each of these two particular weather-based situations

during my research by modifying the ACM software to report isolated thunderstorms and

airport closings. Because it is virtually impossible to predict the exact location of an

isolated thunderstorm, I believe CIRCA’s combination of contingency plan storage and

online planning will be clearly illustrated by this example. Offline, CIRCA will build

reactions (or contingency plans if scheduling is difficult) to turn away from thunderstorm

cells. Then, based on fed-back feature data describing the location and extent of the

thunderstorm cell, CIRCA will dynamically replan to divert around the storm. During a

“normal” diverting procedure, replanning will not be overly time-limited, since the aircraft

is flying away from the storm.

Normal FMS flight plans contain a trajectory to one alternate airport should the destination

airport close [17]. However, if a large weather system results in multiple airport closings,

the FMS will not have planned a route for any other airport. To mimic FMS operation and

enhance chances of safety, CIRCA will build a set of plans to fly to the destination airport

and one nearby alternate (via contingency planning). Then, if both these airports close,

CIRCA will react by automatically entering a holding pattern if necessary (instead of

landing), then dynamically replanning to reach another open airport.

18 In this paragraph, I use “quickly” to describe a task that will be completed in guaranteed real-time.

19 Ideally, the contingency plan set would already contain a complete description of the best airport and off-field landing sites for all points along the trajectory. However, for a multi-thousand-mile flight, I hypothesize it will be infeasible to build and store contingency plans that account for the terrain features and population densities at all enroute positions.
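The airport-closing logic can be summarized in a few lines: use the prebuilt plan for the destination, fall back to the prebuilt contingency for the alternate, and otherwise hold while replanning. In this sketch the replanning step is reduced to picking the nearest open airport, which is a placeholder for the planner's actual deliberation; the function and label names are hypothetical.

```python
def choose_landing_plan(destination, alternate, open_airports, distances_nm):
    """Pick a plan given which airports are open.

    distances_nm maps airport names to distance from the current position.
    Returns (plan_kind, target_airport); the target is None if nothing is open.
    """
    if destination in open_airports:
        return ("prebuilt_plan", destination)
    if alternate in open_airports:
        return ("prebuilt_contingency", alternate)
    # Both prebuilt targets are closed: hold, then replan toward an open airport.
    candidates = [name for name in open_airports if name in distances_nm]
    target = min(candidates, key=distances_nm.get) if candidates else None
    return ("hold_then_replan", target)
```

The first two branches correspond to plans built offline, so only the last branch requires online planning time, and the holding pattern keeps the aircraft safe while that time is spent.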

5.4 Future Work -- Flying a “Real” Airplane Safely

After the research outlined in this proposal has been completed, there are still numerous

technical issues that will need to be addressed before safe, fully-automated flight is

possible. In this section, I describe methods by which the aircraft models used in a

CIRCA-like system may be augmented, leading to better reactions and thus a safer fully-

automated system.

5.4.1 Building a Comprehensive Aircraft Knowledge Base

CIRCA will be using a very limited knowledge base during tests. To generate even

near-optimal flight plans, the knowledge base must contain information to help the planner select

efficient and safe path at all points during a flight. This requires the aircraft to avoid

“obstacles”, either airborne or ground-based. Avoiding airborne obstacles requires the

flight planner to consider airspace restrictions (e.g., military operation areas) and air traffic

control instructions. To avoid ground-based obstacles such as mountain peaks or radio

antennas, the planner must employ geographical knowledge, including terrain elevation and

type (e.g., desert), population densities, and even “tall” building locations. “Geographical”

knowledge combined with knowledge of airport facilities will also help the planner select

the best landing sites should the aircraft need to land somewhere other than the destination

airport.

In this proposal I make claims that a CIRCA-like flight planning system will help a fully-

automated system respond accurately and quickly to inflight anomalies that may lead to

emergencies. Pilots spend a significant amount of time studying NTSB accident reports so

that, if they ever encounter a similar emergency, they may use this information to help

them react optimally and quickly. NTSB reports typically contain a description of the

situation in which the accident (or incident) occurred, the contributing factors (causes), and

actions that might have avoided the accident. I believe incorporating the full set of

situations and appropriate reactions proposed in the NTSB accident reports will be the key

to making a fully-automated aircraft “safe”, particularly with respect to responding quickly

and accurately to potentially dangerous emergency situations.

5.4.2 Building the Control System

Certainly, research groups in companies that design current FMS will have a much better

set of controllers and state estimators than I could ever hope to build independently. If a

CIRCA-like system is to ever be used on a real aircraft, researchers will need to work with

a major FMS designer to gain access to their technology.

Eventually, the aircraft control system may be composed of nonlinear and/or linear

feedback controllers which are automatically invoked by methods such as the current gain

scheduling [16] and its variants (e.g., [26]) or methods like the neural-network-based

approach described in [29]. With an advanced set of such controllers, the control system

itself will be able to detect and correct for low-level sensor or actuator anomalies.

However, such a system will try its best to follow the specified reference inputs, so the

guidance and higher-level systems must always be aware of the controller’s capabilities

based on the current system state (e.g., if the engines are out, the reference altitude rate of

change must not exceed that imposed by the “best glide” limit).
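The "best glide" example can be written as a simple limit on the reference passed to the controller. This is a sketch of the constraint only, assuming rates in feet per minute with climb positive; the function name and the idea of passing the best-glide sink rate as a parameter are illustrative assumptions.

```python
def limit_altitude_rate(commanded_rate_fpm, engines_out, best_glide_sink_fpm):
    """Clamp the reference altitude rate (ft/min, positive = climb).

    With the engines out, the aircraft must descend at least as fast as the
    best-glide sink rate, so the reference rate may not exceed its negative.
    """
    if not engines_out:
        return commanded_rate_fpm
    return min(commanded_rate_fpm, -abs(best_glide_sink_fpm))
```

Applying such a limit at the guidance level keeps the controller from being asked to track a reference it cannot physically achieve.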

One of the advantages of state estimation (instead of using direct sensor values) is the

ability to maintain an accurate measure of system state even if some sensors fail or become

noisy. By using a redundant, comprehensive set of sensors to measure system state

(including system diagnostic measurements), the state estimator will be able to provide

accurate values of aircraft parameters, or if not, will be able to detect faulty estimates and

react with some combination of controller parameter changes and the transmission of faulty

state parameters to the higher-level planning/plan execution system.
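One simple way to illustrate redundancy-based fault detection is median voting over several sensors measuring the same quantity. This is a swapped-in textbook technique, not the state estimator envisioned above, which would typically be model-based; the tolerance and the names are hypothetical.

```python
from statistics import median


def vote(readings, tolerance):
    """Median-vote redundant readings of one quantity.

    Returns (estimate, faulty_indices, trusted). A reading is flagged faulty if
    it differs from the median by more than tolerance; the estimate is trusted
    only if a strict majority of the sensors agree.
    """
    center = median(readings)
    faulty = [i for i, value in enumerate(readings) if abs(value - center) > tolerance]
    healthy = [value for i, value in enumerate(readings) if i not in faulty]
    trusted = len(healthy) > len(readings) / 2
    estimate = sum(healthy) / len(healthy) if healthy else center
    return estimate, faulty, trusted
```

When trusted is false, the estimate itself should be reported upward as unreliable so that the planning level can react, rather than being used silently by the controller.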

5.4.3 Incorporating the System in “Real” Aircraft

Flying is a difficult endeavor because of both system complexity and the potentially

catastrophic consequences of reacting too slowly or incorrectly. Before pilots can be taken

out of commercial aircraft, extensive tests will be required. The key capability introduced

by a CIRCA-like system is the ability for the system to detect and respond “appropriately”

to anomalous situations, both small problems and major emergencies. Because it would be

infeasible to prove that the fully-automated system would react properly in absolutely all

situations, extensive testing is perhaps the only way to gradually gain trust in the system.

I would imagine three main phases to system testing. First, the “fully-automated” FMS

would be connected to a simulator that had the ability to realistically simulate a large group

of emergency situations. Next, the full-automation capabilities would be assessed with

respect to pilot capabilities, running the full-automation capabilities in parallel with the

standard FMS routines (augmented by pilot commands). These new FMS computations

will not interfere with the standard FMS, so if they do not operate correctly, the flight will

not be compromised in any way. By comparing the automated and pilot-commanded

responses, the fully-automated system may be better debugged. Finally, the

“fully-automated FMS” may be put into service, but a pilot will still have the ability to revert to

manual control of the aircraft.20 If/when the fully-automated FMS has demonstrated the

capability to reliably operate without pilot intervention and has gained the trust of the FAA,

it will be time to think about taking the human pilot out of the cockpit.

20 Ideally, this system would make commercial aviation more safe, because one has two separate “systems”, FMS and pilot, that can handle both regular and anomalous-situation flight. Of course, pilots will need to be trained to understand the operation of the fully-automated FMS, and many user interface issues will arise. Otherwise, the new system could compromise safety, not improve it.

=====================================================

CHAPTER 6

SUMMARY

=====================================================

I have proposed research to develop a system that can simultaneously consider issues from

the AI planning, real-time, and control systems fields, focusing on the problem of

achieving safe, fully-automated control of a traditionally piloted vehicle. Incorporating

real-time constraints into such a system necessitates the careful consideration of time during

planning, predictable execution characteristics for all system processes, and explicit

scheduling of critical actions and control loops to guarantee meeting deadlines. Interfacing

a planner and controller requires that the planner contain knowledge describing controller

capabilities and limitations, and that a common language exist for efficient communication

between the two systems.

To address these problems, I will work in the context of CIRCA, the Cooperative

Intelligent Real-time Control Architecture, which was explicitly designed to address issues

involved with planning for an environment requiring real-time response guarantees.

Originally, CIRCA combined a planner, scheduler, and real-time plan executor such that it

could build and schedule plans that were guaranteed to meet critical response deadlines. In

previous work, CIRCA always assumed it could build each plan to maintain safety

indefinitely, allowing the planner to deliberate as long as it needed. This is an unrealistic

assumption in many domains, so I propose augmentations to CIRCA which will allow it to

limit planning deliberation time while achieving the best quality plans possible. The new

version of CIRCA will include planning, scheduling, and real-time plan execution

subsystems as before, but also will include new Dispatching and Abstraction modules.

The Dispatching Subsystem will allow the planner to build and store plans offline, helping

CIRCA achieve faster response when a new plan is required. The Abstraction Subsystem

will contain the functions required to translate between CIRCA commands and the language

of the controllers and state estimators used for each domain.

Since it is unrealistic to assume complete and correct knowledge, I have augmented CIRCA

to detect and react to important unplanned-for situations that may arise, including deadend,

removed (low-probability), and imminent-failure states. To date, CIRCA has relied on

“coincidental” real-time planner response to these states, but this is not adequate for time-

critical domains. In this proposal, I have described a method to allow predictably fast

responses to important subclasses of unplanned-for states using prebuilt reaction plans

stored in CIRCA’s new RTS plan cache or Dispatcher Subsystem. For other, less time-

critical unplanned-for states, I have proposed online CIRCA planning using algorithms to

limit planner deliberation time.
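
The tiered response described above can be illustrated with a minimal Python sketch. It assumes that unplanned-for states have already been classified into named classes, and that each tier can be modeled as a simple lookup. The class name `ResponseSelector`, its fields, and the plan names are hypothetical and are not part of CIRCA.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class ResponseSelector:
    """Hypothetical three-tier lookup for an unplanned-for state class."""
    # Tier 1: reaction plans already resident in the RTS plan cache.
    rts_cache: Dict[str, str] = field(default_factory=dict)
    # Tier 2: plans built offline and held by the Dispatcher.
    dispatcher: Dict[str, str] = field(default_factory=dict)
    # Tier 3: online planner taking (state_class, time_limit_s) -> plan name.
    plan_online: Optional[Callable[[str, float], str]] = None

    def respond(self, state_class: str, time_limit_s: float) -> Tuple[str, str]:
        """Return (source, plan), trying the fastest source first."""
        if state_class in self.rts_cache:
            return "rts_cache", self.rts_cache[state_class]
        if state_class in self.dispatcher:
            return "dispatcher", self.dispatcher[state_class]
        if self.plan_online is None:
            raise LookupError(f"no response available for {state_class!r}")
        return "online_planner", self.plan_online(state_class, time_limit_s)


# Illustrative contents only; the plan names are invented for this sketch.
selector = ResponseSelector(
    rts_cache={"engine_failure": "glide_to_nearest_runway"},
    dispatcher={"bad_weather": "divert_to_alternate"},
    plan_online=lambda state, limit: f"replanned_for_{state}",
)
print(selector.respond("engine_failure", 0.5))
print(selector.respond("depressurization", 30.0))
```

The ordering of the three checks is the point of the sketch: the most time-critical classes never wait on the planner, and only the less time-critical classes fall through to time-limited online planning.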

As a first step to limiting planner deliberation time, I have built an approximate model of

probability into CIRCA. This model allows the removal of improbable states from

consideration when necessary and directs CIRCA to plan using a best-first search strategy

based on state probability. This preliminary work on the uses of state probability has led

me to the development of a model that explicitly trades off planning speed with accuracy

during planning, allowing a design-to-time approach to limiting deliberation time. Because

the parameters used for the design-to-time calculations will be imprecise, I have proposed

combining this approach with an anytime policy to guarantee that the planner will stop its

deliberation before its deadline passes. The planner will expand state-space in best-first

order based on a flexible utility function which combines state probability, time horizon,

and proximity to failure. In this manner, when interrupted by the anytime monitor, CIRCA

will be confident that the planner has expanded the “best” states it had time to consider.
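
A minimal Python sketch of this interruptible best-first expansion follows. The weighted-sum form of `utility`, its weights, the `min_prob` pruning threshold, and the toy state graph are all assumptions made for illustration; the actual utility function and the design-to-time calculations are not reproduced here.

```python
import heapq
import time
from typing import Callable, Dict, List, Tuple

# Per-state data: (probability, time horizon, distance to failure).
StateInfo = Tuple[float, float, float]


def utility(prob: float, horizon: float, failure_distance: float,
            w_prob: float = 1.0, w_time: float = 1.0, w_fail: float = 1.0) -> float:
    """Assumed utility: likely, near-term, near-failure states score highest."""
    return w_prob * prob + w_time / (1.0 + horizon) + w_fail / (1.0 + failure_distance)


def anytime_best_first(start: str,
                       successors: Dict[str, List[str]],
                       info: Dict[str, StateInfo],
                       deadline: float,
                       min_prob: float = 0.0,
                       clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Expand states in decreasing utility until the frontier empties or time runs out."""
    frontier = [(-utility(*info[start]), start)]  # max-heap via negated utility
    seen = {start}
    expanded: List[str] = []
    while frontier and clock() < deadline:
        _, state = heapq.heappop(frontier)
        expanded.append(state)
        for nxt in successors.get(state, []):
            # Skip states already queued and states judged too improbable.
            if nxt in seen or info[nxt][0] < min_prob:
                continue
            seen.add(nxt)
            heapq.heappush(frontier, (-utility(*info[nxt]), nxt))
    return expanded


# Toy example with invented numbers.
info = {"s0": (1.0, 0.0, 5.0), "a": (0.7, 1.0, 1.0),
        "b": (0.3, 1.0, 4.0), "c": (0.01, 2.0, 1.0)}
successors = {"s0": ["a", "b", "c"]}
print(anytime_best_first("s0", successors, info, deadline=float("inf"), min_prob=0.05))
```

Because the frontier is always ordered by utility, whatever has been expanded when the deadline check fails is the highest-utility portion of the state space reachable in the time available, which is the property the anytime monitor relies on.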

I have proposed an architecture which combines basic planning, real-time, and control

systems methods, and have argued that this is the best approach for achieving fully-

automated control of a complex system. By considering how each of these three fields

addresses a typical problem, I have identified standard inputs and outputs from each type of

system, and constructed a basic interconnected system which includes the modules from

each. I have described how the proposed version of CIRCA maps to this interconnected

system, and then described how the interconnected system may function in the context of

CIRCA. Plan caching, scheduling, and planner deliberation time limiting address the

problem of imposing real-time constraints in planning and plan execution. Control

engineers must typically be very careful about specifying time and resource requirements

for their systems, so, for my thesis research, I assume associated real-time constraints have

been addressed and handled prior to CIRCA execution.

The interface between a planner and a controller has been described in terms of the inputs

and outputs of typical planning and control systems. To connect the two, planned actions

executed by CIRCA’s RTS will include directives that control the reference trajectory input

to the controllers, and feature feedback to CIRCA from the controllers will include values

derived from the state estimators. CIRCA’s Abstraction Subsystem (ABS) will contain

functions to build abstract feature/value pairs from sensor and state estimator data. The

ABS will also contain the functions required to translate the high-level CIRCA trajectory

commands (or actions) in terms of discrete straight-line position and velocity vectors into

the continuous dynamically-feasible reference functions to be used by the controller.
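
The two ABS roles can be sketched in Python as follows. This is only an illustration under stated assumptions: the feature names, the thresholds, and the cubic blending used to smooth a straight-line segment are hypothetical, and a truly dynamically-feasible reference would have to come from the aircraft model, which is not represented here.

```python
from typing import Dict, Sequence, Tuple

Vector = Tuple[float, ...]


def abstract_features(sensors: Dict[str, float]) -> Dict[str, str]:
    """Map continuous estimates to discrete feature/value pairs.

    Feature names and thresholds are hypothetical placeholders.
    """
    return {
        "altitude": "low" if sensors["altitude_ft"] < 1000.0 else "safe",
        "fuel": "low" if sensors["fuel_fraction"] < 0.10 else "ok",
    }


def reference_point(p0: Sequence[float], p1: Sequence[float],
                    duration_s: float, t: float) -> Tuple[Vector, Vector]:
    """Smooth reference (position, velocity) along the straight segment p0 -> p1.

    Uses the cubic blend s = 3u^2 - 2u^3 (an assumed stand-in), so the
    commanded velocity is zero at both ends of the segment.
    """
    u = min(max(t / duration_s, 0.0), 1.0)        # normalized time, clamped to [0, 1]
    s = 3.0 * u ** 2 - 2.0 * u ** 3               # fraction of the segment covered
    ds_dt = (6.0 * u - 6.0 * u ** 2) / duration_s  # time derivative of s
    position = tuple(a + (b - a) * s for a, b in zip(p0, p1))
    velocity = tuple((b - a) * ds_dt for a, b in zip(p0, p1))
    return position, velocity


print(abstract_features({"altitude_ft": 500.0, "fuel_fraction": 0.08}))
print(reference_point((0.0, 0.0, 1000.0), (100.0, 0.0, 1000.0), 10.0, 5.0))
```

The first function runs in the sensor-to-planner direction and the second in the planner-to-controller direction, mirroring the two translations the ABS is meant to provide.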

My long-term research goal is to apply this research in CIRCA to the problem of achieving

safe, fully-automated aircraft flight. For my thesis, I have proposed tests using an aircraft

simulator that will demonstrate how CIRCA can help a fully-automated aircraft achieve its

primary goal of remaining safe (i.e., not crashing), even in the presence of system failures

and environmental anomalies. Because I will not be able to develop a comprehensive

aircraft model during my thesis research, I hope to continue CIRCA and aircraft model

development past my thesis research, eventually implementing the system in a carefully-

monitored “real” aircraft. I have proposed future work necessary to allow the automated

aircraft to incorporate knowledge regarding “unexpected” situations and proper responses

to these situations from the large databases of NTSB accident reports. When this work is

complete, I predict that the fully-automated aircraft will be better “trained” than human

pilots, and thus that its safety will surpass that of a human-

piloted aircraft. At this point, I will be able to argue much more strongly for “taking the

pilots out of the cockpit”, and at this point I may also be very old.

Table 6-1 summarizes the tasks I hope to complete before graduating. The first column

shows a list of tasks to be accomplished, in the order they are to be tackled. As advertised

in the introduction, I will not be promising completion dates, because I have always

underestimated the time required in previous scheduling attempts. Instead, I provide a final

column describing how I will know that each task is sufficiently complete for my thesis.

Table 6-1. Proposed Research Task Summary.

| Task Description | Non-specific Completion Date (0-9) | Task is done when: |
|---|---|---|
| Classify "unhandled" states; detect and react to important classes of them | x | (done) |
| Build initial state probability model | x | (done) |
| Interface CIRCA to ACM flight simulator | x | (done) |
| Build initial CIRCA Abstraction module | x x | Abstraction code split from CIRCA RTS |
| Implement CIRCA Dispatcher/RTS Cache | x | Plans are stored and fetched as dictated by planner |
| Complete/implement planner time bounding algorithm (Section 3.3.2) | x | Planner computes time limit, design-to-time parameters, and imposes anytime limit on planning |
| Test CIRCA’s ability to respond in a timely fashion using the appropriate combination of the RTS cache, Dispatcher, and time-limited Planner | x | Aircraft switches appropriately between plans for Engine Failure (RTS Cache and Replanning if time), Bad Weather (Dispatcher then Replanning), and Depressurization (Replanning only) |
| Develop/implement primitive ABSTRIPS-like subgoaling in planner | x | ABSTRIPS-like code implemented (not a major research item) |
| Test subgoaling with aircraft simulator | x | Desired “waypoint” subgoals are developed for normal flight and anomalies (i.e., when replanning for unhandled states) |
| Complete/implement state timestamp algorithm | x | Algorithm based on that in Section 3.3.1 has been implemented |
| Complete/implement time-based numerical model into action scoring and probability computations (Section 4.4) | x | Action scoring utility implemented & probability model acceptably modified to handle numerical features |
| Test new time stamp, action scoring, and probability algorithms | x | Aircraft uses new algorithms to respond appropriately to low fuel, cabin depressurization, engine failure |
| Perform final tests of “complete” CIRCA in flight simulation | x | CIRCA successfully controls the aircraft for any modeled combination of traffic, gear, fuel, depressurization, engine failure, and bad weather emergencies |
| Write thesis | x x x x | I am called "Dr." |

I feel my thesis research will make its most significant contribution to the “real-time AI”

community. Other researchers have addressed issues associated with time-bounded

planning or time-bounded plan execution, but few have addressed the two simultaneously.

CIRCA already schedules plans to meet the real-time execution deadlines computed during

planning. To improve CIRCA’s ability to react quickly and accurately in complex

domains, I have proposed a combination of online and offline planning, caching critical

responses in advance, and employing an algorithm to compute and impose planner

deliberation time limits. Using either a design-to-time or anytime algorithm to limit

planning, one basic question often arises: “What happens if the planner doesn’t even have

enough time to compute an approximate plan?” I have directly addressed this issue with the

CIRCA plan cache, which will contain plans to handle the states requiring fastest response

times (e.g., members of the “imminent failure” set). In this fashion, CIRCA actively

increases available deliberation time, minimizing the chance that available time will expire

before CIRCA can create at least a minimal plan.

I believe it is always important to simultaneously consider the theoretical and practical

implications of system design. I have approached my research from both sides, working to

develop a realistic model of the fully-automated flight problem, and also considering the

more theoretical issues required to achieve both the computational accuracy and efficiency

that will be required for fully-automating any complex dynamic system. By carefully

studying the operation of current flight management systems (designed primarily by control

engineers) while developing “better, faster” planning and plan execution systems, I feel I

will be able to help bridge the gap between control and AI planning researchers, who rarely

collaborate because they don’t seem to understand each other (except in ATL, of course).

=====================================================

CHAPTER 7

REFERENCES

=====================================================

[1] E. M. Atkins, E. H. Durfee, and K. G. Shin, "Detecting and Reacting to Unplanned-

for World States," Proceedings of AAAI Fall Symposium on Plan Execution: Problems

and Issues, pp. 1-7, November 1996.

[2] E. M. Atkins, E. H. Durfee, and K. G. Shin, "Plan Development in CIRCA using

Local Probabilistic Models," Uncertainty in Artificial Intelligence: Proceedings of the

Twelfth Conference, pp. 49-56, August 1996.

[3] C. Boutilier and R. Dearden, “Using Abstractions for Decision-Theoretic Planning

with Time Constraints,” Proceedings of the Twelfth National Conference on Artificial

Intelligence, pp. 1016-1022, 1994.

[4] D. J. Brudnicki and D. B. Kirk, “Trajectory Modeling for Automated En Route Air

Traffic Control (AERA),” Proceedings of the American Control Conference, pp. 3425-

3429, June 1995.

[5] A. R. Cassandra, L. P. Kaelbling, and M. L. Littman, "Acting Optimally in Partially

Observable Stochastic Domains," Proceedings of the Twelfth National Conference on

Artificial Intelligence, 1994.

[6] T. L. Dean, “Decision Theoretic Planning and Markov Decision Processes”, a tutorial

presented at the Summer Institute on Probability and Artificial Intelligence, Corvallis,

Oregon, 1994. (Found at http://www.cs.brown.edu/people/tld/ )

[7] T. L. Dean, L. P. Kaelbling, J. Kirman, and A. Nicholson, “Planning with Deadlines

in Stochastic Domains,” Proceedings of AAAI, pp. 574-579, July 1993.

[8] R. E. Fikes, and N. J. Nilsson, “STRIPS: a new approach to the application of

theorem proving to problem solving,” Artificial Intelligence, vol. 2, no. 3-4, pp. 189-208,

1971.

[9] A. J. Garvey and V. R. Lesser, “Design-to-time real-time scheduling,” IEEE

Transactions on Systems, Man and Cybernetics, vol. 23 no. 6, pp. 1491-1502, 1993.

[10] M. L. Ginsberg, "Universal Planning: An (Almost) Universally Bad Idea," AI

Magazine, vol. 10, no. 4, 1989.

[11] F. F. Ingrand and M. P. Georgeff, "Managing Deliberation and Reasoning in Real-

Time AI Systems," in Proc. Workshop on Innovative Approaches to Planning, Scheduling

and Control, pp. 284-291, November 1990.

[12] E. Horvitz and M. Barry, “Display of Information for Time-Critical Decision

Making,” Proceedings of UAI-95, August 1995.

[13] C. M. Krishna and K. G. Shin, Real-Time Systems, McGraw-Hill, 1996.

[14] B. C. Kuo, Automatic Control Systems, sixth edition, Prentice-Hall, Englewood

Cliffs, New Jersey, 1991.

[15] N. K. Kushmerick, S. Hanks, D. Weld, “An Algorithm for Probabilistic Least-

Commitment Planning,” Proc. of AAAI, pp. 1073-1078, July 1994.

[16] D. A. Lawrence and W. J. Rugh, “Gain Scheduling Dynamic Linear Controllers for a

Nonlinear Plant,” Automatica, vol. 31, no. 3, pp. 381-390, March 1995.

[17] S. Liden, “The Evolution of Flight Management Systems,” Proceedings of the 1994

IEEE/AIAA Thirteenth Digital Avionics Systems Conference, IEEE, pp. 157-169, 1995.

[18] M. L. Littman, T. L. Dean, and L. P. Kaelbling, “On the Complexity of Solving

Markov Decision Problems,” Proceedings of UAI-95, August 1995.

[19] C. B. McVey, “Development of Feedback for Real-Time Scheduling and Planning in

CIRCA,” Directed Study Report, University of Michigan, December 1996.

[20] D. J. Musliner, E.H. Durfee, and K.G. Shin, "World Modeling for the Dynamic

Construction of Real-Time Control Plans", Artificial Intelligence, vol. 74, no. 1, pp. 83-

127, 1995.

[21] D. J. Musliner, “Scheduling Issues Arising from Automated Real-Time System

Design,” University of Maryland Technical Report CS-TR-3364, UMIACS-TR-94-118,

1994.

[22] D. J. Musliner, “CIRCA: The Cooperative Intelligent Real-Time Control

Architecture,” Ph.D. Thesis, The University of Michigan, Ann Arbor, MI, 1993.

[23] NTSB/ARC-94/02, Annual Review of Aircraft Accident Data: U.S. Air Carrier

Operations Calendar Year 1992, National Transportation Safety Board, June 1994.

[24] J. R. Quinlan, "Induction of Decision Trees," Machine Learning, vol. 1, pp. 81-106,

1986.

[25] R. Rainey, ACM: The Aerial Combat Simulation for X11. February 1994.

[26] O. R. Reynolds, H. Pachter, and C. H. Houpis, “Full Envelope Flight Control

System Design using Quantitative Feedback Theory,” Journal of Guidance, Control, and

Dynamics, vol. 19, no. 1, pp. 23-29, January-February 1996.

[27] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice-

Hall, Englewood Cliffs, New Jersey, 1995.

[28] E. D. Sacerdoti, “Planning in a Hierarchy of Abstraction Spaces,” Artificial

Intelligence, vol. 5, no. 2, pp. 115-135, 1974.

[29] R. M. Sanner and J. J. E. Slotine, “Function Approximation, 'Neural' Networks,

and Adaptive Nonlinear Control,” Proceedings of the IEEE Conference on Control

Applications, vol. 2, pp. 1225-1232, 1994.

[30] M. J. Schoppers, "Universal Plans for Reactive Robots in Unpredictable

Environments," in Proc. Int'l Joint Conf. on Artificial Intelligence, pp. 1039-1046, 1987.

[31] J. M. Schreur, “B737 Flight Management Computer Flight Plan Trajectory

Computation and Analysis,” Proceedings of the American Control Conference, pp. 3419-

3429, June 1995.

[32] R. A. Slattery, “Terminal Area Trajectory Synthesis for Air Traffic Control

Automation,” Proceedings of the American Control Conference, pp. 1206-1210, June

1995.

[33] J. Tash and S. Russell, “Control Strategies for a Stochastic Planner,” Proceedings of

AAAI, vol. 2, pp. 1079-1085, 1994.

[34] D. Tilden, “GPS and Air Traffic Control: Start with a Clean Sheet of Paper,”

Proceedings of ION GPS, vol. 1, pp. 909-911, 1994.

[35] S. Zilberstein, "Real-Time Robot Deliberation by Compilation and Monitoring of

Anytime Algorithms," AAAI Conference, pp. 799-809, 1994.