Thesis Proposal:
Reasoning About and In Time when Building Plans
for Safe, Fully-Automated Aircraft Flight
Ella M. Atkins
University of Michigan
1101 Beal Ave.
Ann Arbor, MI 48109
Co-advisors:
Edmund H. Durfee and Kang G. Shin
Thesis Committee:
Edmund Durfee, Kang Shin,
Dan Koditschek, Mike Wellman, and N. Harris McClamroch
ABSTRACT
Achieving safe, fully-automated control of a complex system requires fast, accurate
responses to maintain safety while also driving the system toward its objective (goal).
Researchers from the planning, real-time systems, and control systems fields have different
definitions of success. Planning researchers concentrate on high-level goal achievement,
using discrete-valued features to represent knowledge and a variety of search engines to
build action plans based on possible worlds that might be reached. Real-time researchers
consider problems from the resource requirement and scheduling perspective. Control
systems researchers use mathematical models in conjunction with sensor feedback to
determine actuator commands. I feel it is crucial to interface these fields in a way that
highlights the strengths of each. I propose a system that uses an AI planner to build a high-
level plan, which is then explicitly allocated resources by a real-time scheduler. This plan
will dictate a feasible set of state trajectories which are then achieved by a controller. One
big challenge in this endeavor is identifying an appropriate interface language. For my
research, I plan to concentrate on two questions, “How can the planner identify trajectories
that are feasible for the controller(s)?”, and “How can we best consider the real-time issues
associated with planning, scheduling, and executing these plans in a dynamic
environment?”
I plan to conduct all my thesis research in the context of the Cooperative Intelligent Real-
time Control Architecture (CIRCA), which combines a traditional AI planner, scheduler,
and real-time plan execution module to provide guaranteed performance when controlling
complex real-world systems. I propose to extend CIRCA by imposing planning execution
time bounds, and by implementing a plan storage system so that CIRCA can achieve a
near-optimal balance between online (time-constrained but reactive) and offline planning.
I propose to study these research issues by using CIRCA to help achieve safe, fully-
automated aircraft flight. This domain is certainly complex, given its highly nonlinear
dynamic properties and the criticality of real-time response to avoid a crash in the worst
case. During flight, safety involves not only reacting to ordinary circumstances, but also
reacting to a daunting set of anomalies. Today’s Flight Management Systems can already
completely control an aircraft under ordinary circumstances. I have augmented CIRCA so
that it can detect and respond to a variety of anomalies, and have begun testing CIRCA’s
ability to control an aircraft simulator. I propose that careful consideration of planning,
real-time, and control systems interfaces as well as associated temporal issues will move us
closer to making safe, fully-automated flight a reality.
=====================================================
CHAPTER 1
INTRODUCTION
=====================================================
Throughout our recorded existence, one attribute of the world has remained predictable --
the constant passage of time. Changes in time are easy to model and measure, as should be
the case since time has been the basis for centuries of work in mathematics, physics, and
engineering. Today’s real-time systems experts base much of their work on the precept
that deadlines can only be met by carefully allocating the available computational resources
to complete the tasks at hand. Control engineers carefully construct all their models so they
can guarantee a stable and predictable system response using both sensor feedback and
their prediction of how the system state may change as a function of time.
At its inception, AI planning research focused only on modeling discrete changes in high-
level quantities, such as those found in the “blocks world” and “robot planning” STRIPS
examples [8], rather than modeling them as functions of time. Today’s AI researchers have
recognized the importance of accurately handling time during planning, and have responded
via mechanisms such as those that impose limits on planner deliberation time. However,
many planners still rely heavily on the assumption that the world may be modeled by a set
of highly discretized features, and that accurate world models can be constructed from a set
of state transitions that do not explicitly consider either the relative or absolute passage of time.
In such models, if the feature discretizations are “natural” (i.e., the quantity is completely
modeled because it was discrete by nature, as in an aircraft model feature called “gear
status” with values “up” or “down”), then perhaps the planner can get away with not
modeling time explicitly. However, often discretized feature value boundaries are artificial
devices used to promote tractability when modeling or working with continuous quantities
(e.g., fuel quantity in an aircraft), in which case much information is lost.
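The information loss from an artificial discretization can be seen in a very small sketch. The feature name, the two values "low" and "ok", and the 10-gallon threshold below are hypothetical choices made only for illustration; they are not taken from any CIRCA knowledge base.

```python
def discretize_fuel(gallons, low_threshold=10.0):
    """Map a continuous fuel quantity onto a two-valued planner feature.

    The threshold is an assumed, illustrative boundary.
    """
    return "low" if gallons < low_threshold else "ok"


# Two physically very different situations collapse onto the same feature value,
# so a planner that sees only the label cannot tell them apart.
readings = [10.5, 40.0]
labels = [discretize_fuel(g) for g in readings]
```

A planner reasoning only over `labels` cannot distinguish an aircraft that is minutes from a low-fuel condition from one with ample reserve, which is the kind of temporal information I want the planner to retain.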
Researchers have proposed planners using techniques such as Markov Decision Processes
(MDP) [6] to produce states corresponding with constant discrete time steps (∆t) in the real
world. And, in several simplified cases, these models can be shown to have desirable
properties, including computational tractability and the ability to accurately model changes
in discrete state features over time. However, a typical real-world domain model will
produce a very complex MDP [18]. Also, in some cases, the Markov assumption [27]
required for an MDP is difficult to satisfy, in which case the MDP becomes “partially
observable” (a POMDP [5]), and even more difficult to solve [5].
My academic research goal is to improve the capabilities of AI planning systems such that
they may accurately reason about all temporal characteristics associated with their domains,
where the planner’s “domain” actually includes both the plan execution system (e.g.,
computational resources, sensors, actuators, etc.) as well as the physical environment and
properties of the system to be controlled. The Cooperative Intelligent Real-time Control
Architecture (CIRCA) [20], [22] combines a traditional AI planner, scheduler, and real-
time plan execution module to provide guaranteed performance for the control of complex
real-world systems. With sufficient resources and perfect domain knowledge, CIRCA can
build and schedule control plans that, when executed, are assured of responding quickly
enough to any events so that CIRCA remains safe (its primary task) and whenever possible
reaches its goals. I chose CIRCA as a basis for my work because previous researchers
had explicitly designed the system to consider at least certain aspects of time -- not so much
from the “accurate world model” perspective I discuss above, but via explicit plan
scheduling and real-time plan execution modules to provide timeliness guarantees about the
actions to be executed. By improving the current knowledge representation and planning
algorithms, I hope to extend CIRCA so that it can build a sufficient world model within
time constraints, even for a complex dynamic system.
This research goal was developed as a prerequisite to fulfill a more comprehensive goal I
had prior to my arrival at Michigan: to help produce a system capable of safe, fully-
automated aircraft flight. Current FMS incorporate many concepts from the real-time and
control systems fields, but have very limited capabilities with regard to building plans and
reacting appropriately, particularly when anomalies (e.g., failed actuators, environmentally-
induced emergencies) arise. I believe my proposed work toward the complete and accurate
consideration of temporal aspects while planning will be crucial for the full automation of
aircraft, and in turn, I believe the complexity of the aircraft flight domain will help me better
identify aspects that need to be considered during the development of the time-sensitive
planner I propose in this document.
1.1 Problem Statement
When asked what I do for research, I typically respond with “I’m working to take the pilots
out of commercial aircraft”. This answer gets the attention of most everyone, and all are a
bit skeptical about the safety associated with fully automated aircraft, particularly scientists
and engineers. After all, who would want to fly in an aircraft with no pilot? The answer:
no one -- yet! I first came upon the fully-automated aircraft goal while training for my
private pilot license. I read, in shock, that the vast majority of aviation accidents, even in
commercial air carrier operations, are caused at least in part by pilot error. I asked myself
why pilot error would dominate the list of factors causing accidents. The answer contains
many aspects, ranging from pilot physical incapacitation to inadequate coordination among
the cockpit crew to a pilot’s lack of understanding of the aircraft’s systems. The FAA sets
stringent standards, but they cannot screen out all pilots who might possibly commit some
erroneous act.
To date, the technical approach has been to improve cockpit Flight Management Systems
(FMS) [17] to minimize pilot error in tasks which can be easily handled with available
technology. As a start, such systems were built so that a pilot need not worry about
mistakes in mundane tasks such as fuel calculations and holding an altitude during cruise.
As controls and computing technologies have improved, FMS have evolved into today’s
systems that are able to completely operate an aircraft from takeoff through landing. Given
this current capability, why is the pilot still around? In summary, airline pilots are around
to enhance safety and to coordinate with air traffic controllers (a task which is in the
process of being partially automated by others [4], [32]). Current FMS work fine under
normal flight circumstances, but the pilots still must manually reprogram or override the
FMS and fly manually when any of a great number of anomalies occur during flight.
Motivated by my desire to produce a safe FMS that does not require pilot intervention as a
“fallback position”, one of my major research goals is to identify classes of anomalies that
may be present during flight, as well as all classes of actions that may be selected to
adequately handle such anomalies, and to appropriately use all technologies required to
allow an autonomous aircraft to accurately invoke the proper responses for all anomalies as
well as or better than a good pilot. These goals will carry me well beyond the Ph.D.
process, but during my research at Michigan I hope to build a good technical foundation for
carefully building safety-oriented FMS flight plans that will eventually help produce such
an FMS or equivalent applicable for any pilot-operated vehicle, ranging from aircraft to
automobiles to robots exploring deep space or the ocean floor.
To successfully control a fully-automated complex system situated in a dynamic
environment, technical approaches from the fields of artificial intelligence (AI), real-time
systems, and control systems can be combined to form a highly robust and capable system.
As discussed earlier, real-time and control experts have based much of their best work on
the careful consideration of system behavior as a function of time, thus I propose that time
is a basic key to the development of a robust system to build plans of action for any fully-
autonomous system. To attack this problem, I propose that much of my Ph.D. research
focus on the development of the necessary tools to help a planner consider all aspects of
time during deliberation. These temporal aspects include the following: 1) Meta-level
consideration of planning deliberation time, 2) Construction of a world model that
represents how all modeled domain features change in the world without loss of accuracy
due to discretization, and 3) Explicit scheduling of each plan to allow execution timing
guarantees for critical planned actions. Currently, architectures tend to focus on only one
or two of these aspects, but not all three. CIRCA already performs scheduling of critical
actions. I plan to augment CIRCA so it reasons about and effectively uses available
deliberation time, while accurately modeling domain feature changes over time. Of course,
the more accurate the domain model, the more deliberation time required, so appropriate
tradeoffs must be made when a planner strives for both accuracy and quick execution.
When CIRCA’s planner is augmented as described above, it will have the capability to
represent the temporal aspects required for effective deliberation, scheduling, and domain
modeling. However, to test its capabilities fully, I will need to interface to a dynamic
system with sufficient complexity. I wish to model the dynamic system in a feasible way,
such that I can maximize the computational capabilities of the overall system using the
CIRCA planning and scheduling algorithms along with the appropriate control systems
technology. Thus, when developing a CIRCA knowledge base, I must carefully specify
timings for each feature test and action, as was the case for previous versions of CIRCA,
which had a basic philosophy of “correctly” interfacing an AI planner with a real-time
scheduler and plan execution system. Now, to correctly interface the control system with
the planner and real-time system, I need to identify data to be shared between the real-time
system and controllers and also between the AI planner and controllers (e.g., knowledge
base features which allow reasoning about reference trajectory). Although my model of an
aircraft will not become very complex during my Ph.D. research, I will concentrate on
modeling select quantities “correctly” so that the aspects of temporal reasoning and
combination of planning, real-time, and control systems technology are effectively
demonstrated.
1.2 Technical Approach
Typically, specialists from the AI planning, real-time systems, and control systems
research fields have looked at problems from their specific area of expertise, treating
the other fields as “black boxes”. Because specialists in each field do not design universal
interfaces to their systems, researchers in a different field find it very difficult to interface
with their systems. For example, many control engineers assume high-level reference
trajectories are available, but they don’t provide a representation of their system which
might easily allow a planning system to reason about the controller’s capabilities and
limitations in a way to promote generation of exclusively feasible reference trajectories. AI
researchers often build a planner’s knowledge base using discrete features that have little
natural relation to the dynamics or time requirements associated with the real-world system
to be controlled.
For my research, I will consider interfaces among these three fields in terms of safely
controlling a piloted dynamical system, and will demonstrate research results on an aircraft
simulator. I will define what is meant by system “state” in the languages of a planner, real-
time scheduler, and classical controller. Briefly, a planner considers state as a certain set of
features that are true in the world, usually discrete in nature, but possibly using temporal
and probabilistic models. A real-time scheduler models the world as a set of computational
resources and a group of periodic and aperiodic tasks that must be allocated resources to
meet their deadlines. Finally, a typical control system models the world as a continuously
changing state vector, input, and output sets, with states including system quantities such
as continuous-valued position and velocity, inputs including desired (or reference) state
directly measured or estimated using sensor measurements, and outputs including
commands to the system’s actuators.
I will assume that a group of state estimators and controllers exist that can successfully
observe and control the system within certain regions of the controller’s state space, and
that the designer of each controller has specified its capabilities and limitations. My work
begins by abstracting this information to the planner, maintaining the continuous nature of
each controller’s state-space only so much as is necessary for the planner to accurately
develop a new plan which considers the complete set of controller capabilities. I also
assume that each controller and state estimator will consume a known set of computational
resources and that these resources have already been allocated and scheduled. Under these
assumptions, the online scheduler need only be concerned with the resource requirements
associated with the execution of planned actions.
As discussed in Section 1.1, I will be performing my work in the context of CIRCA. In
CIRCA, reasoning about inherently continuous quantities has not been extensively done to-
date, although there is an implemented scheduler and a first-cut notion in the planner of
time-dependent transitions. I believe the key to incorporating these quantities and
addressing associated temporal issues will be found in the CIRCA planning module that
performs “action scoring” -- deciding which action to perform in a certain state. Basically,
the newly selected action must be applicable from that state, meaning that it must result in a
state which falls within the controllable state space region specified for that action’s
controller. This introduces a new challenge that is addressed later in this proposal:
Creating a representation of planner state that will be sufficiently expressive to provide the
values needed for action scoring as well as to maintain an accurate probability model.
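The applicability test used in action scoring can be sketched as a region-membership check. This is only an illustration under assumptions: I represent a controller's controllable region as a box of per-variable bounds, and the names `in_region` and `altitude_hold_region`, the variables, and the numeric limits are all hypothetical. The proposal does not commit to this representation.

```python
def in_region(state, region):
    """True if every bounded variable of the state lies inside its [low, high] interval."""
    return all(low <= state[var] <= high for var, (low, high) in region.items())


# Hypothetical controllable region for an altitude-hold controller.
altitude_hold_region = {"airspeed_kts": (80.0, 160.0), "bank_deg": (-30.0, 30.0)}

# Predicted states that would result from selecting the action in two different situations.
predicted = {"airspeed_kts": 120.0, "bank_deg": 10.0, "altitude_ft": 3000.0}
too_slow = {"airspeed_kts": 50.0, "bank_deg": 0.0, "altitude_ft": 3000.0}

applicable = [in_region(s, altitude_hold_region) for s in (predicted, too_slow)]
```

Under this sketch the action would be scored as applicable only in the first case, since the second predicted state falls outside the region in which the assumed controller can operate.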
A basic goal of this research is to implement a system which simultaneously addresses the
three temporal issues discussed in Section 1.1, including planner deliberation time, world
state changes over time, and scheduling of actions to meet deadlines. By starting with
CIRCA, I have in place a system which reasons about the computational resources during
plan execution -- the CIRCA planner computes action deadlines, then a schedule is built to
guarantee that critical actions happen fast enough. In the original CIRCA, the planner’s
deliberation time was reduced by clever expansion of only states that were reachable from
the initial state, but there were no algorithms implemented to reason about the planner’s
deliberation time. Also, state transition times were modeled using only worst-case constant
limits, but this assumption often resulted in overconservative scheduling, and there was
never any notion of absolute time, which, as I discuss in Section 3, is important for certain
classes of state feature sets. I propose a combination of MDP and the existing fast but
potentially inaccurate CIRCA state expansion algorithms to help achieve a good balance
between planner deliberation time and level of temporal modeling accuracy present during
planning. In this manner, if a large amount of time is available for planning, an MDP-
based state model will be developed. Conversely, if the planner’s deliberation time is
short, the world model will be developed with minimal temporal modeling, as is done in
the current version of CIRCA.
For my thesis research, I will work with fully-automated aircraft only in simulation, since a
safe fully-automated system is far from perfection. The simulator I have been using is
cheap (free), and uses a reasonably complex model of nonlinear aircraft dynamics. Also,
the source code is available, so additions to aircraft capabilities (e.g., to model anomalies)
and building an interface to CIRCA and any low-level controller modules are relatively
easy. Unfortunately, much of the work to completely identify the set of anomalies that may
occur during flight will remain after I graduate, but my goal is to identify a small and
realistic set of anomalies, and model those correctly to test the CIRCA capabilities and
rather primitive control law set I will be using.
1.3 Proposal Outline
The purpose of this document is to summarize my research to-date, and to describe in detail
all research I hope to accomplish for my dissertation. To start, I describe the CIRCA
system and its proposed evolution (Section 2), starting with CIRCA as it existed when I
first came to Michigan, followed by a discussion of how I have augmented CIRCA to-date
and my vision for CIRCA upon completion of my dissertation work. I describe my
rationale for proposed changes and additions to CIRCA only briefly in Section 2, with
further clarification presented as I address more detailed issues regarding my research
goals.
I devote a section of this proposal to each of my basic research goals. In each of these
sections, I first describe relevant background and work completed, and how it may help me
achieve the specific goal at hand. Next, I describe work to be done during my dissertation
research. Since I cannot solve all the world’s problems in a few scant years, I finally
discuss major issues that will still remain when I’m done with my Ph.D. research. In
Section 3, I describe a method by which CIRCA may reason about the crucial aspects of
time while planning, including planning deliberation time, system state changes as a
function of time, and real-time issues associated with plan execution in a dynamic
environment. Section 4 concentrates on the integration of AI planning, real-time
scheduling, and control systems technologies. I include an outline for the methodology by
which important quantities may be transferred between modules, and describe how these
quantities may be used in the context of CIRCA.
Section 5 addresses my goal of helping achieve fully-automated aircraft flight. For my
thesis work, I will use an aircraft simulator to illustrate situations that must be modeled and
handled in CIRCA and to provide an interesting testbed for the implemented CIRCA
algorithms. Although I plan to eventually use my CIRCA research to help achieve
automated flight, I will consider the aircraft primarily as a testbed for my dissertation
research, since I will need many years to develop a model that includes all documented
anomalous situations.
Finally, Section 6 provides a summary of my proposed research, in terms of goals and
methods to achieve these goals, as well as items that have been completed versus still to be
completed. I also present an ordered list of small steps I will take to accomplish my
research goals, although these steps will remain without explicit deadlines since I have
traditionally been very bad at predicting research task resource requirements.
=====================================================
CHAPTER 2
CIRCA
=====================================================
I plan to conduct all thesis research within the context of the Cooperative Intelligent Real-
time Control Architecture (CIRCA). In this chapter, I describe the evolution of CIRCA,
including the system as I inherited it (Section 2.1), CIRCA as it operates today (Section
2.2), and the proposed architecture which will be operational before my Ph.D. work is
complete (Section 2.3). The first version of CIRCA had one basic purpose -- to combine a
traditional AI planner with a separate real-time system such that the planner could develop a
plan, schedule it, then execute it in guaranteed real time. To-date, I have modified CIRCA
to handle a variety of new circumstances, including multiple subgoals, “unhandled” states
(to be described), and probabilistic state transition models. Concurrent work [19] has
allowed parallel execution of the scheduler and planner. For my thesis work, I propose a
future version of CIRCA that will allow a more realistic treatment of temporal issues
associated with limited planning deliberation time and reasoning accurately about the world.
2.1 Background
The CIRCA system [20], [22] was designed to provide guarantees about system
performance even with limited sensing, actuating, and processing power. When
controlling a complex system in a dynamic environment, a real-time plan execution system
may not have sufficient resources to be able to react in all situations. Based on a user-
specified domain knowledge base, CIRCA’s main goal was to build a plan to keep the
system "safe" (i.e., avoid catastrophic failure) while working to achieve its goals if
possible. CIRCA then used its knowledge about system resources to build a schedule that
guaranteed meeting critical deadlines. This schedule was then executed on a separate
real-time processor. Figure 2-1 shows the general architecture of the CIRCA system. The AI
subsystem (AIS) contained the planner and scheduler. The "shell" around all AIS
operations consisted of meta-rules controlling a set of knowledge areas, similar to the PRS
architecture [11]. Working memory contained tasks to be executed, including planning and
downloading plans from the AIS to the real-time subsystem (RTS), which subsequently
executed the scheduled plan.
[Figure 2-1: block diagram. AI Subsystem: Planner, Scheduler, Meta-level Control Knowledge, Knowledge Base (initial state / goals, state transitions). Real-Time Subsystem: TAP Schedule, Environment Interface Functions. Environment: Sensors, Actuators. Labeled links: TAP schedules, handshake, data, control commands.]
Figure 2-1. CIRCA -- Original Version.
The CIRCA domain knowledge base included a set of transitions which modeled how the
world can change over time, specification of the initial (or startup) state, and one goal state.
To minimize domain knowledge complexity, the world model (i.e., reachable set of states)
was created incrementally based on the initial state set and available transitions. Figure 2-2
shows a typical state set expanded during a planning cycle. CIRCA began planning by
selecting one of the initial states and building a list of descendants resulting from state
transitions. This original planner used lookahead search to select actions and backtracked if
the action did not ultimately help achieve a subgoal or avoid catastrophic failure (e.g.,
aircraft crash). Based on the assumption that it is infeasible to either build or schedule
Universal Plans [30] to handle all states (as discussed in [10]), CIRCA minimized planner
memory and time usage by expanding only states produced by transitions from initial states
or their descendants. State expansion terminated whenever all features of the specified goal
state were reached in at least one reachable state while avoiding all failure states.
The original CIRCA knowledge base contained three possible transition types: action,
event, and temporal. All CIRCA transitions had a set of preconditions, discrete feature-
value pairs that must be matched before that transition can occur, and a set of
postconditions, feature-value pairs that will be true after that transition occurs. Action
transitions corresponded with commands that CIRCA may explicitly execute (e.g., put
aircraft landing gear down), while temporal and event transitions corresponded to state
changes not initiated by CIRCA (e.g., collision-course air traffic appears). Each event
transition created a nondeterministic branch in state space -- at any time after the state
becomes true, the event may or may not happen. CIRCA always had to expand both the
branch where the event occurred and that where the event did not occur, since it had no
knowledge about likelihood of that event actually happening before some other transition
occurs. Temporal transitions were similar to the event transitions, except that there was a
nonzero constant minimum delay between the time a state was entered and the time that
transition could occur.
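The transition model and the incremental expansion of reachable states described above can be sketched as a small data structure plus a breadth-first search. This is a minimal illustration, not CIRCA's code: the names `Transition`, `applies`, `fire`, and `expand_reachable`, and the two-feature aircraft example, are assumptions made for this sketch. It omits action selection, backtracking, and failure handling, and only enumerates the states reachable from one initial state.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    name: str
    kind: str               # "action", "event", or "temporal"
    pre: tuple              # (feature, value) pairs that must match the state
    post: tuple             # (feature, value) pairs that hold afterwards
    min_delay: float = 0.0  # nonzero only for temporal transitions


def applies(t, state):
    """A transition can occur only when every precondition matches the state."""
    return all(state.get(f) == v for f, v in t.pre)


def fire(t, state):
    """Return the successor state with the postconditions written in."""
    successor = dict(state)
    successor.update(t.post)
    return successor


def state_key(state):
    """Hashable, order-independent identity for a feature-value state."""
    return tuple(sorted(state.items()))


def expand_reachable(initial, transitions):
    """Enumerate states reachable from the initial state, breadth first."""
    seen = {state_key(initial)}
    queue = deque([initial])
    reachable = [initial]
    while queue:
        state = queue.popleft()
        for t in transitions:
            if applies(t, state):
                successor = fire(t, state)
                if state_key(successor) not in seen:
                    seen.add(state_key(successor))
                    queue.append(successor)
                    reachable.append(successor)
    return reachable


transitions = [
    Transition("lower_gear", "action", (("gear", "up"),), (("gear", "down"),)),
    Transition("traffic_appears", "event",
               (("traffic", "none"),), (("traffic", "collision_course"),)),
]
states = expand_reachable({"gear": "up", "traffic": "none"}, transitions)
```

Because the event may or may not occur, both the state in which traffic has appeared and the state in which it has not are kept and expanded further, which is why this two-feature example yields four reachable states.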
[Figure 2-2: state diagram. Legend: tt = temporal transition; ttf = temporal to failure; ac = action transition; I = Initial States; G = Goal State; D = Deadend States; F = Failure States.]
Figure 2-2. States Expanded during Planning.
Minimum delay until a transition may occur is particularly important when a temporal
transition to failure (TTF) is involved. In this case, CIRCA must schedule an action that
will be guaranteed to execute before that temporal transition has a chance of occurring,
effectively preempting the TTF (thus avoiding the catastrophic situation). CIRCA "plays it
safe" by assuming the action must be guaranteed to occur before the delay has passed. Note
that the notion of absolute preemption of any transition was only possible when an action
could be guaranteed to complete execution before the temporal transition’s minimum delay
passed, so event transitions could never be preempted and there was no notion of “event
transition to failure”.
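The preemption rule can be illustrated with a simple timing check. This is a sketch under a stated assumption, not the CIRCA implementation: I assume the preempting action is polled periodically, so in the worst case the dangerous state is entered just after a poll and the action completes one period plus one worst-case execution time later. The function name `preempts_ttf` and the numbers are hypothetical.

```python
def preempts_ttf(period, wcet, ttf_min_delay):
    """True if the action is guaranteed to finish before the TTF can occur.

    Assumed worst case: the state is entered just after the action's test ran,
    so the action completes at most `period + wcet` after the state is entered.
    """
    return period + wcet < ttf_min_delay


# An action polled every 0.5 s with 0.1 s execution time beats a 1.0 s minimum delay.
ok = preempts_ttf(period=0.5, wcet=0.1, ttf_min_delay=1.0)

# Polling only once per second cannot guarantee preemption of the same TTF.
too_slow = preempts_ttf(period=1.0, wcet=0.1, ttf_min_delay=1.0)
```

An event transition has no minimum delay, so under this check no finite period could guarantee preemption, which matches the absence of an “event transition to failure” in the original system.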
Once CIRCA finished expanding the set of reachable states, a list of planned actions and
states in which to execute each of these actions was compiled. This list was called a control
plan and was represented as a set of test-action pairs (TAPs). Typical tests to determine
system state involved reading sensors and comparing the sensed values with certain preset
thresholds, while actions involved sending actuator commands or transferring data between
CIRCA modules. CIRCA minimized the set of TAP tests using ID3 [24], using a list of all
states in which that TAP action should be executed as positive examples and all other
expanded states as negative examples. When the AIS planner created a TAP, it stored an
associated execution deadline, which was used by a distance-constrained scheduler [21] to
create a periodic TAP schedule that guaranteed system safety when TTFs were present. If the
scheduler was unable to create a schedule to support all deadlines, the AIS backtracked to
the planner, whose only option was to search through the other possible action sets that
could preempt all TTFs until either the scheduler was successful or else all possibilities
were exhausted, in which case the planner failed.
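The plan-schedule-backtrack cycle described above can be sketched as follows. Everything here is hypothetical and simplified: the `TAP` class, the candidate action sets, and in particular the `schedulable` test, which uses a plain processor-utilization bound as a stand-in for the distance-constrained scheduler of [21]. It is meant only to show the control flow between planner and scheduler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TAP:
    name: str
    wcet: float      # worst-case time to run the test and the action
    deadline: float  # maximum allowed separation between executions


def schedulable(taps):
    """Stand-in feasibility test: total utilization must not exceed 1.

    Assumption: each TAP runs once per deadline interval. The real
    distance-constrained scheduler is more involved than this bound.
    """
    return sum(t.wcet / t.deadline for t in taps) <= 1.0


def plan_and_schedule(candidate_plans):
    """Try each candidate TAP set in turn; return the first schedulable one, else None."""
    for taps in candidate_plans:
        if schedulable(taps):
            return taps
    return None  # all possibilities exhausted: the planner fails


candidates = [
    [TAP("climb", 0.6, 1.0), TAP("turn", 0.5, 1.0)],     # utilization above 1: rejected
    [TAP("climb", 0.6, 1.0), TAP("descend", 0.3, 1.0)],  # utilization below 1: accepted
]
chosen = plan_and_schedule(candidates)
```

In this sketch the first candidate is rejected by the scheduler, the planner falls back to the next action set, and a `None` result corresponds to the case in which the planner failed.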
Presuming the planner and scheduler were successful, the AIS downloaded the TAP plan to
the RTS, which acknowledged receipt of the plan, began executing the plan as specified in
the schedule, and notified the AIS when/if the goal state was reached. In the original
CIRCA this message did not have great significance, since the one goal was always the
final system goal to be achieved, and there was little feedback describing what went
wrong if the goal was never reached.
2.2 Present-Day System
Figure 2-3 shows CIRCA as it exists today. Several differences from Figure 2-1 may be
noted, including the splitting of the AIS into separate modules, new feedback from the RTS
to AIS, and the presence of a new module called “Abstraction Subsystem (ABS)”.
Previously, the AIS was structured so that a meta-level set of Knowledge Areas (KAs)
were used to control a potentially complex set of tasks. However, we have seen over the
years that there just aren’t that many KAs (or individual posted tasks) to sort through,
particularly since many tasks must be executed in a specific sequence, even with complex
problem domains. Also, the KA system required a substantial amount of storage overhead,
slowing AIS execution as well as making the CIRCA code unnecessarily confusing for the
new user. This meta-level KA structure has now been removed from the code, and the AIS
has been split into two components: the “Planning Subsystem” and the “Scheduling
Subsystem”, as shown in Figure 2-3. As discussed in [19], the scheduler was split from
the planner so the two could execute in parallel, and the scheduler code was enhanced to
provide helpful numerical feedback to the AIS regarding plan schedulability, as opposed to
the “yes/no” answer given in the past. The planner has also been substantially modified, as
discussed below.
[Figure 2-3 (block diagram). Modules: Knowledge Base (initial state / goals, temporal/action
transitions); Planning Subsystem (Planner, plan downloading); Scheduling Subsystem (Schedule
Manager, scheduling routines); Real-Time Subsystem (Environment Interface, TAP plan executor);
Abstraction Subsystem (Abstractor, De-Abstractor); "Environment" (sensors, actuators, state
estimators, controllers). Labeled data flows: TAP list w/ timings; status or TAP schedule;
TAP plans; detected anomalies / execution status; feature value data; action commands;
sensor & state data; controller & actuator commands.]
Figure 2-3. CIRCA -- Current Version.
During my initial work with CIRCA, I identified several key items that needed
improvement before CIRCA could be expected to successfully control a system in which
domain knowledge was either incomplete or even slightly incorrect. First, in the original
CIRCA, the AIS might as well have developed and scheduled its plan offline, then just
stopped executing, because there was no informative feedback from RTS to AIS that would
help the AIS react should the executing plan need modification. This problem raised an
interesting research question: Given that the RTS only executes the TAP plan explicitly
created by the AIS, how can we make the AIS include TAPs that will detect world states
not sufficiently handled by the executing plan? As I describe in Section 3.2, we first
developed a classification of all possible world states using the planner’s available
feature/value set, then identified subclasses of these states that are important for the RTS to
detect. As shown in Figure 2-3, we have added the appropriate software to first detect
these “anomalous” or “unhandled” states via planned TAPs, then we feed back this
information to the Planning Subsystem (formerly the AIS), which reacts by replanning
based on this state feedback.
In studying the problems that arise due to incomplete or incorrect models, we also decided
that CIRCA needed an accurate representation of probability in its state models. The
original CIRCA had action, event, and temporal transitions to build its model of the world,
but no notion of the relative likelihood when multiple transitions matched a certain state,
except when an action was guaranteed to preempt one or more temporal transitions. I have
implemented software which preserves the basic forward chaining planner model while
also maintaining approximate state probabilities, as I discuss later in Section 3.2. Using
the current model of probability, event and temporal transitions have been collapsed into a
single transition type -- “temporal transitions”, which can completely mimic event and
temporal transitions from the original CIRCA.
To help CIRCA’s planner with complex planning problems, I have implemented the
capability to handle multiple subgoals. Instead of relying on just one plan to move the
system from its initial state all the way to its goal, CIRCA can now build a group of smaller
plans to reach the goal. The planner therefore works in parallel with the RTS (i.e., the
RTS runs the first subgoal’s plan while the AIS builds the second subgoal’s plan, etc.).
We have plans to automate the subgoaling process, discussed below in Section 2.3.
However, to-date, the user has specified the sequence of subgoals to be achieved, and
plans are simply built from this sequence, using all goal states from the previous plan as
initial states in the new planning cycle.
Finally, Figure 2-3 shows a new software module called the “Abstraction Subsystem
(ABS)”. In the past, CIRCA was exclusively used to control mechanisms with a relatively
simple set of sensors/actuators, so details of the conversion from numerical
sensor/actuator/controller values to the discrete CIRCA feature values remained hidden
from the basic architecture. I believe this abstraction process is one of the key issues
associated with fully-automating a complex piloted-vehicle such as an aircraft, so I have
promoted this abstraction software to a separate CIRCA module, and in the simplest case,
this module will simply pass values to and from the environment with minimal processing.
With the new ABS, we allow separation of domain-specific numerical calculations from the
RTS, so we can specify TAP execution more simply in terms of the discrete feature values
present in the actual TAP instructions. This is particularly important because, as discussed
in future sections, I expect feature abstraction to become more computationally complex in
two ways: 1) Environment and controller “state-space” may not correspond in a simple
uncoupled way to CIRCA feature space, and 2) We may seek two CIRCA representations
of environment state: CIRCA feature values, and parameters to allow the planner’s action
scoring algorithm to perform better based on feature value relationships (see Section 4.4).
Finally, maintaining an ABS module may help the system with predictive sufficiency issues
[22] by giving the ABS scheduled autonomy to sample the environment with sufficient
frequency such that current feature values are always available, thus optimizing the number
of actual sensor reads performed.1 Currently, the ABS still reads feature values from the
environment each time the RTS requests a value, but I hope to implement a more
appropriate model for the automated aircraft flight domain in the future, as I discuss in
more detail in Section 5.
2.3 Proposed CIRCA
Several other issues will require architectural changes before CIRCA can be considered
ready to safely control a fully-automated aircraft. The final version of CIRCA I plan to
adopt during the remainder of my thesis research is shown in Figure 2-4. Differences
between this version and the current CIRCA (Figure 2-3) include a new “Dispatching
Subsystem” module and additional calculation components within the RTS and Planning
Subsystem. In summary, these additions will allow CIRCA to automatically break an
overall goal into subgoals, and also will allow storage of and quick access to contingency
plans to achieve guaranteed response in select situations even when the original executing
plan does not contain a planned response. In this section, I describe the proposed CIRCA
in the context of a generic CIRCA planning problem, leading from the user-specified
overall goal through the final set of CIRCA interactions with the environment. In this
manner, I hope to convey an accurate vision of how CIRCA will function, then later refer
to elements of this overall procedure when describing more specific research issues.
Figure 2-5 illustrates the technique by which the proposed CIRCA will solve a problem,
with a generic example shown in Figure 2-5a and specific flight domain example shown in
Figure 2-5b. CIRCA solves the problem hierarchically, starting with the overall
“objective” (e.g., “fly-from-x-to-y”), and working down to the eventual product of
individual commands (scheduled TAPs). For simplicity, I expand only one node at every
level; all other nodes in each level would be expanded analogously. In the following
paragraphs, I describe how CIRCA builds the Figure 2-5 structure at each level in terms of
the Figure 2-4 modules and inter-module feedback/feedforward data.
1 The ABS must sample world features and output controller commands at a certain minimum frequency determined by system dynamics. CIRCA may eventually dynamically schedule the ABS functions along with planned TAPs to maximize resource usage efficiency. However, for my research, I will assume the user has allocated sufficient resources for the periodic ABS tasks, in the same fashion as I will assume the controller and state estimator set have been allocated sufficient resources.
[Figure 2-4 (block diagram). Modules: Knowledge Base (initial state / goals, temporal/action
transitions); Planning Subsystem (Planner, Subgoal creation/storage, Feedback handler);
Scheduling Subsystem (Schedule Manager, scheduling routines); Dispatching Subsystem (plan
message building, scheduled plan storage, plan downloading); Real-Time Subsystem (Environment
Interface, TAP plan executor, contingency plans); Abstraction Subsystem (Abstractor,
De-Abstractor); "Environment" (sensors, actuators, state estimators, controllers). Labeled
data flows: TAP list w/ timings; TAP schedules; TAP plans; plan handling directives;
status-1, status-2, status-3; handshake; feature value data; action commands; sensor & state
data; controller & actuator commands.]
Figure 2-4. CIRCA -- Proposed Version.
[Figure 2-5 (hierarchy diagrams), listed from the top level to the bottom level.
a) Generic Example.
   Objective
   Tasks: Task 1, Task 2, ..., Task i
   Plans: startup plan, cached plan 1, ..., cached plan j
   TAPs: TAP 1, TAP 2, ..., TAP k
   Environment I/O: Processed Sensor Info 1 ... m; Controller 1 ... n
b) Flight Example.
   Objective: Fly-from-x-to-y
   Tasks: Takeoff/climb, fly-to-fix1, ..., approach/land
   Plans: goto fix1 (all normal); react: engine out; ...; react: error x
   TAPs:
     if ((airport nearby) and (sufficient altitude)) set course for nearby airport -- no engine
     if ((approaching ground) and (at airport)) switch to autoland -- no engine
     if (T) try engine restart; report emergency to ATC
     if ((approaching ground) and (not at airport)) switch to land -- off field, no engine
     if (deadend state, etc) start cached plan or replan to handle
     ...
   Environment I/O: Processed sensor info: ground proximity warning; Processed sensor info:
     state estimator; Flight controller: mode parameters / reference trajectory settings
     (autoland, no engine)]
Figure 2-5. Illustration of CIRCA Problem Solving Technique.
Layer 1: Objective --> Tasks
The purpose of this layer is to decompose a high-level objective (or goal) into a set of tasks
(or subgoals) to consider during CIRCA’s development of TAP plans. Currently, this
procedure must be done manually, with the user explicitly specifying each subgoal to be
achieved in the CIRCA knowledge base prior to execution. I seek to automate this process
for a variety of reasons, ranging from easing the burden on the user to adding flexibility so
that the Planning Subsystem can dynamically modify its set of subgoals if necessary based
on RTS feedback from the environment (e.g., unhandled states).
This goal decomposition layer corresponds to the new “Subgoal creation/storage”
submodule within the Planning Subsystem (Figure 2-4). Ideally, this submodule could
operate both offline and online, and compute subgoals based on the desirable planning
problem size, where the “desirable” problem size is sufficiently small to allow successful
scheduling of the associated plan. However, planning problem size will be limited by
problem decomposability (i.e., the necessity to have plans that guarantee safety for
extended periods of time).
Perfecting subgoaling algorithms is not the emphasis of my research, so I will implement a
rather simple procedure (similar to ABSTRIPS [28]) that will most likely require future
enhancements not proposed here. I plan to use the existing CIRCA planner in a special
mode to plan from the abstracted initial state to end goal, where “abstracted” means using
only a subset of the available domain features specifically marked as “critical for
subgoaling”, then selecting actions to lead to the final goal without regard for exact timings
or transition probabilities. For example, as shown in Figure 2-5b, suppose the overall
objective is to fly an aircraft from x to y. The subgoaling module may set up a planner
iteration that performs a high-level connect-the-dots for locations along the route (a
primitive form of guidance), but ignores features that do not directly affect subgoal
(waypoint) calculations. Note that this subgoaling procedure may be
appropriate only for piloted vehicle domains, so may need to be extended in future research
should CIRCA be used in other systems.
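As a rough illustration of the "connect-the-dots" idea, the sketch below searches over location alone and ignores timing and transition probabilities. It is not the CIRCA planner: a plain breadth-first search stands in for the planner's special mode, and the names `plan_subgoals` and `route_graph`, as well as the example waypoints, are hypothetical.

```python
from collections import deque

def plan_subgoals(route_graph, start, goal):
    """Return the waypoints after `start` on a shortest-hop route to `goal`.

    route_graph maps a location to the locations reachable from it. Only the
    "location" feature is considered, mimicking a planner run restricted to
    features marked as critical for subgoaling (an assumption of this sketch).
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        if here == goal:
            path = []
            while here is not None:       # walk back to the start
                path.append(here)
                here = parents[here]
            return path[::-1][1:]         # drop the start; the rest are subgoals
        for nxt in route_graph.get(here, []):
            if nxt not in parents:
                parents[nxt] = here
                queue.append(nxt)
    return None                           # goal not reachable

# Hypothetical route structure for a flight from x to y.
route_graph = {"x": ["fix1", "fix2"], "fix1": ["fix3"], "fix2": ["y"], "fix3": ["y"]}
subgoals = plan_subgoals(route_graph, "x", "y")
```

Each returned waypoint would then serve as one task (subgoal) for layer 2.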
Layer 2: Task --> Scheduled Plan Set
The purpose of this layer is to build and schedule plans to achieve each task (or subgoal)
computed in layer 1. Currently, CIRCA’s planner builds and schedules a single plan for
each subgoal, then sequentially downloads them to the RTS as it is ready to execute them.
If RTS feedback indicates an “unhandled” state, one new plan to handle this state is
developed and immediately downloaded, then the planner continues along the subgoal path
until the final goal is reached.
As shown in Figure 2-4, I have proposed the addition of a new “Dispatcher Subsystem”
and also a “Contingency plan” storage area within the RTS. Note that Dave Musliner (at
Honeywell) has extended the RTS software to store a group of contingency plans, but to
date has only written these plans by hand, not interfacing with the CIRCA planning system
at all. By having both a Dispatcher Subsystem and small contingency plan storage area on
the RTS, CIRCA can effectively store and retrieve immediate contingencies associated with
the currently executing plan and also plans to achieve future subgoals. The RTS storage
area will be used exclusively for those contingency plans required for reacting to time-
critical situations, thus real-time guarantees will be associated with switching to these
plans. The Dispatcher storage area will contain all other plans, including contingencies for
the current subgoal that do not require a small guaranteed plan switch time2 as well as all
plans associated with future subgoals. By minimizing the RTS plan storage area, we can
impose tighter plan downloading and plan switching time constraints, but will still be able
to keep a large plan cache via the Dispatcher subsystem.
Plans will be built and stored using the following algorithm:
For each task (i) or subgoal,
- Build a “startup plan” using the specified initial state (or last plan’s goal state set
as the initial state). This plan is equivalent to the single executed CIRCA plan now
created for each subgoal.
- Build a contingency plan to handle unhandled (anomalous) states the planner
decides are either too important or too close to failure to allow time for CIRCA
replanning. I discuss issues involved with developing contingency plans in Section
3.3.
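The storage policy described above might be sketched as follows. This is an illustration under stated assumptions, not CIRCA code: the `Plan` record, the `time_critical` flag, and `store_plans` are hypothetical names, and the sketch assumes the planner has already decided which contingencies need a guaranteed switch time.

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    name: str
    subgoal: int                  # index of the task (subgoal) the plan serves
    contingency: bool             # False for a "startup" plan
    time_critical: bool = False   # assumed flag: needs a guaranteed switch time

@dataclass
class PlanStores:
    rts_cache: list = field(default_factory=list)    # small, guaranteed switching
    dispatcher: list = field(default_factory=list)   # large cache, no hard guarantee

def store_plans(plans, current_subgoal):
    """Place each plan in the RTS contingency area or the Dispatcher store."""
    stores = PlanStores()
    for plan in plans:
        # Only time-critical contingencies for the executing subgoal go to the RTS.
        in_rts = (plan.contingency and plan.time_critical
                  and plan.subgoal == current_subgoal)
        (stores.rts_cache if in_rts else stores.dispatcher).append(plan)
    return stores

plans = [
    Plan("startup-0", 0, contingency=False),
    Plan("engine-out-0", 0, contingency=True, time_critical=True),
    Plan("reroute-0", 0, contingency=True),
    Plan("engine-out-1", 1, contingency=True, time_critical=True),
]
stores = store_plans(plans, current_subgoal=0)
```

Keeping the RTS area this small is what allows tighter bounds on plan downloading and switching, while everything else waits in the Dispatcher.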
2 Ideally, the Dispatcher plans would also have a larger guaranteed switch time. However, this would be difficult because the executing RTS plan would need to contain guaranteed actions to download this next plan. The time required to download this plan is a function of plan size, communication link availability, etc. Thus, for my thesis research, I will simply state that switching to a Dispatcher plan is just “much faster” than building and then downloading a new plan online.
Upon startup, CIRCA will build a complete set of plans for all identified tasks offline (i.e.,
before RTS execution of the first plan begins), and will store these plans in the “Dispatcher
Subsystem”, which will also receive sequencing information from the planner to download
plans to the RTS as appropriate. At this point, the RTS begins executing the first TAP plan
(layer 3). If everything goes as expected (including the set of contingencies for which
CIRCA has built plans), the planner is done, and layers 3 and 4 can progress without
further planner intervention. However, if a situation arises which either has no
contingency or otherwise requires further planning (e.g., if a contingency plan cannot
provide goal achievement, only safety), the planner will receive RTS feedback to this
effect. Then, depending on available time before system safety is jeopardized, layer 1 will
develop a new set of tasks if necessary, then execute the basic planning algorithm for each
new task and required contingency. The Dispatcher will immediately download the first
scheduled plan to allow prompt response to the unhandled state, and subsequent plans will be
scheduled and stored in the Dispatcher as they become available. More details on this
procedure and associated timeliness issues are discussed in Section 3.3.
Layer 3: Scheduled Plan --> TAPs
As discussed above, layer 2 has been responsible for creating all plans, and the dispatcher
has ensured that the correct plan will be sent to the RTS for execution. Layer 3 of
execution is performed on the RTS, which takes a set of downloaded TAPs and executes
them as specified in the schedule, with all guaranteed TAPs executing cyclically before the
deadlines and all “if-time” or best-effort TAPs executing whenever time is available.
Currently, each schedule contains a special TAP which is used to check whether the
subgoal has been satisfied and switches to a new "startup" plan (as depicted in Figure 2-5)
if the subgoal has been satisfied. Even if an "unhandled" state is detected, the RTS
continues executing the current plan and switches whenever the new plan becomes
available. The proposed version of CIRCA will continue this basic procedure as before,
except that each RTS plan must now contain special TAPs responsible for managing the
receipt and storage of new contingency plans, as well as TAPs for identifying and
switching to the appropriate contingency plan when that particular situation arises.
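A simplified view of one execution cycle is sketched below. It is only a sketch: it assumes all guaranteed TAPs share a single cycle (a real schedule may give each TAP its own period), it uses simulated time rather than a real-time clock, and the names `TAP`, `wcet`, and `run_cycle` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]

@dataclass
class TAP:
    name: str
    test: Callable[[State], bool]     # "test" part: condition on feature values
    action: Callable[[State], None]   # "action" part: run when the test passes
    wcet: float                       # assumed worst-case execution time

def run_cycle(guaranteed: List[TAP], if_time: List[TAP],
              state: State, cycle_length: float) -> List[str]:
    """Run one schedule cycle and return the names of the TAPs that fired."""
    fired, elapsed = [], 0.0
    for tap in guaranteed:            # always executed, every cycle
        elapsed += tap.wcet
        if tap.test(state):
            tap.action(state)
            fired.append(tap.name)
    for tap in if_time:               # best effort: only while slack remains
        if elapsed + tap.wcet > cycle_length:
            break
        elapsed += tap.wcet
        if tap.test(state):
            tap.action(state)
            fired.append(tap.name)
    return fired

state = {"altitude": "low", "gear": "up"}
lower_gear = TAP("lower-gear",
                 lambda s: s["altitude"] == "low" and s["gear"] == "up",
                 lambda s: s.update(gear="down"), wcet=2.0)
log_status = TAP("log-status", lambda s: True, lambda s: None, wcet=5.0)

first = run_cycle([lower_gear], [log_status], state, cycle_length=6.0)
second = run_cycle([lower_gear], [log_status], state, cycle_length=10.0)
```

In the first cycle the if-time TAP is skipped because it would overrun the cycle; in the second there is enough slack for it to run.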
Layer 4: TAP --> Environment I/O
This layer describes the procedures used to execute the instructions within each RTS TAP,
specifically those TAPs that deal with the environment. The “test” part of each TAP
typically requires measurement of a set of environment states, such as “aircraft position” in
an abstract form (as described in Section 5), while the “action” part of each TAP typically
corresponds to acting upon some environmental parameter, ranging from directly
controlling an actuator to setting a parameter of some controller. To execute these TAPs,
the RTS must send its feature value request or command to the ABS (Figure 2-4), which
then either uplinks the appropriate abstracted feature value to the RTS or de-abstracts and
sends the action command. I assume the ABS will always have current feature values via
careful (offline) ABS scheduling, as I discuss in the context of the piloted vehicle domain
in Section 5.
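The two directions of the ABS interface might look like the following sketch. The feature names, the 500 ft threshold, and the controller settings are illustrative assumptions only; none of them come from CIRCA or from a real flight controller.

```python
def abstract_altitude(altitude_ft: float, ground_ft: float) -> str:
    """Abstractor: map numeric sensor data to a discrete feature value."""
    height = altitude_ft - ground_ft
    if height <= 0.0:
        return "on-ground"
    if height < 500.0:            # assumed threshold, for illustration only
        return "approaching-ground"
    return "airborne"

# De-abstractor table: discrete action command -> controller parameters.
# Both entries are hypothetical.
CONTROLLER_MODES = {
    "switch-to-autoland": {"mode": "autoland", "target_sink_rate_fpm": 500.0},
    "hold-altitude": {"mode": "altitude-hold", "target_sink_rate_fpm": 0.0},
}

def de_abstract(action: str) -> dict:
    """De-abstractor: expand a TAP action into numeric controller settings."""
    return dict(CONTROLLER_MODES[action])

feature = abstract_altitude(altitude_ft=1200.0, ground_ft=900.0)
command = de_abstract("switch-to-autoland")
```

The RTS sees only the discrete values, so the TAP tests and actions stay free of domain-specific numerical code.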
=====================================================
CHAPTER 3
MODELING AND REASONING ABOUT TIME
=====================================================
3.1 Overview
In this section, I propose algorithms that will yield better temporal modeling and reasoning
in CIRCA. This section is ordered by expected time for completing the proposed research,
where Section 3.2 describes temporal modeling research completed to-date, Section 3.3
includes research I propose for dissertation work, and Section 3.4 describes future work
that is beyond the scope of my thesis, but that needs to be considered before a “complete”
temporal reasoning architecture can exist.
In this section, I focus on three important aspects of time: 1) Meta-level reasoning about
planner deliberation time, 2) Construction of a world model that accurately represents
domain feature changes over time, and 3) Explicit scheduling of plans to allow execution
timing guarantees for critical planned actions. For temporal aspect 1), I have no intentions
of inventing a revolutionary algorithm to reason about deliberation time, particularly since
many others are concentrating their research efforts in this area [3], [7], [11], [12], [15],
[33], and [35]. Instead, I plan to use a combination of design-to-time [9] and anytime [7]
strategies, modifying the planner such that it can dynamically alter planner parameters to
control expanded state-space size and halt search if time expires. Limiting online planner
deliberation time in conjunction with extensive offline creation of a set of cached reactive
plans will allow CIRCA to combine the best of exclusively reactive and exclusively
planner-based systems.
I have begun work toward limiting planning time by incorporating an approximate but
relatively fast computational model of probability within the planner. The existing CIRCA
probability model is discussed in Section 3.2. Unfortunately, there are key approximations
in STRIPS-based planners [8] that carry through to our existing probability model (which
has been placed within a STRIPS-like planner). In Section 3.3, I describe an approach by
which one can combine CIRCA’s current probability model with the other extreme: a
Markov Decision Process (MDP) based model [6] in which all states have an explicit time
stamp. The MDP model contains a more accurate model of state changes over time, but
such a model significantly increases planning complexity over our current model.
Initial work to better address temporal aspect 2) has also begun. In domains where
knowledge may be incomplete or incorrect, it is important for an automated system to react
even when some “unplanned-for” situation occurs. I describe our work to classify, detect,
and react to unplanned-for states in Section 3.2. Also, when aspects of domain knowledge
are statistically based (e.g., exogenous event occurrence), it is important to have a model
that can accurately represent experts’ probabilistic beliefs, so we allow the specification of
temporal transition probabilities in the CIRCA knowledge base, as described in Section
3.2. Unfortunately, there are still challenges associated with domain feature discretization
as well as the planner’s model of time associated with world state changes. I save
extensive discussion of accurately modeling the set of specific domain features for Sections
4 and 5, but in Section 3.3, I describe how CIRCA can generally improve model accuracy
by enhanced temporal modeling while planning (based on MDP models) combined with
special-purpose domain-specific functions to compute accurate feature values as a function
of time.
CIRCA currently best addresses temporal aspect 3): scheduling to allow real-time
execution of critical actions. In the original version of CIRCA, if the planner could not
successfully find some combination of actions that could be scheduled, it would simply
fail. In Section 3.2, I describe work to relax this hard restriction via the removal of actions
associated with low-probability states. The contingency plan storage capabilities will
facilitate scheduling, since we now will be able to have multiple plans with plan switch
guarantees instead of a single plan only. However, I will be looking at contingency plans
from the completeness perspective, leaving work on scheduling issues for others.
Reference [19] describes ongoing modifications to the scheduler-planner interface that will
further improve CIRCA’s ability to build schedulable plans.
3.2 Research Completed
My initial research efforts have involved improving CIRCA’s ability to schedule plans and
select viable goal paths, even with imprecise knowledge. First, we implemented a model
of probability in which individual state probabilities are computed using temporal transition
probability functions and expected action execution delays. This model and its uses to-date
are described below (Section 3.2.1) and in [2].
To minimize the planner’s set of expanded states as well as improve plan schedulability,
CIRCA selects only those states it considers reachable, and is satisfied so long as at least
one path reaches a goal state. In Section 3.2.2, I describe a generic classification of all
world states, and describe algorithms we have implemented in CIRCA that allow it to detect
and react when important “unplanned-for” states are reached.
3.2.1 Incorporation of Probabilistic Model for Planner States
To control a complex system, an agent must build and execute a plan that is capable of
recognizing state changes due to its own actions or external world events, even when these
changes are not completely predictable. Consider an agent capable of safe, fully-automated
aircraft flight control from takeoff through landing. To execute a successful flight, the
agent must have a set of goals, such as destination airport and intermediate positions, and
an accurate model of actions and possible world events. Each event has some chance of
occurring as a function of time and state feature values. A valiant goal for any agent is to
build and execute plans that yield a high probability of successfully reaching the specified
goals. My objective in this research was to calculate approximate state probabilities and use
them to guide CIRCA’s planner along highly-probable goal paths while ignoring low-
probability occurrences when necessary.
The CIRCA planner builds the reachable state set based on action and temporal transitions
specified in the domain knowledge base. In CIRCA’s current model, state probability is
computed locally based on the probability of its parent state and applicable state transition
probability functions3. Probabilities are propagated from initial states throughout the
expanded world state set in this fashion.
I use a simple model for action transition probabilities, by assuming action transitions will
affect state features following a constant delay after being executed on the RTS. Figure 3-1
shows the cumulative probability function used for actions as a function of time.4 To
construct this function, the user specifies a delay (tdelay) between the time the action is
initiated (time=0) and the time at which the action changes state features (time=tdelay).
Then, the total delay between reaching the state prompting action execution and the time
3 A temporal transition is “applicable” from a state if all temporal transition preconditions are matched.
4 We assume all state features are observable and that if an action is initiated, it will be executed properly. Otherwise, we could not specify action probabilities that reach 1.0.
that action affects state features is: t (total delay) = tdelay + (delay between when the
state is reached and when this action begins executing in the schedule).
[Figure 3-1 (plot). Axes: p(t), from 0 to 1.0, versus time, with tdelay marked on the time axis.]
Figure 3-1. Action Transition Cumulative Probability.
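The constant-delay action model can be written as a step function. The sketch below is only an illustration; the function names and the `schedule_wait` parameter (the delay between reaching the state and the action starting in the schedule) are hypothetical.

```python
def action_cum_prob(t: float, t_delay: float) -> float:
    """Cumulative probability that the action has changed state features by time t."""
    return 1.0 if t >= t_delay else 0.0

def total_delay(t_delay: float, schedule_wait: float) -> float:
    """Delay from reaching the triggering state to the action taking effect."""
    return t_delay + schedule_wait

# Example with made-up numbers: a 1.0 s action delay and a 0.25 s wait in the schedule.
effect_time = total_delay(t_delay=1.0, schedule_wait=0.25)
```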
Because they are not explicitly commanded, CIRCA cannot assume such tight control over
temporal transitions, some of which may not be precisely modeled. For this reason,
CIRCA allows the user to define a cumulative probability function for each temporal
transition, where time=0 is defined as the time at which that transition's preconditions were
first satisfied. Figure 3-2 shows two examples of temporal transition probability functions
and their associated cumulative probability functions. In Figure 3-2a, the transition has a
high probability of occurring immediately. This probability decays over time, leaving a
cumulative probability asymptote of Pmax < 1.0. The value (1 - Pmax) corresponds to
the probability that this event will never occur. As an example, consider the state in which
an aircraft collision avoidance alarm (indicating nearby traffic) has just sounded. The
probability p(t) that the transition “collide with other aircraft” will occur immediately jumps
to its maximum value, but decays in time, since either the other aircraft will pass or else the
collision will have already happened.
Figure 3-2b shows an example for which a delay occurs between the time the state is
reached and the time this transition may happen (i.e., p(t) > 0). The asymptote of the
cumulative probability function is 1.0, indicating this transition will occur if given
sufficient time. A simple example of this transition type is travel between distinct locations.
Suppose an aircraft flies along the coast from Los Angeles to Portland, Oregon. At a point
along the flight, the aircraft state changes to “Location: San Francisco”, at which time the
aircraft heads directly for Portland. The probability that the temporal transition “Arrive in
Portland” will occur is near zero for a certain amount of time, even with a tremendous
tailwind and maximum thrust. The peak of p(t) occurs at the expected arrival time based
on average calculations, while the width of p(t) increases as the uncertainty in wind,
aircraft performance, and/or course deviations increases.
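[Figure 3-2 (plots). Panel a) "Event" Transition: p(t) versus time and cum_prob(t) versus time,
with the asymptote Pmax marked below 1.0. Panel b) Delayed Transition: p(t) versus time and
cum_prob(t) versus time, with tdelay marked on the time axis and cum_prob(t) reaching 1.0.]
Figure 3-2. Temporal Transition Probability Functions.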
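The text leaves the shape of each cumulative probability function to the user. As one possible illustration, the sketch below uses an exponential form for the "event" case and a normal (Gaussian) form for the delayed case. Both functional forms and all parameter names are assumptions made for this example, not the forms used in CIRCA.

```python
import math

def event_cum_prob(t: float, p_max: float, rate: float) -> float:
    """Event-type transition: rises quickly, then levels off at p_max < 1.0.

    The remaining (1 - p_max) is the probability the event never occurs.
    """
    return p_max * (1.0 - math.exp(-rate * t)) if t > 0.0 else 0.0

def delayed_cum_prob(t: float, mean: float, spread: float) -> float:
    """Delayed transition: near zero early, approaches 1.0 after the mean.

    A normal CDF is used, so the value at t = 0 is tiny but not exactly zero;
    this is acceptable only when mean is several times larger than spread.
    """
    return 0.5 * (1.0 + math.erf((t - mean) / (spread * math.sqrt(2.0))))

# Made-up numbers: a collision risk capped at 5%, and an arrival expected
# after 90 time units with an uncertainty (spread) of 10.
collide_eventually = event_cum_prob(100.0, p_max=0.05, rate=2.0)
arrived_by_expected_time = delayed_cum_prob(90.0, mean=90.0, spread=10.0)
```

A wider `spread` corresponds to greater uncertainty in wind, aircraft performance, or course deviations.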
Because CIRCA allows multiple temporal transitions from any state, probabilistic
dependencies between these transitions must be considered. These dependencies exist
because the occurrence of one temporal transition changes the current state, thus no other
temporal transition may later occur from that state. In previous CIRCA versions, the
number of temporal transitions modeled in the knowledge base was minimized by making
preconditions minimal so that temporal transitions could occur in different combinations
from many states. However, CIRCA now must accurately capture transition probability
dependencies, so the user must make preconditions more specific, increasing the number of
temporal transitions in the CIRCA knowledge base. The following procedure defines how
the user specifies temporal transitions and their probabilities:
(1) Define temporal transition sets, where a “set” contains all temporal transitions with a
certain set of preconditions. Each transition set's preconditions must be sufficiently
specific such that no state can match the preconditions of two different transition sets.
(2) For each transition set, specify the probability function for each transition. The sum of
all cumulative probability function asymptotes in each set must be ≤ 1.0 (100%). We
assume the user has sufficiently restricted preconditions to explicitly define any features
on which transition probabilities depend.
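The two rules above lend themselves to an automatic check on the knowledge base. The sketch below is a hypothetical helper, not part of CIRCA: it assumes preconditions are given as feature/value dictionaries, that any feature left unmentioned can take any value, and that each transition is summarized by its cumulative-probability asymptote.

```python
from itertools import combinations

def can_overlap(pre_a: dict, pre_b: dict) -> bool:
    """True if some state could satisfy both precondition sets.

    Two sets are disjoint only if they demand different values for a shared feature.
    """
    return all(pre_b[f] == v for f, v in pre_a.items() if f in pre_b)

def check_transition_sets(sets: dict) -> list:
    """sets maps a set name to (preconditions, {transition name: asymptote})."""
    problems = []
    # Rule (1): no state may match two different transition sets.
    for (a, (pre_a, _)), (b, (pre_b, _)) in combinations(sets.items(), 2):
        if can_overlap(pre_a, pre_b):
            problems.append(f"{a} and {b} can match the same state")
    # Rule (2): asymptotes within a set must sum to at most 1.0.
    for name, (_, asymptotes) in sets.items():
        if sum(asymptotes.values()) > 1.0 + 1e-9:
            problems.append(f"{name}: asymptotes sum to more than 1.0")
    return problems

good = {
    "alarm-on": ({"alarm": "on"}, {"collide": 0.05, "traffic-passes": 0.95}),
    "alarm-off": ({"alarm": "off"}, {"alarm-sounds": 0.3}),
}
bad = {
    "a": ({"alarm": "on"}, {"x": 0.7, "y": 0.6}),
    "b": ({"gear": "up"}, {"z": 0.1}),
}
```

In the `bad` example the two sets share no feature, so a state with the alarm on and the gear up would match both, and set "a" also violates the asymptote rule.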
State probabilities are computed recursively during state expansion, with a "parent" state
and applicable outgoing transitions used to determine "offspring" probabilities. The
planner begins with an initial state set and no knowledge of relative probabilities within this
set, so we assume a non-informative prior distribution. The planner expands each initial
state, using the available transitions to develop the set of offspring states and initialize their
probabilities. Offspring states eventually become parent states to be expanded, continuing
until all states that are reachable with (probability > ε) have been expanded.
A set of zero or more action and temporal transitions match the preconditions of any parent
state. The states resulting from all matching temporal transitions and any planner-selected
action are the offspring set. Figure 3-3 illustrates the three possible situations. In Figure
3-3a, a TTF exists, so the planner has chosen a preemptive (guaranteed) action. Offspring
states, P1-Pn, result from that action and all applicable temporal transitions. State Pn is a
failure state that must have probability less than a small value ε. Figure 3-3b illustrates the
case when a non-preemptive action is selected. Offspring states result from that action and
the (n-1) temporal transitions. Finally, Figure 3-3c illustrates the case when no action is
selected, a possibility if no TTF exists and the planner selects no action along a goal path.
In this case, all n offspring states result from temporal transitions.
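[Figure 3-3 (state-transition diagrams). In each panel a parent state with probability Pinit
branches to offspring states P1 ... Pn; ac = action transition, tt = temporal transition,
ttf = the TTF.
a) Action Preempts TTF: P1 via ac, intermediate offspring via tt, Pn via ttf with Pn < ε;
   Σ(i=1..n-1) Pi = Pinit.
b) Non-preemptive Action Planned: P1 via ac, P2 ... Pn via tt; Σ(i=1..n) Pi ≤ Pinit.
c) No Action Planned: P1 ... Pn via tt; Σ(i=1..n) Pi ≤ Pinit.]
Figure 3-3. Possible Transitions from Parent to Offspring.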
The algorithm in Table 3-1 is used to locally compute probabilities for each reachable state.
A more detailed description of the algorithm is provided in [2]. The two major
approximations in the local probability computation algorithm -- estimating constant state
probability values at “critical” times, and propagating offspring probability only when the
offspring has not yet been expanded -- allow state probabilities to be computed quickly, but
not as accurately as other models of probability (such as MDP-based approaches).
Proposed improvements for both approximations are discussed in Section 3.3.
CIRCA uses state probabilities in two ways: finding highly-probable goal paths and
removing improbable states. In the previous version of CIRCA, the planner expanded
states in depth-first order. The planner selected actions primarily to avoid TTFs and
secondarily to achieve goals. In the worst cases, the only goal-reaching states would have
probability near 0, and in some situations, no schedule may be possible that avoids all
failure transitions, since the original CIRCA had to include even those that were highly
improbable. Although the current CIRCA model is not perfect, it still has a better chance of
reaching its goals because probability considerations prevent this worst case. Quantifying
how much better the new CIRCA performs involves evaluating the probabilistic model for
a given domain, as well as estimating the presence of cycles, etc., that degrade calculated
state probability accuracy.
Table 3-1. Original State Probability Calculation Algorithm.
1. Select the most probable state for expansion and let Pinit be this state's probability. (O(m))
2. Select an action by scoring all potential action candidates. (O(nf*na))
3. Create a list of offspring states for temporal and action transitions. (O(nt))
4. Compute critical time (t) for transition probabilities. Critical time is defined as follows for the possible transition sets shown in Figure 3-3:
   Case a): t = preemptive action execution deadline;
   Case b): t = non-preemptive action average delay time;
   Case c): t = temporal transition asymptotic probability. (O(1))
5. Create a list of cumulative probabilities for the offspring states. (O(nt))
6. Scale each probability by Pinit (Pi = Pi * Pinit). (O(nt))
7. For each unexpanded offspring state, add any previously existing probability due to other parent states to the new value (Pi = Pi + Piold). (O(nt))
Overall complexity: O(m + nf*na + nt), where m = number of unexpanded, reachable states that could become parent states, nf = number of features, na = number of action transitions, and nt = number of temporal transitions.
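Steps 1, 6, and 7 of Table 3-1 can be sketched as follows. This is only a minimal illustration under stated assumptions, not CIRCA's implementation: the State class, the two function names, and the example probabilities are hypothetical, and the offspring probabilities are assumed to have already been evaluated at the critical time (steps 4 and 5).

```python
from dataclasses import dataclass


@dataclass
class State:
    name: str
    prob: float = 0.0       # locally estimated probability of reaching this state
    expanded: bool = False


def most_probable_unexpanded(states):
    """Step 1: pick the unexpanded state with the highest probability (O(m))."""
    candidates = [s for s in states.values() if not s.expanded]
    return max(candidates, key=lambda s: s.prob) if candidates else None


def propagate(parent, offspring_probs, states):
    """Steps 6-7: scale each offspring probability by Pinit and accumulate it.

    offspring_probs maps offspring name -> transition probability, assumed
    (hypothetically) to be already evaluated at the critical time.
    """
    p_init = parent.prob
    parent.expanded = True
    for name, p in offspring_probs.items():
        child = states.setdefault(name, State(name))
        if not child.expanded:           # expanded offspring keep their old value
            child.prob += p * p_init     # Pi = Pi * Pinit + Piold


# Hypothetical three-state example.
states = {"init": State("init", prob=1.0)}
propagate(most_probable_unexpanded(states), {"A": 0.7, "B": 0.3}, states)
propagate(most_probable_unexpanded(states), {"B": 0.5, "init": 0.5}, states)
print({name: round(s.prob, 3) for name, s in states.items()})
```

In the second call, state B accumulates probability from both parents, while the already-expanded state "init" is left unchanged. That skipped update is the second approximation named in the text.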
3.2.2 Detection and Handling of Unplanned-for States
Autonomous control systems for real-world applications require extensive domain
knowledge and efficient information processing to build and execute situationally-relevant
plans. To enable guarantees about safe system operation, domain knowledge must be
complete and correct, plans must contain actions accounting for all possible world states,
and response times to critical states must have real-time guarantees. Practically speaking,
these conditions cannot be met in complex domains, where it is infeasible to preplan for all
configurations of the world, if indeed they could even be enumerated. Realistic planners
use heuristics to bound the expanded world state set, coupled with reactive mechanisms to
compensate when unexpected situations occur. For this research, I focused on the question
of how an autonomous system can know when it is no longer prepared for the world in
which it finds itself, and how it can respond. In this section, I first identify the different
classes of “unhandled” states the planner may identify, then describe methods by which
CIRCA can detect and respond to these states appropriately.
Figure 3-4 shows the relationship between subclasses of possible world states. Modeled
states have distinguishing features and values represented in the planner’s knowledge base.
Because the planner cannot consider unmodeled states without a feature discovery
algorithm, unmodeled states are beyond the scope of this paper. “Planned-for” states
include those the planner has expanded. This set is divided into two parts: "handled" states
which avoid failure and can reach the goal, and "deadend" states which avoid failure but
cannot reach the goal with the current plan.
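[Figure 3-4 diagram: nested regions within All World States. The Modeled region contains the Planned-for states ("Handled" -- can reach goal, and Deadend), the Removed states, and the Imminent Failure states. A boldly outlined region marks the World States Actually Reached.]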
Figure 3-4. World State Classification Diagram.
A variety of other states are modelable by the planner. Such states include those identified
as reachable, but “removed” because attending to them along with the “planned-for” states
exceeds system capabilities. Other modeled states include those that indicate “imminent
failure;” if the system enters these states, it is likely to fail shortly thereafter. Note that
some states might be both “removed” and “imminent-failure”, as illustrated in Figure 3-4.
Finally, some modeled states might not fall into any of these categories, such as the states
the planner considered unreachable but that are not necessarily dangerous. We are working
to find other important classes or else show no other modelable state classes are critical to
detect. As illustrated by the boldly outlined region in Figure 3-4, states actually reached
may include any subclass. To assure safety, the set should only have elements in the
“planned-for” region. When the set has elements outside this region, safety and
performance depend on classifying the new state and responding appropriately. For this
reason, we provide more detailed definitions of the most important classes.
A "deadend" state results when a transition path leads from an initial state to a state that
cannot reach the goal, as shown in Figure 3-5. The deadend state is safe because there is
no transition to failure. However, the planner has not selected an action that leads from this
state via any path to the goal. Deadend states produced because no action can lead to a goal
are called "by-necessity", while those produced because the planner simply did not choose
an action leading to the goal are called "by-choice".
[Figure 3-5 diagram: Initial State, Deadend State, and Goal State, connected by transitions labeled "temporal" and "temporal or action".]
Figure 3-5. "Deadend state" illustration.
A planner that generates real-time control plans needs to backtrack whenever scheduling
fails. If backtracking is unsuccessful, another option is to prune the most improbable states
from consideration and then replan. A "removed" state set is created when the planner has
purposefully removed the set of lowest probability states during backtracking, as illustrated
in Figure 3-6. In the first planner iteration, all states with nonzero probability are
considered, as depicted by the "Before Pruning" illustration. A low probability transition
leads to a state which transitions to failure. This failure transition is preempted by a
guaranteed action.
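[Figure 3-6 diagram, two panels. Before Pruning: the Initial State leads via temporal or action transitions (ε < prob < 1) to the Goal State; a low probability temporal transition (ε < prob << 1) leads to the Removed State, whose temporal transition to the Failure State is preempted by a preemptive action. After Pruning: the Initial State leads via temporal or action transitions (ε < prob < 1) to the Goal State; the removed state and its downstream states no longer appear.]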
Figure 3-6. "Removed state" illustration.
Suppose the scheduler fails. The planner will backtrack and build a new plan without low-
probability states. The resulting state diagram -- "After Pruning" -- is shown in Figure 3-6.
Due to the low probability transition, all downstream states are removed from
consideration. The preemptive action is no longer required, giving the scheduler a better
chance of success.
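The pruning step described above can be sketched as a reachability pass that refuses to enter low-probability states. This is an illustrative sketch only: the function name prune_states, the threshold argument, and the example state names and probabilities are assumptions introduced here, not part of CIRCA.

```python
def prune_states(transitions, probs, initial, threshold):
    """Return the states still considered after low-probability pruning.

    transitions: state -> list of successor states
    probs:       state -> estimated probability of reaching that state
    A state below the threshold is removed, and anything reachable only
    through removed states disappears with it.
    """
    kept, stack = set(), [initial]
    while stack:
        s = stack.pop()
        if s in kept or probs.get(s, 0.0) < threshold:
            continue                     # already visited, or removed
        kept.add(s)
        stack.extend(transitions.get(s, ()))
    return kept


# Hypothetical example loosely following Figure 3-6.
transitions = {
    "initial": ["cruise", "rare"],
    "cruise": ["goal"],
    "rare": ["failure", "recovered"],    # "failure" is preempted in the plan
    "recovered": ["goal"],
}
probs = {"initial": 1.0, "cruise": 0.98, "goal": 0.98,
         "rare": 0.02, "recovered": 0.02, "failure": 0.0}
print(sorted(prune_states(transitions, probs, "initial", threshold=0.05)))
```

With the 0.05 threshold, "rare" is removed, so its downstream states are never visited and the preemptive action guarding its failure transition no longer needs to be scheduled.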
During plan development, all temporal transitions to failure (TTF) from reachable states are
preempted by guaranteed actions. If preemption is not possible, the planner fails.
However, the planner does not worry about TTF from any states it considers unreachable
from the initial state set. The set of all modelable states considered unreachable that also
lead via one modeled temporal transition to failure are labeled "imminent-failure" (see footnote 5). Actually
reaching one of the recognizable imminent-failure states indicates either that the planner’s
knowledge base is incomplete or incorrect (i.e., it failed to model a possible sequence of
states), or that the planner chose to ignore this state in order to make other guarantees.
Figure 3-7 shows a diagram of a reachable state set along with an isolated state (labeled
“Imminent-failure”) leading via one temporal transition to failure. This state has no
incoming transitions from a reachable state, so the planner will not consider it during state
expansion. However, if this state is reached, the system may soon fail. The imminent-
failure unhandled states are important to detect because avoiding system failure is
considered CIRCA’s primary goal.
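[Figure 3-7 diagram: the Initial State leads via temporal or action transitions (ε < prob < 1) to the Goal State. A separate Imminent Failure State, with no incoming transition from the reachable set, leads via a temporal transition to the Failure State.]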
Figure 3-7. "Imminent-failure state" illustration.
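The deadend and imminent-failure definitions above can be illustrated with a small graph computation. This sketch is not the list-building algorithm of [1]. It assumes a hypothetical encoding in which planned_edges holds the transitions kept in the current plan and temporal_edges holds all modeled temporal transitions, and every function and state name is invented for the example.

```python
def reachable_from(starts, edges):
    """All states reachable from the given start states (inclusive)."""
    seen, stack = set(), list(starts)
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(edges.get(s, ()))
    return seen


def classify(modeled, planned_edges, temporal_edges, initial, goals, failure):
    """Return (deadend, imminent_failure) state sets."""
    reachable = reachable_from(initial, planned_edges)
    # Deadend: planned-for and safe, but no path to any goal under the plan.
    deadend = {s for s in reachable
               if s != failure and not (reachable_from([s], planned_edges) & goals)}
    # Imminent-failure: considered unreachable, one temporal step from failure.
    imminent = {s for s in modeled - reachable
                if failure in temporal_edges.get(s, ())}
    return deadend, imminent


# Hypothetical example.
modeled = {"init", "a", "goal", "stuck", "isolated", "fail"}
planned_edges = {"init": ["a", "stuck"], "a": ["goal"]}
temporal_edges = {"isolated": ["fail"]}
print(classify(modeled, planned_edges, temporal_edges, ["init"], {"goal"}, "fail"))
```

Here "stuck" is a deadend state because the plan gives it no path to the goal, and "isolated" is an imminent-failure state because the planner never reaches it during expansion yet it sits one temporal transition away from failure.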
A critical premise in our work is that a planner cannot be expected to somehow just “know”
when the system has deviated from plans---it must explicitly plan actions and allocate
resources to detect such deviations. For example, to make real-time guarantees, CIRCA's
planner must specify all TAPs to be executed, including any to detect and react to
unhandled states. In our implementation, after the planner builds its normal plan, it builds
TAPs to detect deadend, removed, and imminent-failure states. Other unhandled states,
such as those “modeled” but outside “planned-for”, “removed”, and “imminent-failure”
regions in Figure 3-4, are not detected by CIRCA. On reaching one of these unhandled
states that is not detected by CIRCA, the system may eventually transition back to a
planned-for state (where the original plan executes properly), transition to an imminent-failure
state (where CIRCA will detect the state and react), or simply remain safe forever
without reaching the goal. The algorithms to build lists for deadend, removed, and
imminent-failure states are described in detail in [1]. To summarize, CIRCA builds a list of
each class of unhandled state, then uses ID3 [24] with that unhandled state list as the set of
positive examples and a subset of the reachable states (depending on unhandled state type)
as the set of negative examples. ID3 returns what it considers a minimal test set, which is
then used as the TAP test to detect that unhandled state class.
5 Note that it is also possible that modelable states could lead directly to failure with a known transition, or that modelable states could lead directly to failure with transitions that are not known to the planner, or that unmodelable states could lead directly to failure with an unknown transition. We exclude these cases from the “imminent-failure” set because the planner is incapable of classifying them in this way.
When any of the three unhandled state detection TAP tests (for deadend, removed, and
imminent-failure states) returns “true”, the RTS feeds back current state features to the
planner along with a message stating the type of unhandled state detected. The planner then
builds and schedules a new TAP plan which will handle this state, and subsequently
downloads this new scheduled plan to the RTS. By detecting all unhandled states which
may reach failure, presuming we have accurately modeled all TTFs, the system will always
be able to initiate a reaction to avoid impending doom. However, this is predicated on a
new plan being developed to avert disaster faster than disaster could strike. To date, I have
assumed that the planner could simply develop a plan fast enough, so tests have worked
because CIRCA’s planner responded “coincidentally” in real time. “Coincidental” real-time
responses are insufficient for critical reactions, particularly when failure is involved. A
major part of my proposed thesis research is devoted to addressing this timeliness problem
when unhandled states arise, as discussed below.
3.3 Proposed Research
CIRCA must consider many aspects of time simultaneously when exercising control in any
time-critical domain. In Section 3.3.1, I briefly recap CIRCA’s current algorithms for
modeling the passage of time in the problem domain. I then propose a more accurate MDP-
based modeling procedure, which will unfortunately require more planning resources for
each problem. To address the accuracy vs. complexity tradeoff which arises, I propose a
hybrid model that contains elements of both approaches.
Originally, CIRCA’s RTS was split from the AIS to allow careful scheduling of reactive
actions to keep the system safe indefinitely. But, the only way the system remained safe
indefinitely was to assume all state transitions and reactions were completely and accurately
modeled, and that all necessary reactive actions could be successfully scheduled. Unlike
traditional robot systems where either a simple movement sequence or a “STOP” action will
allow maintenance of a sort of “safe haven”, there is no such thing as an “indefinitely safe
state set” for dynamic systems like an aircraft in flight. So, we cannot reasonably just
assume our system will be safe forever. Whenever any state is detected (handled or
“unhandled”, as discussed above), we must either have preplanned a reaction set or we
must build a new plan within the time bounds associated with how long the system can
remain safe while executing the current RTS plan.
In Section 3.3.2, I discuss current technology for placing bounds on planning time, and
describe a procedure by which CIRCA’s planner may reason about planning time bounds
and modify planner parameters to actually achieve these bounds. By allowing CIRCA to
explicitly bound deliberation time, I may make more concrete statements regarding
replanning response time, moving past the current claim of strictly “coincidental” real-time
response. However, given that the planner must still trade off planning speed for accuracy,
I have proposed a CIRCA system which promotes the building and storing of contingency
plans so that a quick switch to a contingency plan (produced under less time pressure
offline) will be possible instead of always forcing the planner to build a plan online. At
system startup, CIRCA will perform a significant amount of planning offline (i.e., before
the RTS begins execution of its first plan, such as when the aircraft sits motionless at the
gate). Section 3.3.3 describes the procedure by which the planner may determine which
plans to build offline, and also discusses how these plans will be stored and accessed.
Figure 3-8 illustrates the different scenarios that may arise if CIRCA leaves the “planned-
for” state set. Each oval in the figure represents a set of states, with “planned-for” ovals
representing all states with goal-seeking reactions (i.e., “handled” in Figure 3-4), while the
blank ovals represent unhandled state sets. As shown in the figure, I will presume that
CIRCA can detect some state in each unhandled state set (using the algorithm from Section
3.2.2); otherwise it would not know it had left the planned-for set. Upon detecting the
unplanned-for state, in cases where a TTF occurs quickly (“fast”), CIRCA will switch to
an RTS cache plan in guaranteed real time. If the TTF occurs somewhat more slowly (i.e., so that the
dispatcher will have time to download the plan, but the planner would not necessarily have
time to build a new plan), CIRCA will download and begin executing a plan stored in the
Dispatcher. If either no TTF exists (i.e., states in the blank oval are deadend only) or else
the TTF is “very slow”, CIRCA will allow online replanning, with time limit
corresponding to the time before the “very slow” TTF may occur.
[Figure 3-8 diagram: from the Detected Unplanned-for State set, three cases lead back to Planned-for States. “Fast” ttf -- Action: Execute RTS Cache Plan. “Slow” ttf -- Action: Download & Execute Dispatcher Plan. No ttf or “Very Slow” ttf -- Action: Replan. The ttf transitions lead to Failure.]
Figure 3-8. Proposed Plan Switching Logic in CIRCA.
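The three-way switch in Figure 3-8 can be sketched as a comparison of the time until the nearest TTF against two time costs. The function name and the download_time and planning_time parameters are assumptions made for this illustration; the proposal does not yet specify how the "fast", "slow", and "very slow" boundaries are computed.

```python
import math


def choose_response(time_to_ttf, download_time, planning_time):
    """Pick a reaction for a detected unplanned-for state.

    time_to_ttf:   earliest time at which a TTF could occur (math.inf if none)
    download_time: assumed time for the Dispatcher to download a stored plan
    planning_time: assumed time the planner needs to build a new plan online
    Returns (action, replanning time limit or None).
    """
    if time_to_ttf > planning_time:
        # No TTF or "very slow" TTF: replan online, bounded by the TTF time.
        return "replan", time_to_ttf
    if time_to_ttf > download_time:
        # "Slow" TTF: enough time to download, not necessarily to plan.
        return "download_dispatcher_plan", None
    # "Fast" TTF: only a plan already cached in the RTS is quick enough.
    return "execute_rts_cache_plan", None


print(choose_response(0.5, download_time=2.0, planning_time=30.0))
print(choose_response(10.0, download_time=2.0, planning_time=30.0))
print(choose_response(math.inf, download_time=2.0, planning_time=30.0))
```

A deadend-only state set has no TTF, so it is passed in as math.inf and falls into the replanning case with no effective time limit.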
3.3.1 Accurately Representing Time in the Planner’s World Model
In the original version of CIRCA, the planner modeled states without explicit time stamps
in each, capturing the world’s changes over time using feature changes specified by
nondeterministic event transitions, temporal transitions with a known minimum delay, and
actions. I have modified CIRCA to include a model of probability, so that nondeterminism
in the original model has been replaced by likelihood estimates. However, in keeping with
the original CIRCA philosophy, the current model with probability estimates also does not
explicitly model time in states, allowing cycles to minimize state-space size. In both
versions, actions are given explicit timing requirements to guarantee safety based on the
planner’s model of the world.
To illustrate a key deficiency in the current CIRCA modeling process, consider the simple
task of flying an aircraft in a holding pattern, as illustrated in Figure 3-9. Assume we pick
up the planner’s state expansion process when the aircraft is at Location 1, with exactly 1/2
tank of fuel. Note that in this example, I ignore other model features for simplicity.
Assume the CIRCA knowledge base contains only actions to travel between holding pattern
locations, a temporal transition for fuel usage, and temporal transitions to failure,
corresponding with the simplified consequences of the aircraft failing to actively control its
trajectory, possibly resulting in a crash (failure). CIRCA would build the state diagram
shown in Figure 3-10. Define min-∆ as in [22]: the minimum delay before which a
temporal transition can occur, corresponding with the maximum action response time
allowable for preempting the transition. Note that in our current probabilistic model, min-∆
is the time at which a temporal transition’s probability exceeds some small value ε. As
shown in Figure 3-10, the “fly-to-fix-x” actions must be guaranteed to preempt the TTF
from each state (labeled ttf1-ttf4 in the figure). The min-∆ of each TTF is smaller than the
min-∆ of the fuel usage transition, corresponding to the realistic model in which the
minimum time before the aircraft may crash from Location x given no action is smaller than
the time required for the aircraft to deplete its fuel from 1/2 tank to 1/4 tank. As a side
effect of preempting the TTFs, these actions will also preempt the fuel usage transitions
with larger min-∆ (as shown in Figure 3-10, where bold lines depict guaranteed actions and
thin lines represent preempted temporal transitions).
[Figure 3-9 diagram: holding pattern circuit through LOCATION 1, LOCATION 2, LOCATION 3, and LOCATION 4.]
Figure 3-9. Illustration of Aircraft Holding Pattern.
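[Figure 3-10 diagram: four states (FUEL 1/2 at LOCATION 1, LOCATION 2, LOCATION 3, and LOCATION 4) connected in a cycle by the actions fly-to-2, fly-to-3, fly-to-4, and fly-to-1. Each state has a temporal transition to FAILURE (ttf-1 through ttf-4) and a fuel-use temporal transition to the state FUEL 1/4, LOCATION x.]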
Figure 3-10. State Diagram for Aircraft Holding Pattern -- Current CIRCA model.
This side effect introduces a crucial inaccuracy into the model when a state cycle is present
along with preempted temporal transitions: the possibility of completely ignoring a
temporal transition that may eventually happen. In the Figure 3-10 example, the TTFs will
continue to be properly reset and preempted as the aircraft successfully flies around the
pattern. However, fuel continues to be used, and CIRCA has no concept of this fact since
it contains no model of how many cycles will be completed (see footnote 6). The end result is that CIRCA
never believes the fuel quantity decreases the entire time the plane is in the holding pattern,
so CIRCA would believe the system could be safe indefinitely, when it really will
eventually run out of fuel.
Perhaps the easiest fix for this particular example is to model the fuel quantity with many
more discrete values. With sufficient discretization, the fuel-use transition would have a
min-∆ less than all TTFs, avoiding preemption and the problem illustrated in Figure 3-10.
However, it may take, say, 6 minutes (four 1 1/2 minute legs) to completely fly around a
holding pattern once. On a Boeing 747, it might take 3 hours for the fuel to decrease by
1/4 tank, so it would require more than 30 discrete values of fuel per 1/4 tank (or >120
values overall) just to achieve a min-∆ for the “fuel-use” transition slightly less than that of
the TTF that must be preempted. And, if there were an even quicker TTF elsewhere in
state-space, the discretization of fuel (as well as all other slowly-changing quantities) must
increase even further.
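The discretization estimate above follows from dividing the fuel burn time by the delay that the fuel-use transition must beat. The sketch below reproduces that arithmetic; the function name is hypothetical, and the 6-minute target delay is the holding-pattern circuit time used in the text.

```python
import math


def fuel_levels_needed(minutes_per_quarter_tank, target_delay_minutes):
    """Break-even number of discrete fuel values (per 1/4 tank, overall).

    With this many values, one fuel step takes exactly target_delay_minutes,
    so strictly more values are needed to make the fuel-use min-delta smaller.
    """
    per_quarter = math.ceil(minutes_per_quarter_tank / target_delay_minutes)
    return per_quarter, 4 * per_quarter


# 3 hours per 1/4 tank against a 6-minute delay, as in the text.
print(fuel_levels_needed(3 * 60, 6))
# A quicker TTF (a hypothetical 1.5-minute delay) raises the requirement fourfold.
print(fuel_levels_needed(3 * 60, 1.5))
```

The second call illustrates the closing remark of the paragraph: the required discretization scales inversely with the fastest TTF anywhere in state-space.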
So, increasing the level of feature discretization isn’t an optimal solution because it may
expand the state-space size quite a bit, requires the user to be careful that situations such as
that in Figure 3-10 cannot occur, and requires the user to specify many more transitions in
the knowledge base (at least one per new discrete feature value). I propose that adding a
time stamp to each state is a better way to solve this problem. Markov Decision Processes
(MDP) [6] employ a model which attaches a time stamp to each state. In this manner, there
are never cycles in state-space, since any one value of time can occur only once. Figure 3-
11 shows how the aircraft holding pattern problem would map to an MDP model. The
state-space would be quite large, as depicted by the “...”, because instead of allowing a
cycle, the MDP must continue to expand all the states necessary until reaching the goal.
For this example, the MDP would create new states that would look like exact copies of
these states (except for the time stamps), complete with preemption, until the time stamp
was sufficiently large that the fuel-use transition is no longer preempted by the proposed
action. Assuming the holding pattern continued until the fuel quantity actually became 1/4
tank, given the above numbers (e.g., 6 minute holding pattern cycle time; 3 hours per 1/4
tank of fuel), the MDP model would expand 30 copies of the complete holding pattern (or
120 states), then the planner would notice that the fuel-use transition was no longer
preempted and would react accordingly.
6 Reference [22] addresses the problem of a persistent temporal transition, but the author only considers the case where there is a clear path along which CIRCA can backtrack. Due to the existence of a cycle, such linear backtracking is impossible, so his approach does not solve the problem illustrated by this example.
[Figure 3-11 diagram: time-stamped FUEL 1/2 states T = 10 (LOCATION 1), T = 11.5 (LOCATION 2), T = 13.0 (LOCATION 3), T = 14.5 (LOCATION 4), T = 16 (LOCATION 1), ..., linked by fly-to-2, fly-to-3, fly-to-4, and fly-to-1 without forming a cycle. Each state has a ttf transition to FAILURE and a fuel-use transition to the state T = y, FUEL 1/4, LOCATION x.]
Figure 3-11. State Diagram for Aircraft Holding Pattern -- MDP model.
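The size of the unrolled state-space can be checked by generating the time-stamped states directly. This sketch assumes the numbers used in the example (start at T = 10, 1.5-minute legs, 180 minutes until the fuel level changes); the function name unroll_holding_pattern is hypothetical.

```python
def unroll_holding_pattern(start_t=10.0, leg_minutes=1.5,
                           minutes_to_fuel_change=180.0, n_locations=4):
    """List the (time stamp, location) states expanded before fuel changes."""
    states = []
    t, location = start_t, 1
    while t < start_t + minutes_to_fuel_change:
        states.append((t, location))
        t += leg_minutes
        location = location % n_locations + 1   # 1 -> 2 -> 3 -> 4 -> 1
    return states


states = unroll_holding_pattern()
print(len(states), states[:5], states[-1])
```

The list contains 120 states (30 circuits of 4 locations). The next time stamp after the last entry is T = 190 at LOCATION 1, the point where the fuel-use transition is no longer preempted.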
The current CIRCA model is intractable in the worst case (i.e., exponential search
required). However, the MDP model is even more intractable, if such a comparison can be
made, because of CIRCA’s current ability to model many sequences of world changes with
cycles. Presuming the planner was never time-limited, I would propose using an MDP-
based model in CIRCA to achieve more accurate plans. Unfortunately, CIRCA’s planner
often must operate reactively, and, as discussed in Section 3.3.2 below, I am working to
impose planning time limits when necessary. To achieve a compromise between the
precise MDP representation and the inadequate model currently in CIRCA, I propose a
model depicted by the example in Figure 3-12. In this proposed CIRCA model, each state
contains a representation of time (T) as in the MDP model, but, instead of creating a new
state for each time step, the planner minimizes the state set by expressing the state time
stamp as a range of times. When the planner detects that all features in a potential new state
match a previously expanded state (except time), the planner decides whether a separate
new state is necessary based on outgoing transition probabilities (as described in the Table
3-2 algorithm). The planner builds up a range of times for which each state of a cycle will
be valid, and branches out of the old state set whenever the relative probability between
outgoing transitions changes significantly. In the Figure 3-12 example, this branch occurs
at the state where T=190 (LOCATION 1), which is the critical time where the fuel-use
temporal transition is no longer preempted by the fly-to-x action.
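[Figure 3-12 diagram: the four FUEL 1/2 states carry time ranges: LOCATION 1 with T = 10,16,...,184; LOCATION 2 with T = 11.5,...,185.5; LOCATION 3 with T = 13,...,187; LOCATION 4 with T = 14.5,...,188.5. They are linked in a cycle by fly-to-2, fly-to-3, fly-to-4, and fly-to-1 (t < 188.5), each with a ttf transition to FAILURE and a fuel-use transition to the state T = y, FUEL 1/4, LOCATION x. The action fly-to-1 (t >= 188.5) branches to a new state T = 190,... (FUEL 1/2, LOCATION 1), with ttf-1 and a fuel-use transition to the state T = 191.5,... (FUEL 1/4, LOCATION 1), which in turn has ttf-1 and a fuel-use transition to the state T = y, FUEL Empty, LOCATION x.]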
Figure 3-12. State Diagram for Aircraft Holding Pattern -- Proposed CIRCA model.
The algorithm in Table 3-2 shows how, at a specific time t, the planner can decide whether
to branch to a new state or return in a cycle to an existing state. This approach clearly saves
time over the MDP approach when many instances of Case 2 exist, since in this situation,
state k is expanded only once after multiple times t have been incorporated into the range
Tk. However, this algorithm does not necessarily save time over the MDP approach,
particularly when many instances of Case 3 exist, where the system must propagate the
new time range through the descendants of the already-expanded state, effectively
performing a computation for each time step as is done in MDP models. Fortunately, even
with Case 3, this algorithm may still save time over the MDP model if a descendant
requires no branch to a new state, because the action selection process and associated
timing have previously been computed.
As shown in Table 3-2, the planner’s decision of whether to branch or not in Case 3 is
based on whether a state’s outgoing probabilities change “significantly”, a rather
ambiguous term. Consider the extreme cases. If CIRCA classifies any minute change in
probability as “significant”, all Case 3 instances will most likely cause branching, and the
state-space will begin to resemble MDP (except for savings possible from Case 2).
Conversely, if CIRCA classifies all probability changes as “insignificant”, one achieves a
model similar to that in the current CIRCA, in which all states with matching features
(excluding T) are considered identical. By varying the probability variation tolerance
perr (i.e., my definition of “significant”), CIRCA can move along the spectrum from
the existing “fast but inaccurate” model to the MDP “slow but accurate” model.
Table 3-2. Overview of Algorithm to handle State Time Stamp Ranges.
Let min(Ti) = the minimum of the time range T for state i;
    max(Ti) = the maximum of the time range T for state i;
    t = the current time associated with the potential new state j.

Case 1: Features of a new state j do not match those of any existing (old) state:
   1) Create new state j with current features.
   2) Set min(Tj) = max(Tj) = t.
   3) Place state j on the stack to be expanded.

Case 2: Features of a new state j match those of old state k (except T, of course), and state k has not yet been expanded (so no descendants exist yet):
   Note: Since CIRCA can later choose an action that considers the entire time range T in state k, no new state j need be created.
   1) Set min(Tk) = minimum(min(Tk), t). Note that CIRCA cannot necessarily be assured of expanding states in strictly chronological order, since it will perform best-first search based on a notion of utility discussed further in Section 3.3.2.
   2) Set max(Tk) = maximum(max(Tk), t).
   3) Leave state k on the stack to be expanded.

Case 3: Features of a new state j match those of old state k (except T), and state k has already been expanded (so descendants may exist):
   1) If no temporal transitions match state k (so either no descendants or only one action possible), set min(Tk) and max(Tk) as prescribed in Case 2, since there will be no change in probabilities.
   2) Otherwise, consider any action (and any existing deadline) previously selected for state k. Compute all outgoing transition probabilities with respect to the new time t using this old action.
   3) If no new probabilities change “significantly” from the old values:
      a) Augment the range T as specified in Case 2 above.
      b) Put k’s descendants with modified time ranges back on the state stack for subsequent consideration (as if the state k were a new state), since this new time could affect descendant action choices/probabilities. Note: this does not mean new states will be created for the descendants; it just means the planner will need to modify the range T and check downstream probabilities.
   4) If new probabilities do “significantly” change relative to each other:
      a) Create a branch (as is done at t=190 in Figure 3-12), so that state j becomes a new state.
      b) Set min(Tj) = max(Tj) = t.
      c) Place state j on the stack to be expanded.
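The three cases of Table 3-2 can be sketched in code as follows. This is a hedged illustration rather than the planner's implementation: the State class, the visit and expand functions, and the outgoing_probs callback are hypothetical, and "significant" change is interpreted here as any single transition probability moving by more than the tolerance p_err, which is only one possible reading of the table.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    features: tuple
    t_min: float
    t_max: float
    expanded: bool = False
    probs: dict = field(default_factory=dict)   # outgoing probabilities at expansion


def widen(state, t):
    state.t_min = min(state.t_min, t)
    state.t_max = max(state.t_max, t)


def changed_significantly(old, new, p_err):
    keys = set(old) | set(new)
    return any(abs(old.get(k, 0.0) - new.get(k, 0.0)) > p_err for k in keys)


def visit(states, stack, features, t, outgoing_probs, p_err):
    """Apply Table 3-2 to a potential new state; return (state, case label)."""
    for k in states.get(features, []):
        if not k.expanded:
            widen(k, t)                          # Case 2: k is already on the stack
            return k, "case 2"
        new_probs = outgoing_probs(features, t)
        if not changed_significantly(k.probs, new_probs, p_err):
            widen(k, t)                          # Case 3, steps 1 and 3a
            return k, "case 3: merged"           # caller re-queues descendants (3b)
    label = "case 3: branched" if features in states else "case 1"
    j = State(features, t, t)                    # Case 1 or Case 3, step 4
    states.setdefault(features, []).append(j)
    stack.append(j)
    return j, label


def expand(state, outgoing_probs):
    state.expanded = True
    state.probs = outgoing_probs(state.features, state.t_min)


def demo_probs(features, t):
    # Hypothetical model: fuel-use is preempted until t = 190.
    return {"fuel-use": 0.0 if t < 190 else 1.0}


states, stack = {}, []
loc1 = ("LOCATION 1", "FUEL 1/2")
visit(states, stack, loc1, 10.0, demo_probs, p_err=0.1)
expand(stack.pop(), demo_probs)
for t in (16.0, 22.0, 184.0, 190.0):
    print(t, visit(states, stack, loc1, t, demo_probs, p_err=0.1)[1])
```

In the demonstration, the visits at T = 16, 22, and 184 are merged into the existing LOCATION 1 state, while the visit at T = 190 creates a branch, mirroring Figure 3-12. Raising p_err toward 1 suppresses the branch, and lowering it toward 0 moves the model toward the MDP-style unrolling.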
I will work to improve the algorithm in Table 3-2 during future dissertation work. I plan to
concentrate on improving planning efficiency when Case 3 is present, since Cases 1 and 2
have very simple handling procedures already. By considering methods for detecting state
cycles and precomputing time range bounds for each state, I hope to make the planner
better minimize its computations when propagating a new time range (thus potentially new
set of outgoing probabilities) through descendant states.
3.3.2 Limiting Planner Deliberation Time
CIRCA will be performing much of its planning offline, as discussed in Section 3.3.3 below.
However, since CIRCA cannot be expected to compute responses to all states offline,
CIRCA must occasionally plan dynamically (online). I shall assume CIRCA’s planner will
require no deliberation time limiting unless it is operating online, in which case the planner
will know its initial state from RTS state feedback, and it will be constrained by how long
the system will remain safe given the currently executing RTS plan.
As I stated in the introduction, numerous researchers are working on the problem of
limiting planner deliberation time. Because it is such a complex problem, I plan to make
several severe assumptions throughout my dissertation work. Briefly, my assumptions
include the following:
-- Computation of deliberation time and setting of planning parameters takes insignificant time compared to the total time available for the planning process.
-- The scheduling process has known average execution time, given an average number of tasks submitted to it (see footnote 7). So, we can subtract this time and include this cost in the initial value of deliberation time propagated through the planning process.
-- If the planner’s first-cut plan is not schedulable, time required for planner-scheduler negotiations to produce a schedulable plan is insignificant.
With these assumptions, the deliberation time computed only applies to a single planning
cycle, and can be directly used to set planner parameters and guide the planner’s best-first
search. Of course, these assumptions need to be addressed in detail before CIRCA can be
relied upon to always produce plans in a timely fashion, so I discuss future work on each
in Section 3.4.
I propose an approach to limiting planner deliberation time that combines elements from
design-to-time [9] and anytime [7] algorithms. As shown in Figure 3-13, upon receipt of a
state for which plans must be developed online, the planner first computes available
deliberation time. This quantity is used in a design-to-time fashion to set up CIRCA
planning parameters. Finally, the planner executes using a best-first search until
deliberation time (tdelib) expires. Each of these procedures is described in more detail
below.
7 I use the average rather than the worst case because the scheduling problem is NP-complete.
[Figure 3-13 diagram: initial state(s) -> "Compute available deliberation time (tdelib)" -> tdelib -> "Design-to-time: Set planner parameters (p(tdelib))" -> tdelib, p -> "Anytime: Plan using best-first search until t = tdelib" -> planned actions.]
Figure 3-13. Proposed Algorithm for Limiting Planner Deliberation Time.
Computing Available Deliberation Time
I plan to use the planner’s initial state (fed back from the RTS) to quickly compute an initial
estimate of a planning time limit, then potentially modify this estimate based on
environment changes during planning. Since CIRCA’s main goal is always maintaining
system safety, the limiting factor for deliberation time is how long the system will be
guaranteed to remain safe executing the current RTS plan.
To compute available deliberation time (tdelib), I propose that CIRCA perform an
approximate lookahead projection, using the fedback initial state and the currently executing
RTS plan to specify action transition choices and timings. The nearest TTF (with
probability above ε) will correspond with the deliberation time limit. This is a very
approximate computation, but will serve the basic purpose of obtaining an approximate
tdelib value for my research.
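The lookahead projection described above might be sketched as a time-ordered search over the transitions allowed by the executing plan. The Python below is only an illustration under stated assumptions: the transition table, the state names, the `FAILURE` label, and the function `estimate_tdelib` are hypothetical and do not come from CIRCA, and each transition is assumed to carry a minimum delay and a probability.

```python
import heapq

FAILURE = "failure"  # hypothetical label for any failure state

# Hypothetical transition table: state -> [(next_state, min_delay, probability)]
EXAMPLE_TRANSITIONS = {
    "cruise": [("engine_out", 5.0, 0.1), ("icing", 2.0, 0.001)],
    "engine_out": [(FAILURE, 30.0, 0.9)],
    "icing": [(FAILURE, 3.0, 1.0)],
}


def estimate_tdelib(initial_state, transitions, epsilon):
    """Return the earliest time at which FAILURE is reachable along a path
    whose probability exceeds epsilon, or None if no such path exists."""
    # Frontier entries: (arrival time, -path probability, state).
    frontier = [(0.0, -1.0, initial_state)]
    best_prob = {}  # highest path probability already expanded per state
    while frontier:
        t, neg_p, state = heapq.heappop(frontier)
        p = -neg_p
        if state == FAILURE:
            return t  # entries pop in time order, so this is the nearest TTF
        if p <= best_prob.get(state, 0.0):
            continue  # an earlier, at least as probable visit already covers this one
        best_prob[state] = p
        for nxt, delay, prob in transitions.get(state, []):
            q = p * prob
            if q > epsilon:  # ignore paths at or below the probability threshold
                heapq.heappush(frontier, (t + delay, -q, nxt))
    return None
```

With the example table, a threshold of 0.01 ignores the improbable icing path and yields a limit of 35.0 time units, while a threshold of 0.0001 admits it and yields 5.0.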
Setting Planner Parameters as a Function of Available Deliberation Time
In Section 3.3.1, I have described a procedure by which CIRCA can vary its model from
the current approximate but simple state-space to the more accurate but resource-intensive
MDP-based model. A single numeric parameter (called perr in Section 3.3.1) may be
roughly used to control CIRCA’s state-space size, so I propose modifying this parameter
before planning begins based on available deliberation time (tdelib).
To compute the value of perr , CIRCA’s planner should have at its disposal an average
state branching factor based on the available set of temporal transitions, and at least an
approximate function relating perr to the available deliberation time (perr = f(tdelib) ). I
have not yet constructed this function, and doubt that an exact function of this nature exists.
However, an approximate function f(tdelib) may be sufficient because this design-to-time
approach will be used in conjunction with the anytime approach applied during planning.
Again, I will work to improve this function, but do not propose to perfect it during my
dissertation work.
Anytime Planning using Best-First Search
In the original version of CIRCA, search proceeded depth-first, so there was no guarantee
that the resulting goal path was any more desirable than other possible goal paths.
Reference [2] discusses the basic conversion to best-first search. However, in that work,
“best” is based solely on state probability estimates, with state expansion occurring in
decreasing order of state probability.
The planner may combine its knowledge about probabilities, temporal delays, and
proximity to failure to achieve a better mechanism for controlling the best-first search.
State expansion may be ordered by decreasing utility u(s), as shown in Equation (3-1),
where p(s) = probability of reaching state s, t(s) = minimum time before the system can
reach state s, pf(s,n) = probability of reaching failure in n (or fewer) time steps from state
s. The constants a, b, c, and n (if constant) are as yet undetermined. By expanding states
in this order, I hope to plan for the most “important” states, achieving a balance between
state probability, system safety (i.e., prioritizing expansion to handle states that can reach
failure), and the time horizon considered by the planner (i.e., near-term states are handled;
far-term states will be handled by subsequent plans).
u(s) = a * p(s) + b / t(s) + c * pf(s, n)        (3-1)
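To make the ordering concrete, the following Python sketch expands states in decreasing u(s) until the deliberation time expires. It is a minimal illustration, not CIRCA's planner: the constants a, b, and c default to 1.0 purely as placeholders (the proposal leaves them undetermined), and the state tuples, the `successors` callback, and the example graph are hypothetical.

```python
import heapq
import itertools
import time


def utility(p, t, pf, a=1.0, b=1.0, c=1.0):
    """Equation (3-1): u(s) = a*p(s) + b/t(s) + c*pf(s, n).

    a, b, c are placeholder values; t must be positive.
    """
    return a * p + b / t + c * pf


def anytime_best_first(initial, successors, tdelib, clock=time.monotonic):
    """Expand states in decreasing utility until tdelib seconds have elapsed.

    initial: list of (name, p, t, pf) tuples
    successors: function mapping a state name to a list of such tuples
    Returns the names of the expanded states in expansion order.
    """
    deadline = clock() + tdelib
    counter = itertools.count()  # tie-breaker so the heap never compares names
    frontier = []
    for name, p, t, pf in initial:
        heapq.heappush(frontier, (-utility(p, t, pf), next(counter), name))
    expanded, seen = [], set()
    while frontier and clock() < deadline:
        _, _, name = heapq.heappop(frontier)
        if name in seen:
            continue
        seen.add(name)
        expanded.append(name)
        for child, p, t, pf in successors(name):
            if child not in seen:
                heapq.heappush(frontier, (-utility(p, t, pf), next(counter), child))
    return expanded


# Hypothetical example: C is improbable but close to failure, so it outranks B.
EXAMPLE_GRAPH = {
    "A": [("B", 0.9, 10.0, 0.0), ("C", 0.1, 2.0, 0.5)],
    "B": [],
    "C": [("D", 0.05, 4.0, 0.9)],
    "D": [],
}
EXAMPLE_INITIAL = [("A", 1.0, 1.0, 0.0)]
```

With a generous time limit the example expands A, C, D, B in that order; whatever has been expanded when the limit expires is the anytime result.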
3.3.3 Achieving Timely Reactions via Plan Caching
Ideally, CIRCA could build all required actions for maintaining system safety indefinitely
into a single scheduled control plan. However, as discussed earlier, this is infeasible for
domains with a large set of actions to schedule. Currently, CIRCA builds its control plans
to handle “expected” states, with online planning required both to achieve future subgoals
and to react when “unhandled” states are reached. By building and storing a set of plans in
advance, CIRCA has a better chance of responding quickly to environmental events, thus
improving overall performance in complex domains. In this section, I address specific
questions associated with having plan storage areas in addition to available online planning.
First, I discuss a method by which CIRCA can decide which plans to build offline vs.
online (or reactively). Next, I discuss how CIRCA will split plan storage between the RTS
and Dispatching Subsystem caches. Finally, I discuss issues associated with reorganizing
or rebuilding the plan cache when a world state prompts online reactive planning.
Planning Offline vs. Online?
Ideally, a planner could build and store all necessary plans to achieve its goals offline, so
that it could deliberate as long as necessary to ensure development of all required reactive
plans. I argue that, in a complex domain, CIRCA cannot hope to schedule a complete set
of reactions in a single scheduled plan executing with limited resources. To address this
problem, I propose that CIRCA plans be built extensively offline, caching scheduled plans
for execution should the appropriate situations arise.
CIRCA builds plans for a set of sequential subgoals (determined by the user now;
proposed in Section 2.3 to be created automatically in the future). In some domains, these
subgoals may be structured so that the system will indefinitely remain safe while the
planner builds its next subgoal plan [20]. However, in dynamic domains such as aircraft
flight, CIRCA would be limited to one subgoal for the entire flight if CIRCA required
indefinite safety within each subgoal plan. CIRCA may not be able to schedule such a
comprehensive plan, in which case multiple plans without indefinite safety within each
must be created. Since CIRCA will not have an indefinite amount of time to plan online, I
propose that basic plans for the entire sequence of schedulable subgoals be developed
offline and stored in the Dispatcher plan cache. In this manner, if all goes as expected, the
planner will do all its work before plan execution begins, so no approximate planning will
be necessary.
Unfortunately, if the domain is sufficiently complex, CIRCA also cannot presume to build
and store contingency plans even offline for all modelable states (as illustrated in Figure 3-
4). I propose that CIRCA build its offline plans based on the “reachable” state concept it
uses now, and then use the world state classification discussed above in Section 3.2.2 to
identify states for which contingency plans should exist, versus classes of states for which
it will be acceptable to reactively plan online.
Recall that, in Figure 3-4, I divided the modelable world states into “handled”, “deadend”,
“removed”, “imminent-failure”, and all others (which are considered neither reachable nor
close to failure). I imposed mechanisms to detect the deadend, removed, and imminent-
failure state classes, and replan should one be reached. I propose to build a set of
contingency plans offline to handle all states that are likely or that lead quickly to failure (i.e.,
a subset of the removed and imminent-failure states), since these are the states for which a
guaranteed response may be required. Conversely, the deadend states have planned
reactions that keep them from quickly leading to failure. For these states, I propose to
allow the planner to reactively (online) build a new plan, using the algorithms discussed in
Section 3.3.2 to limit deliberation time as appropriate for each particular deadend state
reached.
So far, I have discussed building normal plans offline for all predetermined subgoals, as
well as contingency plans for all removed and imminent-failure states. It may appear there
will never be much online planning, but this is not necessarily the case. The “goal” in
CIRCA’s contingency plans will be primarily to postpone failure when a removed or
imminent-failure state is reached, so deadend states requiring replanning may frequently
result. When any “unhandled” state results in the current subgoal becoming unattainable
using existing cached plans, a substantial amount of online replanning may be required
either using a different set of subgoals or at least to specifically guide the system back to the
original subgoal path. In these cases, the planning deliberation time bounding procedures
become quite crucial, and we also have to reorganize the Dispatcher cache.
Plan storage in Dispatching Subsystem vs. RTS
Two key features will distinguish the RTS cache from the Dispatcher cache: available
storage space and plan access time. The RTS cache will be relatively small, so that a
minimum number of plans are downloaded for each subgoal and so that the RTS can
switch to a plan within its cache within a guaranteed time bound. Conversely, the
Dispatcher cache will be able to store a much larger set of plans, but these plans will need
to be downloaded to the RTS just before execution, so more time will be required for the
RTS to switch to a Dispatcher plan.
Above, I proposed that CIRCA plan offline for all subgoals, and that for each of these
subgoals, a “startup” plan will be created to handle normal situations, and contingency
plans will be created to handle some subset of the removed and imminent-failure state sets.
Figure 3-14 shows the proposed storage scheme for all cached plans. For each subgoal, a
“startup” plan plus all contingencies will be cached. As shown in Figure 3-14, for the
current subgoal (Subgoal 1 in the figure), the “startup” plan begins execution on the RTS,
and the RTS cache contains plans that specifically handle “unplanned-for” states that lead
quickly to failure.8 All other contingency plans for that subgoal (i.e., plans for states that will lead to
failure, but not so quickly) as well as plans for future subgoals are stored in the Dispatcher.
[Diagram: for each of Subgoal 1, Subgoal 2, ..., Subgoal n, the cached plans are grouped by the time to failure (ttf) of the states they handle: “Fast” ttf, “Slow” ttf, and no ttf or “Very Slow” ttf, each drawn relative to Failure. A legend distinguishes the Executing Plan, RTS Cache Plans, and Dispatcher Plans.]
Figure 3-14. CIRCA Plan Storage -- RTS ready to begin execution of Subgoal 1.
Because plan switching should occur seamlessly when a subgoal has been achieved, the
RTS plan cache will contain two partitions: one for the current set of critical contingency
plans, and another for the “startup” and contingency plans for the next subgoal in the
sequence. The new subgoal’s set of plans will be sent from Dispatcher to RTS as part of
the current RTS plan. Outdated plans (i.e., either old or unattainable subgoal plan sets)
stored on the RTS will simply be overwritten as the Dispatcher downloads new plan sets.
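The storage scheme for the situation in Figure 3-14 might be sketched as follows. This is a minimal illustration, assuming each cached plan records its subgoal and the TTF of the state it handles; the `CachedPlan` type, the `place_plans` function, the plan names, and the single `fast_ttf` threshold are hypothetical stand-ins for the user-specified TTF classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CachedPlan:
    subgoal: int
    name: str
    ttf: Optional[float]  # time to failure of the handled state; None for a "startup" plan


def place_plans(plans, current_subgoal, fast_ttf):
    """Split plans into (executing, rts_cache, dispatcher_cache)."""
    executing, rts_cache, dispatcher = [], [], []
    for plan in plans:
        if plan.subgoal != current_subgoal:
            dispatcher.append(plan)   # future subgoals wait in the Dispatcher
        elif plan.ttf is None:
            executing.append(plan)    # the "startup" plan runs on the RTS
        elif plan.ttf <= fast_ttf:
            rts_cache.append(plan)    # failure is near, so switching must be fast
        else:
            dispatcher.append(plan)   # slower failures can be downloaded when needed
    return executing, rts_cache, dispatcher


# Hypothetical plan set for two subgoals.
EXAMPLE_PLANS = [
    CachedPlan(1, "startup-1", None),
    CachedPlan(1, "gear-failure", 5.0),
    CachedPlan(1, "low-fuel", 600.0),
    CachedPlan(2, "startup-2", None),
    CachedPlan(2, "engine-fire", 3.0),
]
```

The sketch omits the second RTS partition for the next subgoal's plans; it only shows how a TTF threshold separates the plans that need a guaranteed switch time from those that can tolerate a download delay.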
Rebuilding the Cache when “Unplanned-for” states occur
Figure 3-14 shows how the CIRCA plan caches are organized during “normal” plan
execution. However, the planner may need to be invoked online in situations where the
current subgoal set is no longer achievable (e.g., deadend states). In these cases, the
planner will be responsible for selecting a new set of subgoals if necessary, then building
online a new set of plans to be executed. As new plans are built and scheduled online, they
will be downloaded to the Dispatcher along with indexing information regarding the new
sequence of subgoals to be achieved. If time limits prevent the planner from building
comprehensive sets of startup and contingency plans prior to beginning the execution of
these new plans on the RTS, the planner must notify the Dispatcher, which will then send
at least the next “startup” plan to the RTS for execution.
The goal of the planner will be to stay ahead of the system’s progression through subgoals,
so that it will be able to eventually rebuild both startup and contingency plans for all new
8 For my thesis research, I propose allowing user-specified constant values to classify TTFs as “quick”, “slow”, or “very slow”, although CIRCA should eventually calculate these automatically. As a start, I propose classifying only removed and imminent-failure states by TTF delay, and assume all deadend states will not require a contingency plan.
subgoals. To best achieve this objective, when creating the new subgoal set based on
unplanned-for state feedback, the planner will have a significant bias toward directing the
system back to its original goal path. If the planner is successful in this endeavor, online
replanning will be held to a minimum, since once the system is back on the original subgoal
path, the cached plan set will be able to continue execution as normal. Details of an
algorithm to implement this subgoaling/planning heuristic have not yet been developed, but
such an algorithm will exist before my thesis work is complete.
3.4 Future Work
In Section 3.3.2, I discussed limiting planner deliberation time in the context of several
major assumptions associated with limiting CIRCA process execution time. Because
achieving time-limited planning alone is a complex problem, I feel these assumptions are
necessary, but each must be addressed before CIRCA can be used to control complex real-
world systems. In the following sections, I discuss potential methods for relaxing each
assumption. First, I address my assumption that the meta-level computation of deliberation
time and planning parameters is insignificant. Next, I address the problem of predicting
scheduler execution time, which I will assume to be constant or linear in the number of
TAPs to be scheduled. Finally, I address the problem of allotting time to scheduler-planner
negotiations when the scheduler fails with the first set of planned TAPs.
3.4.1 Timely Computation of Planner Parameters and Time Constraints
The CIRCA planning subsystem will contain a meta-level module to compute the available
planning deliberation time and parameters based on the fed-back world state for which a
plan is being developed, as discussed in Section 3.3.2. The algorithm will involve a
lookahead search to identify the closest path to failure possible given the currently
executing plan. Unfortunately, the lookahead search process itself may take a significant
amount of time.
Lookahead search is necessary for identifying the time from any state to failure because
CIRCA bases its knowledge of the world on state transitions. I believe the most efficient
way to find the nearest-term failure is a best-first search where “best” is based primarily on
the time horizon of each state. In fact, time horizon will be the sole criterion I will use for
lookahead during my thesis research. Using time horizon, the more involved the
lookahead search becomes, the longer the path (temporally speaking) will be from the initial
state. However, CIRCA will need to account for time associated with this lookahead
procedure, then truncate the search if it becomes too costly, where “costly” is not known a
priori (since CIRCA is still in the process of computing available deliberation time).
To address this problem, I would begin by identifying a utility function that trades off
additional lookahead search against the disutility of ending the search and beginning the planning
process without a specific deliberation time limit. This utility will most likely involve some
ratio r = (time already spent performing lookahead) / (current time horizon of lookahead).
To compute r, CIRCA would need to include at least an average time to expand one new
state, branching factor, and time step size between states. The ratio r may be used to
identify an approximate amount of lookahead desired in advance, then if the actual r is
different from that predicted using averages, lookahead can terminate earlier or later than
originally estimated.
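One way the ratio r could be predicted from averages and then checked during lookahead is sketched below in Python. The uniform-tree model (every state has the same number of successors, one time step apart), the threshold `r_max`, and all function names are assumptions made for illustration; the proposal does not specify them.

```python
def predicted_ratio(depth, expand_time, branching, step):
    """Predicted r at a lookahead depth, assuming a uniform search tree.

    r = (time spent on lookahead) / (time horizon of lookahead)
    """
    if branching == 1:
        states = depth + 1
    else:
        # Number of states in a complete tree of the given depth.
        states = (branching ** (depth + 1) - 1) / (branching - 1)
    return (expand_time * states) / (depth * step)


def planned_depth(expand_time, branching, step, r_max, max_depth=50):
    """Deepest lookahead whose predicted r stays at or below r_max (0 if none)."""
    best = 0
    for depth in range(1, max_depth + 1):
        if predicted_ratio(depth, expand_time, branching, step) <= r_max:
            best = depth
    return best


def keep_looking(elapsed, horizon, r_max):
    """Online check with the measured r; continue while it is within r_max."""
    return horizon <= 0 or elapsed / horizon <= r_max
```

For example, with 1 ms per expansion, a branching factor of 3, a 1 s time step, and a hypothetical `r_max` of 0.05, the predicted depth is 4; `keep_looking` then lets the actual search stop earlier or run longer than that estimate.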
Once the planning deliberation time is computed, CIRCA will compute the planning
parameters then monitor time passage during anytime planning. These processes will cause
no timeliness difficulties, since planning parameter computation will take constant time and
monitoring time during planner state expansion will be accounted for by the anytime
process used to truncate planning.
3.4.2 Reasoning about Scheduler Execution Time
Except in a few special cases, real-time scheduling problems are NP-complete. CIRCA
will not necessarily build a set of TAPs that falls into one of these special cases, so
CIRCA’s scheduling problem is NP-complete in general. My assumption of a constant or linear
scheduling time is therefore particularly optimistic. In the future, if the scheduler
remains NP-complete, the anytime approach proposed for planning may be extended to
include both the planner and scheduler, with appropriate tradeoffs used to assess the utility
of continued planning versus starting the scheduler with the existing action set.
Others [19], [21] have worked to optimize the CIRCA scheduler so that it is relatively fast
given TAP maximum periods and worst-case execution times. Heuristics include reducing
TAP maximum periods to shorten the required schedule length (based on the least common
multiple of all assigned TAP periods), and performing utilization and conflict checks prior
to scheduling so that failures may be identified early. Ongoing research efforts [19] are
beginning to allow relaxation of worst-case requirements for low-priority (or low-utility) TAPs
to help the scheduler succeed. This procedure to relax TAP execution time and period
requirements may allow the scheduler to execute within imposed time limits.
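The pre-scheduling checks mentioned above can be illustrated with a short Python sketch. It assumes a single processor and TAPs given as hypothetical (worst-case execution time, period) pairs with integer periods; it is not the CIRCA scheduler, and the function names are illustrative only.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of integer TAP periods (the schedule length)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def utilization(taps):
    """Processor utilization: sum of wcet / period over (wcet, period) pairs."""
    return sum(wcet / period for wcet, period in taps)


def precheck(taps):
    """Return (may_be_schedulable, utilization, hyperperiod).

    Utilization above 1.0 means no uniprocessor schedule can exist, so the
    failure is detected before any schedule search begins.
    """
    u = utilization(taps)
    return u <= 1.0, u, hyperperiod([period for _, period in taps])
```

For instance, TAPs with periods 10, 15, and 20 need a schedule of length 60. Reducing the second period from 15 to 10 shortens the schedule to 20 at the cost of higher utilization, which is the tradeoff behind the period-reduction heuristic.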
3.4.3 Achieving Efficient, Timely Planner-Scheduler Negotiations
In the original version of CIRCA, whenever the scheduler failed, it returned an
uninformative “fail” message, at which time the planner backtracked through its set of
guaranteed actions in hopes of finding actions that were easier to schedule. I improved
CIRCA’s ability to find a schedulable plan by allowing the removal of improbable states
(and any associated guaranteed actions), while others [19] have enhanced the scheduling
procedures and scheduler-planner feedback to allow the planner to better reason about how
a plan needs to change before attempting to schedule again. Both these additions have
improved the planner-scheduler negotiation process, both in terms of planning speed and in
terms of the final plans produced. However, these additions do not make negotiation time
negligible, contrary to the rather strict assumptions (Section 3.3.2) I made to limit planning
deliberation time.
Negotiations between planning and scheduling may involve both replanning and
rescheduling for each iteration. Achieving bounds on planner-scheduler negotiations
involves several issues, including: 1) bounding planning time, 2) bounding scheduling
time, and 3) bounding the number of planning and scheduling iterations required during a
negotiation. I will be addressing 1) during my dissertation work, as described in Section
3.3.2. I discuss how others are beginning to address issue 2) above (Section 3.4.2). To
address issue 3), CIRCA would have to be able to reason about the convergence properties
associated with the negotiation process (i.e., limits on how plan schedulability improves as
a function of number of planner-scheduler iterations). I have not yet carefully considered
how such properties may be established for CIRCA.
=====================================================
CHAPTER 4
INTERFACING PLANNING, REAL-TIME,
AND CONTROL SYSTEMS TECHNOLOGIES
=====================================================
In this section, I describe ways to improve CIRCA performance by incorporating
methods from the planning, real-time, and control systems research areas. Section 4.1 gives a
brief overview of the strengths of each field along with a generic model each system uses
for its world. Section 4.2 describes how I believe all three of these technologies may be
generically combined, and describes how the proposed CIRCA maps to this generic model.
Then, for the remainder of this section (Sections 4.3-4.5), I describe model component
interfaces in the context of CIRCA.
To address specific issues in the interfaces, I consider the inter-module links in
CIRCA in the context of pairwise combinations of planning, real-time algorithms, and
control systems. To date, CIRCA has focused primarily on the interface between planning
and real-time scheduling technologies. In Section 4.3, I describe both existing and
proposed methods to combine AI planning and plan execution under the constraints present
in systems requiring real-time response guarantees. Section 4.4 proposes a method for
combining planning and control technology. The planner must contain sufficient
knowledge describing how the controller functions, then must build a sufficiently accurate
world model to allow proper selection of actions to be executed. I present my ideas which,
if implemented properly, will allow the planner to efficiently build its world model from
symbolic feature values while maintaining the precision required to construct a valid plan.
The other pair to interface, control and real-time systems, has no section devoted to it in
this proposal because control engineers already incorporate real-time constraints when
implementing their systems. Typically, the controller and state estimator functions will
each have a constant predetermined execution period based on system dynamics and
controller convergence properties. Also, the worst-case execution time is relatively easy to
compute and will most likely be close to the average execution time because typical control
loops involve only limited branching (e.g., based on controller mode). Because of these
properties, a valid execution schedule for controllers and state estimators can be built
offline (even by hand). CIRCA models the controllers and state estimators as part of its
environment, so it effectively assumes the offline scheduling will allow the controllers and
state estimators to operate as described in its knowledge base and in the Abstraction
Subsystem interface to the environment.
For my thesis work, I will be attempting to develop a generic interfacing strategy that can
easily accommodate modifications to any of the basic planning, real-time scheduling, and
control algorithms placed into the system. I plan to address basic interfacing issues in the
context of a complex task: automated aircraft flight. However, due to time and
accessibility constraints, I will be limited to a fairly simple set of controllers and knowledge
base transitions. In Section 4.5, I describe research and testing that will be required to
fully verify that this interface is sufficient to handle the spectrum of available planners,
schedulers, and controllers.
4.1 Background
In this section, I provide a basic description of how typical planning, real-time, and control
systems view the world, as well as what each system can compute and what each assumes
is available or true in the world. By showing that each system has concentrated on
different but equally important aspects of the autonomous control problem, I hope to
convincingly illustrate exactly why one would want to combine the three technologies.
These system descriptions are used in later sections to show how the three systems can be
usefully combined.
4.1.1 AI Planning Systems
Figure 4-1 illustrates the components and interconnections for a typical AI planning system
[27]. System input is some sort of user-specified domain knowledge, which may be
represented in the form of rules or transitions, preferences, fitness functions, etc. The
planner then typically performs some sort of search process to develop one or more actions
it deems appropriate based on domain knowledge and possibly the current system state.
When the plan of one or more actions is complete, it is executed by the plan executor,
which then will cause the system to act in its environment, where “environment” is loosely
defined, ranging from internal computing tasks (e.g., database management) to directly
operating an actuator that causes physical movement in the environment. The “state”,
specified in the language of the planner, may be fed back to the executor and planner to
help decide which action or plan to build or execute next.
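[Block diagram: Domain Knowledge -> Planner -> (Plans) -> Plan Executor -> (Actions) -> Environment, with State fed back from the Environment to the Plan Executor and Planner.]
Figure 4-1. Traditional AI Planning System.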
The main strength of most AI planning systems is their ability to take high-level knowledge
that might be written by some domain expert, and efficiently search through the space of
possible actions and states to determine appropriate high-level reactions as a function of
state. One key to planning efficiency and accuracy is effective discretization of the
environmental properties, often using a symbolic state feature representation. To apply a
planning system to any domain, the system assumes that its relatively simple set of actions
will be sufficient for controlling the system, so that each “action” may require significant
processing (hidden in the “Environment” in Figure 4-1) before any actuator commands are
developed. Also, the state fed back to planner and executor is not a set of sensor values,
but instead processed sensor data that has been abstracted to the format used to represent
state in the planner and knowledge base.
4.1.2 Real-Time Systems
Real-time algorithms [13] focus their efforts on allocating computational resources to
provide guarantees regarding system performance. As shown in Figure 4-2, typical input
includes a set of tasks to be executed, along with a set of execution constraints. Tasks
correspond with sets of functions that will require system resources. Constraints include
minimum parameters to be achieved after scheduling that task, including features such as
task periods, deadlines, and backups (replications/versions) for reliability. For a system
with distributed resources (e.g., multiple CPUs, I/O channels, etc.), the real-time algorithm
develops an initial task allocation based on available resources and task constraints, then
attempts to schedule these tasks, negotiating between scheduling and allocation as needed.
Because not all task constraints may be possible to satisfy, feedback regarding which task
constraints were impossible may be made available, in case the task list developer wishes to
modify the input task list. Once the schedule has been developed, execution begins, and
online algorithms feed back any changes in available resources (e.g., a CPU fails) to the
allocation and scheduling modules, which may then recompute the task allocation and
schedule(s) if necessary.
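[Block diagram: a Task List w/ Constraints enters the Task Allocator, which passes Task Sets to the Scheduler; the resulting Task Schedule goes to the Execution Platform. Other labeled links: Negotiation, Status, and Available Resources.]
Figure 4-2. Traditional Real-Time System.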
Task allocation and scheduling are crucial to systems in which constraints must be
guaranteed (e.g., an airplane controller must operate at a certain frequency to ensure
stability). Many systems have been carefully designed by hand to meet real-time
constraints, but allocating/scheduling manually is very expensive and is virtually
impossible to do with real-time constraints. Thus, the strength of working in a real-time
paradigm is the ability to efficiently manage resources dynamically for any task set.
However, as shown in Figure 4-2, the real-time system assumes task specification in the
form of a specific set of constraints such as deadlines, etc., as well as assuming that
execution platform computational resources are predictable and easily measured.
4.1.3 Control Systems
Figure 4-3 shows the components and their interconnections for a traditional feedback
control system [14]. The input to the system includes a reference trajectory (r(t)) to be
tracked and sensor feedback (y(t)) from the plant (or environment). The output from the
system is a set of actuator commands (u(t)) which operate on the plant. To compensate for
imprecision in sensor measurements, a state estimator is invoked to compute the best
estimate of state (ŷ(t)) from measured state (y(t)), applied actuator forces (u(t)), past state
estimates, and a dynamical model of the system. The trajectory offset or error (e(t) = r(t) -
ŷ(t)) is then used by the controller to compute the next actuator value set.
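[Block diagram: r(t) (+) and the state estimate ŷ(t) (-) meet at a summing junction that produces e(t); the Controller outputs u(t) to the Plant; the plant output y(t) feeds the State Estimator, which returns ŷ(t).]
Figure 4-3. Traditional Control System.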
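The loop in Figure 4-3 can be illustrated with a small discrete-time simulation in Python. Everything specific here is an assumption chosen for brevity rather than taken from the proposal: the plant is a pure integrator, the controller is proportional, the estimator is a simple predict-and-correct filter, measurements are noise-free, and the gains and step size are arbitrary.

```python
def simulate(r, steps, dt=0.1, kp=2.0, gain=0.5, x0=0.0):
    """Track a constant reference r; returns the plant state after each step."""
    x = x0        # true plant state (integrator: dx/dt = u)
    y_hat = x0    # state estimate
    u = 0.0       # last actuator command
    history = []
    for _ in range(steps):
        y = x                                  # sensor measurement y(t)
        y_pred = y_hat + dt * u                # prediction from the dynamical model
        y_hat = y_pred + gain * (y - y_pred)   # correction using the measurement
        e = r - y_hat                          # trajectory error e(t)
        u = kp * e                             # proportional control law
        x = x + dt * u                         # plant responds to u(t)
        history.append(x)
    return history
```

With these values the tracking error shrinks by a factor of 0.8 per step, so `simulate(1.0, 50)` ends within 1e-3 of the reference.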
The control systems field is quite mature compared to both AI planning and real-time
systems. After the plant dynamics have been sufficiently described, well-defined
mathematical methods may often be employed to design a controller set (linear or nonlinear)
which will provide response guarantees in terms of stability and tracking. However, such
guarantees are possible only if a certain minimum set of sensors providing y(t) is available,
and if the reference trajectory r(t) is actually achievable (i.e., within the set of controllable
or at least stabilizable regions of the controller’s state-space). Input r(t) is usually
considered a continuous function, and is specified in the same form as the controller state,
including quantities that fully describe the system’s position and velocity in all dimensions
(translational and rotational).
4.2 Combined Planning, Real-time, and Control System
In this section, I address the question of how an autonomous system might interface the
planning, real-time, and control systems algorithms described above. I propose a generic
method of connecting these three systems to accomplish the overall task of autonomous
control, illustrating operation in the context of a piloted vehicle. The discussion in this
section occurs at a fairly high level, with more interface details described below in Sections
4.3 and 4.4.
In Section 4.1, I identified input quantities that are assumed to exist: domain knowledge
for the planner, task and constraint sets for the real-time system, and reference trajectories (r(t))
for the control system. In each case, the inputs must be “acceptable” before the system will
succeed (e.g., domain knowledge must be sufficiently complete and correct; task set must
be schedulable; r(t) must be accessible). I base my work on the assumption that the easiest
of these three types of inputs for a human user to provide is a comprehensive domain
knowledge base, and I further assume that a symbolic knowledge representation will assist
the user in this endeavor. Although providing comprehensive domain knowledge is
difficult, the more difficult alternative is to build by hand a comprehensive set of tasks,
negotiation functions, and constraint sets for the scheduler, and/or to construct by hand a
comprehensive set of acceptable reference trajectories for all possible objectives (or goals)
the system may be trying to achieve.9
9 Many systems currently exist where sufficient backups (e.g., human pilots) allow system input that may not be comprehensive. However, I am working to achieve safe, fully-automated aircraft flight, in which case any gaps in input completeness must be explicitly detected and handled by the system itself.
Consider a system in the form of Figure 4-4. Major components from planning, real-time,
and control systems are connected such that the only overall system inputs are the
knowledge to the planner and the sensor feedback from the plant (or environment). The
only system module missing from this diagram is the “allocation” module from real-time
systems. During my thesis research, I will be using uniprocessor scheduling algorithms to
simplify the overall problem, but task allocation may be straightforwardly added in future
research.
Working from left to right in Figure 4-4, the first module is the planner, which uses its
input knowledge and any feedback from the plan executor (explained below) to construct
one or more plans. These plans are scheduled then executed by the plan executor, which
corresponds with both the “plan executor” module in Figure 4-1 and the “execution
platform” module in Figure 4-2. The basic responsibility of the plan executor is to execute
its planned actions within real-time constraints. Action commands are somehow translated
to the continuous language of reference trajectories, then the controller computes actuator
commands for the plant (i.e., environment). Sensor feedback is sent to the state estimator,
which provides the current state both to the controller and plan executor. The plan executor
uses this feedback to determine the planned action to execute next and provides any state
feature feedback to the planner.
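[Block diagram: Planner -> Scheduler -> Plan Executor -> Controller(s) -> Plant (Environment), with the State Estimator feeding the current state back to the Controller(s) and Plan Executor.]
Figure 4-4. Combined AI Planning / Real-Time / Control System.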
The Figure 4-4 depiction is intended to show how basic planning, real-time, and control
strategies may be combined. Because I am working in the context of CIRCA, I will begin
by comparing the proposed architecture for CIRCA (recapped in Figure 4-5) with Figure 4-4.
Then, in Sections 4.3 - 4.5, I speak in the context of CIRCA modules. The Figure 4-4
planner and scheduler map directly to CIRCA’s Planning and Scheduling Subsystems.
The Figure 4-4 plan executor maps to a combination of the CIRCA Real-Time Subsystem
(RTS) and Abstraction Subsystem (ABS), where the RTS actually executes the plans, but
the ABS does all language conversion required to make the connection between
controller(s)/state estimator(s) and the plan executor. As shown in Figure 4-5, CIRCA
considers the controllers and state estimators to be part of its environment. This
representation was chosen to allow flexibility in controller and state estimator design for
each domain, and simply means that CIRCA researchers (such as myself) will concentrate
on the ABS / controller interface language, not the actual inner workings of controllers or
state estimators.
[Figure 4-5 block diagram. Modules shown: Knowledge Base (initial state / goals, temporal/action transitions), Planning Subsystem, Scheduling Subsystem, Dispatching Subsystem (plan message building, scheduled plan storage, plan downloading), Real-Time Subsystem, Abstraction Subsystem (Abstractor, De-Abstractor), and the "Environment" (Sensors, Actuators, State Estimators, Controllers). Other labels in the diagram: Planner, subgoal creation/storage, feedback handler, Schedule Manager, scheduling routines, TAP schedules, TAP list w/ timings, TAP plans, contingency plans, TAP plan executor, Environment Interface, plan handling directives, handshake, status-1, status-2, status-3, feature value data, action commands, controller & actuator commands, sensor & state data.]
Figure 4-5. CIRCA -- Proposed Version.
4.3 Interfacing Planning and Real-Time Systems
To date, CIRCA research has focused on the combination of an AI planner, real-time
scheduler, and plan execution module (RTS). Plans are built then scheduled such that
critical actions have associated real-time response guarantees. However, the planner
currently operates without time bounds, assuming the system will remain safe long enough
for any new plans to be built. As discussed in Section 3.3, I believe this is an
unreasonable assumption for complex domains, so I have proposed a method to impose
real-time constraints on the planning process (Section 3.3.2), assisted by the use of offline
planning and plan storage in the Dispatching Subsystem and RTS cache (Section 3.3.3). I
have classified the modelable states (Section 3.2.2) such that the most time-critical
responses are available either in the executing plan or a cached plan, leaving more (but not
indefinite) time for replanning should a less-time-critical “unhandled” state be reached.
Using the methods described in Section 3, I believe CIRCA will be much better prepared to
cope with the unruly combination of the intractable planning problem and a complex
domain requiring real-time response. Although issues in guaranteeing response times for
the planner deliberation time calculation process, scheduling, and planner-scheduler
negotiations [19] still need to be addressed (as discussed in Section 3.4), I believe the
combination of methods proposed in this document will provide the basic links between the
planning / plan execution processes and real-time scheduling / execution system.
4.4 Interfacing Planning and Control Systems
In this section, I describe how a planning and control system may be interfaced in the
context of CIRCA. For my thesis research, I will assume all controllers and state
estimators are well-understood and execute reliably. The basic interface between planner
and control system occurs in two places: 1) RTS action output is sent to the controllers via
the CIRCA Abstraction Subsystem (ABS), and 2) State estimator and discrete sensor
values are sent as feedback to the RTS and planner via the CIRCA ABS. Since the planner
is selecting actions that will issue commands to the controller, the planner must have a
model of controller behavior in its transitions as well as a sufficiently accurate world model
(i.e., expanded state set) to ensure that appropriate actions will be sent to the controller. In
this section, I first discuss the functionality required in the ABS to translate between control
system and planning languages. Next, I describe how the CIRCA planner “action scoring”
(or reward assignment) functions may help the planner select only actions that are feasible
given the computed state and modeled controller properties.
4.4.1 CIRCA Abstraction Subsystem
As shown in the CIRCA system diagram (Figure 4-5), the control system components
connect through the Abstraction Subsystem (ABS) to the remainder of CIRCA. As
discussed in Section 2.2, the ABS must execute with real-time guarantees; however, for
my thesis research, I will be assuming the static ABS tasks are pre-scheduled by the user.
The two main ABS tasks include maintaining a current abstract representation of state for
CIRCA’s RTS and planner, and translating RTS action commands to appropriate controller
or discrete actuator commands.
To maintain a current abstract state estimate, discrete sensor and state estimator values must
be converted by the ABS into feature-value pairs that can be used by the planner and RTS.
Currently, the ABS directly reads sensor values from the environment. Since each sensor
uniquely describes one CIRCA feature, CIRCA’s ABS performs a simple value
comparison to select the appropriate feature value “bin” containing the current sensor
reading. This 1:1 correspondence will exist throughout my thesis work, allowing the same
type of value comparison functions to be used, regardless of whether a sensor or state
estimator is supplying the data.
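The value comparison described above can be sketched as follows. This is only a minimal illustration, not CIRCA's ABS code: the function names `to_feature_value` and `abstract_state`, the altitude bin values, and the rule that a reading maps to the largest bin value it has reached are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical symbolic value set for an "altitude" feature (assumed units: ft).
ALTITUDE_BINS = [0, 1000, 2000, 3000, 4000, 5000]

def to_feature_value(reading, bins):
    """Return the largest bin value that does not exceed the reading.

    Readings below the lowest bin are clamped to the lowest bin.
    """
    index = bisect_right(bins, reading) - 1
    return bins[max(index, 0)]

def abstract_state(readings, bin_table):
    """Map each raw reading to a feature value, one feature per data source."""
    return {feature: to_feature_value(value, bin_table[feature])
            for feature, value in readings.items()}
```

Because the comparison depends only on the numerical reading, the same function would serve whether the value comes from a sensor or from a state estimator.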
The planner knows about high-level actions, such as “climbing to a cruise altitude” in an
aircraft, but has no knowledge of the numerical values associated with that action. The
second ABS task is to take an abstract action command, translate it into the language of the
controller or discrete actuator, then output it to the appropriate place in the “environment”.
The ABS may be required to perform functions such as the “guidance” functions described
for aircraft in Section 5.1.1, in which case action translation would involve using a
dynamical model of the system being controlled to convert the high-level command to
appropriate controller reference commands. However, I cannot develop a complex
guidance system during my thesis research, and I would want to borrow such a system from
domain experts in any case. Thus, I will be using a much simpler set of functions in the ABS,
moving “guidance” (see Sections 5.1 and 5.3) into the CIRCA environment. In the current
(and proposed) ABS models, a lookup table is used to convert the text RTS action
command (e.g., “descend to flight level 10”, “autoland”) to the appropriate set of numerical
values to be output to the environment/controllers (e.g., “reference altitude = 10000 ft.”,
“navigation frequency = 109.0; controller mode = autoland”).
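A lookup table of this kind might look like the sketch below. The two entries mirror the examples in the text; the dictionary key names, the function name `de_abstract`, and the error handling for unknown commands are assumptions for illustration, not the actual ABS interface.

```python
# Hypothetical lookup table from text RTS action commands to the numerical
# values sent to the environment/controllers. Key names are assumed.
ACTION_TABLE = {
    "descend to flight level 10": {"reference_altitude_ft": 10000},
    "autoland": {"navigation_frequency": 109.0, "controller_mode": "autoland"},
}

def de_abstract(action):
    """Translate an abstract RTS action command into controller settings."""
    try:
        # Return a copy so callers cannot modify the table itself.
        return dict(ACTION_TABLE[action])
    except KeyError:
        raise ValueError(f"no controller translation for action {action!r}") from None
```

A table keeps the translation step constant-time, which suits an ABS that must run as a pre-scheduled task, but it can only cover commands enumerated in advance.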
4.4.2 “Action Scoring” in CIRCA
For each state expanded, a planner must select an action (if any) based on the goals to be
achieved, including failure avoidance. This decision process should be made quickly
because the planner must run this algorithm once for each state it expands. In CIRCA, the
set of possible action choices is initially narrowed via action transition precondition
matching. However, many actions may remain in this set, including the choice to execute
no action (NO-OP) so long as the state does not transition directly to failure (i.e., no TTF is
present). CIRCA uses “action scoring” to determine the utility of each action (including
NO-OP if applicable), then selects the highest-utility action of the set. Once the best action
has been selected, the CIRCA planner computes the periodic timing requirements for that
action based on TTFs, then uses this value during computation of all descendant state
probabilities.
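The selection step described above (precondition matching, optional NO-OP, then highest utility) can be sketched as follows. The data layout, the function names, and the `score` callback are hypothetical; the sketch does not model TTF timing or state probabilities.

```python
NO_OP = "NO-OP"

def applicable_actions(state, actions, has_ttf):
    """Narrow the candidate set by precondition matching.

    `actions` maps an action name to a dict of required feature values.
    NO-OP is a candidate only when the state has no transition to failure (TTF).
    """
    candidates = [name for name, preconditions in actions.items()
                  if all(state.get(f) == v for f, v in preconditions.items())]
    if not has_ttf:
        candidates.append(NO_OP)
    return candidates

def select_action(state, actions, has_ttf, score):
    """Return the highest-utility candidate, or None if nothing applies."""
    candidates = applicable_actions(state, actions, has_ttf)
    if not candidates:
        return None
    return max(candidates, key=lambda name: score(state, name))
```

Since this routine runs once per expanded state, the cost of `score` dominates planning time, which is why the choice of scoring function matters so much in what follows.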
In previous versions of CIRCA [22], “action scoring” was based on lookahead search.
For each action whose preconditions matched the current state, the action scoring function
searched ahead a user-specified number of levels, expanding a small state tree of
descendant states based only on applicable temporal transitions (since actions had not yet
been selected for these states). The primary purpose of lookahead was to determine if the
proposed action would allow the system to come close to failure at a future time, with a
secondary purpose of seeing if goal features may be attained in the future. Unfortunately,
in an inherently dangerous domain such as aircraft flight, failure is almost always possible
in some near-term scenario, so NO-OP tended to gain a significant advantage over other
actions, often incorrectly (e.g., the aircraft preferred to never leave the ground, since it was
“safe” there). Also, since the lookahead used only temporal transitions, all of which might
be preempted with later selection of guaranteed actions, there was no way to predict
whether any past the first-level descendant state would actually be reached. Finally,
lookahead search was a very expensive algorithm to use, especially since action scoring is
performed for each state.
After incorporating the initial state probability model (described in Section 3.2.1), I
abandoned the lookahead search algorithm in favor of a fast action scoring procedure that
simply considered whether the direct descendant of the action achieved any new goal
feature and if that action allowed preemption of any TTF in the current state. This
procedure is much faster than the lookahead process, but works well only when goal
features can be achieved in one action, since CIRCA currently does not contain the ability
to assess a state feature’s “proximity” to the goal value.10
I believe a key to better action scoring is to help CIRCA compute proximity relationships
among feature values.11 For example, suppose a feature “altitude” were modeled with a
10 Lookahead enabled the action scorer to notice goal achievement further downstream so long as, after this initial action, only temporal transitions were required to achieve the goal feature. However, if more than one action were required to achieve a goal feature, neither the lookahead nor the simple “one-step” scoring algorithm would be able to notice that this first action brought the system closer to its goals.
11 Proximity relationships are not new to planning, but are new to CIRCA. Quantities such as Manhattan Distance for the 8-puzzle problem have been used to “score” actions in many systems. However, in such systems, one metric is typically used to measure proximity for all features. I propose allowing a separate, dynamically-based metric for each feature or group of features.
symbolic value set of {0, 1000, 2000, 3000, 4000, 5000}. Also, suppose the current state
feature value is (altitude 0), and the goal value is (altitude 5000). Currently, the only way
CIRCA can determine that an “altitude” feature value of “1000” is between “0” and “5000”
is by stringing together a set of temporal transitions that describe altitude changes when
climbing. If, instead, CIRCA were able to employ simple mathematical comparisons, it
would certainly be able to quickly see that an action leading to an altitude of 1000 is
“closer” to the goal (difference 5000 - 1000 = 4000) than doing no action (difference
remains 5000 - 0 = 5000). With discrete transitions for climbing 1000 units of altitude,
lookahead search would have needed 5 levels to discover the goal, while my “newer”
algorithm that does not use lookahead will not even realize a “climb” action would help
achieve the goal. However, if the action scoring function knew relations between different
feature values, then my “newer” scoring algorithm could simply notice that, although the
“climb” action did not immediately reach the goal, it did bring the system “closer” to the
goal by a certain fraction which may be used for scoring purposes.
I propose adding mathematical functions to the CIRCA knowledge base that will allow
computation of proximity relations between the symbolic feature-value pairs. The
functions will mathematically compare an input feature value with the goal, returning a
“utility” between 0 and 1 describing how close the feature is to the goal.12 Revisiting the
altitude example described above, define the “utility” function for altitude as shown in
Equation 4-1. Using this equation, the initial feature utility is 0, but the initial “climb”
action that creates an altitude of 1000 will have utility 0.2, so “climb” will be selected over
“NO-OP”.
\[
\text{altitude\_utility} = 1.0 - \frac{\lvert \text{current\_altitude} - \text{goal\_altitude} \rvert}{\max(\text{altitude}) - \min(\text{altitude})} \tag{4-1}
\]
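As a check on the numbers quoted for Equation 4-1, the utility can be computed directly. This is a minimal sketch that assumes the distance to the goal is an absolute difference normalized by the range of the feature's value set, which reproduces the utilities of 0 and 0.2 given in the text; the function names are hypothetical.

```python
def altitude_utility(current, goal, values):
    """1.0 minus the distance to the goal, normalized by the feature's range."""
    span = max(values) - min(values)
    return 1.0 - abs(current - goal) / span

def binary_utility(current, goal):
    """Binary-valued feature: 1 if the value matches the goal value, else 0."""
    return 1.0 if current == goal else 0.0
```

With the value set {0, 1000, 2000, 3000, 4000, 5000} and a goal of 5000, a state at altitude 0 scores 0 and a state at altitude 1000 scores approximately 0.2, so a one-step scorer using this function would prefer “climb” over NO-OP without any lookahead.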
So far, I have only addressed the issue of relating feature values to numerical values to help
select actions. What does this buy in terms of interfacing a controller with a symbolic
planner? In one word: flexibility. When CIRCA’s tasks include issuing commands to
controllers, state features will include values describing high-level controller parameters or
modes (e.g., my simple “takeoff”, “cruise”, “autoland” set for aircraft control). While
utility functions normally return values describing feature proximity to the goal, controller
12 For binary-valued features (e.g., values “True” and “Nil”), utility may be simply defined as 1 if the feature value matches its goal value and 0 otherwise.
utility functions may be designed to consider the entire state and incorporate items such as
proximity to the edge of the controllability envelope as well.13
4.5 Future Interface Work
I address remaining interface issues in this section, first by describing how the choice of
planning, real-time, and control algorithms may change the interface, then by describing
future work that must be done to fully validate the proposed CIRCA-based interfaces.
4.5.1 Effects of System Evolution
I have tried to present a fairly general picture of planning, real-time, and control systems
technology, so that the basic interface will not be invalidated by the evolution of techniques
available in any of these areas. However, certainly the choice of algorithms used for
planning, real-time, and control computations will have some effects on the interface
between the systems. I cannot predict how each system type will evolve in the future, so
instead I describe how changes in each system will affect the others, in the context of the
proposed CIRCA system.
Effects of Controller Modifications
A controller modification may cause two types of changes in the rest of the CIRCA system:
different state feature values (because different data is needed or available), and changes in
the planner’s knowledge regarding controller functionality or capabilities. So long as the
new controller properties can be effectively described in a knowledge base and reliably
monitored during execution, the interface between the CIRCA modules and the controller
will not need to change.14
Effects of Modifying Real-time Computations
Currently, CIRCA performs uniprocessor scheduling, and only considers CPU resources.
The algorithm used to schedule the CPU for CIRCA’s RTS can be easily modified without
13 I have not yet developed a more precise definition for these utility/proximity functions, so I am certainly looking for ideas.
14 This flexibility is in part due to the decision to place the controller in CIRCA’s “environment”, then express controller functionality in terms of planner features, state transitions, and associated “action scoring” functions.
affecting the rest of the system, since the main output is an ordered list specifying the
schedule [21].
Natural extensions to CIRCA’s real-time scheduling capabilities include the implementation
of algorithms to perform task allocation and to schedule additional resources (e.g., network
traffic, I/O) as well as CPU usage. Adding these new algorithms should not cause
significant modification to other CIRCA algorithms, except for allowing functions to be
split among different processors.
Effects of Modifying the Planner
Because it will be the most fully developed component, CIRCA’s planning system is perhaps
the least flexible with respect to modification. Any “new” planner put into CIRCA would still
need to reason about the real-time requirements of planned actions, so the interface to the
scheduler could be nearly identical to the current interface [19].
Unfortunately, switching the planner may require different knowledge base structures, so
both environmental and controller properties would need to be modified to fit into the new
knowledge base. Additionally, the proposed algorithm to limit planning time (Section
3.3.2) relies on the planner’s use of a best-first search strategy to allow anytime bounding
of planning time, as well as the notion of using some variable parameters (e.g., state
probability accuracy) as a mechanism for approximate planning using the design-to-time
approach. Many planners do not have analogous heuristics, in which case the algorithm to
bound planning time would need to be modified appropriately.
As discussed in Section 4.4, a key to using symbolic state representation during planning
for continuous-valued variables is employing an “action scoring” function. Any planner
that could be used would require some set of actions, but, if the action scoring process is
an integral part of the system, it may be difficult to include a hybrid symbolic-numeric
calculation process like the one I will be developing for the CIRCA planner.
4.5.2 Testing the Interfaces
During this research, I will be performing only limited tests of the interfaces, so I expect
much more will need to be done. Others [19] have described work that still needs to
be done with respect to testing the CIRCA planner - scheduler interface. I believe the key
to testing the other interfaces is to build a more complex knowledge base and controller set,
then run a very diverse set of tests. Given the numerous possibilities for information
feedback, plan switch procedures (e.g., RTS cache, Dispatcher, replanning), and types of
action scoring functions used, I expect more rigorous sequences of testing can be used to
both find bugs in the existing code, and possibly even point to algorithmic deficiencies that
will need to be addressed in the future.
The CIRCA-based interface between planning, real-time, and control systems is not
specific to the fully-automated aircraft control problem. In future tests, CIRCA should be
given the chance to control different domains, with their own ideas about state values,
controllers, and knowledge base transition properties. If CIRCA performs well in these
domains, then one could better make claims about the generality of CIRCA’s algorithms.
Conversely, if the new domain produces problems for CIRCA, then these tests may result
in improvements to the CIRCA algorithms that have not been foreseen during my research.
=====================================================
CHAPTER 5
ACHIEVING SAFE, FULLY-AUTOMATED
AIRCRAFT CONTROL
=====================================================
My primary long-term research goal is to help achieve safe, fully-automated flight. In this
section, I describe the fully-automated flight control problem and propose a simplified
model I plan to use for my thesis research. I believe the aircraft domain is a perfect choice
for testing CIRCA because it requires strict real-time response guarantees to maintain safety
(i.e., not crashing). Indefinite safety can never be achieved so long as the aircraft is aloft,
and considering the complete set of possible aircraft states is not feasible. I propose that
CIRCA will need to build and store multiple plans to capably handle all aspects of a flight.
To allow response to all possible anomalies, CIRCA will also need to be able to
replan dynamically within time limits imposed by the reachable set of aircraft states.
In Section 5.1, I describe the aircraft control problem in terms of existing Flight
Management Systems (FMS), tasks that still must be performed by the human cockpit
crew, and then present arguments for full cockpit automation. Understanding current FMS
capabilities and limitations is particularly important because I propose to use CIRCA
basically on top of existing FMS (without the “flight planner” module, of course). CIRCA
will perform many of the functions currently handled by pilots, and it will minimize
duplication of tasks already performed adequately by the FMS. I describe my current,
rather primitive CIRCA aircraft model and the simulation tests performed to date (Section 5.2),
followed by the (slightly less primitive) model and subset of possible emergency situations
I will consider during CIRCA testing (Section 5.3). Finally, in Section 5.4, I address
post-dissertation work that will still need to be tackled before safe, fully-automated aircraft
flight is possible.
5.1 Background
In this section, I describe common practices and available technology for commercial
aircraft flight. I begin with a discussion of capabilities and limitations of modern Flight
Management Systems (FMS). Next, to illustrate the broad range of functionality that
would be required of a fully-automated aircraft system, I describe the role of the human
cockpit crew. Finally, I motivate my push for fully-automated aircraft by describing the
most prevalent cause of aviation accidents today: pilot error.
5.1.1 Current Aircraft Control Technology
Today's most advanced commercial aircraft are capable of fully-automated flight from
takeoff roll through full-stop landing provided the original flight plan is not significantly
altered and no anomalous situations arise. In this section, I describe the capabilities and
limitations of state-of-the-art FMS. As described in [17], current FMS have two basic
components: the Flight Management Computer (FMC) and the Control and Display Unit
(CDU). The FMC is responsible for all aircraft computational and control tasks, while the
CDU serves as the main interface between cockpit crew and FMS. Typically, to increase
reliability, each aircraft will contain two independent copies of the entire FMS system, one
near the pilot and one near the co-pilot.
In this section, I focus on FMS tasks that are applicable to a fully-automated aircraft, since
no pilot interface would be required. For more details of FMS tasks related to user
interfacing, see [17]. Several basic functions are performed by the FMC: Flight planning,
Navigation, Performance Optimization, Performance Prediction, and Guidance. Figure 5-1
shows the computation modules of the FMS and how they are connected. I briefly
describe each below; more details are provided in [17] and [31].
[Figure 5-1 block diagram. Modules: Flight Planning, Navigation, Performance Optimization, Performance Prediction, Guidance, Control. External sources: Pilot, ATC, aircraft data, sensor data. Signal labels: plan, descent profile, nav radio tuning, reference x and ẋ, estimated x, ẋ, and wind, r(t), u(t), attitude and thrust sensor data.]
Figure 5-1. Flight Management Computer Tasks.
Flight Planning
Current FMS have the capability to follow flight plans in the format of waypoints,
altitudes, and takeoff and arrival procedures. The flight plan may be entered by the pilot,
uplinked from a ground station, or recalled from a preset database of flight plans.
Pilot-entered or uplinked weather data is used by the FMS to compute enroute speeds, fuel
consumption, and arrival times for the flight.
A fully-automated aircraft would need to always be able to build its own flight plan, never
assuming assistance would be available from a human pilot or ground station. Current
FMS rely on a large database of preset flight plans, but since the database includes flight
plans between major airports around the world, there are few different plans for any
particular departure/destination airport pair. As airspace becomes more crowded and
corridors are not so clearly defined (e.g., “free flight” using GPS [34]), it may become
prohibitive to store all possible flight plans in a preexisting database. Instead, it may
become a better policy to build a set of flight plans (primary and backup) using a more
general knowledge base, based on the specific departure and destination airports for the
upcoming flight.
Navigation
Navigation involves determining current aircraft state based on sensor input and output
from a variety of computational modules. Navigation module output includes aircraft
position, velocity, and wind parameters, and is used by several FMS modules (see Figure
5-1) to keep track of how well the aircraft is following its flight plan. A navigation module
does not fully replace the “state estimator” present in feedback controllers, because aircraft
controller state must include additional state values such as roll, pitch, and yaw angles
and rates.
Performance Optimization
This function of the FMS computes aircraft performance parameters that are subsequently
used by the guidance module (see below), such as altitude, airspeed, fuel, and thrust. The
current flight plan, aircraft configuration parameters (e.g., gross weight), and current state
(from the navigation module) are used during these computations.
Performance Prediction
This function performs a faster-than-real-time simulation of the flight using the current
flight plan to predict future attributes of the flight, including arrival time, fuel consumption,
etc. This simulation continues throughout the flight, and if the system predictions violate
constraints (e.g., not enough fuel to follow the flight plan as-is), a warning message will
be displayed for the pilot, who then is fully responsible for reacting to this warning.
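The constraint check at the end of such a prediction can be illustrated with a toy model. Everything below is an assumption made for the example: the leg-by-leg stepping, the constant fuel burn rate, the function names, and the reserve threshold. A real FMS uses a far more detailed performance model.

```python
def predict_fuel_at_arrival(fuel_kg, legs, burn_kg_per_hr):
    """Step through the remaining flight plan legs faster than real time.

    `legs` is a list of (distance_nm, ground_speed_kt) pairs. The constant
    burn rate is a simplifying assumption, not an FMS performance model.
    """
    hours = 0.0
    for distance_nm, ground_speed_kt in legs:
        leg_hours = distance_nm / ground_speed_kt
        hours += leg_hours
        fuel_kg -= burn_kg_per_hr * leg_hours
    return fuel_kg, hours

def check_fuel(fuel_kg, legs, burn_kg_per_hr, reserve_kg):
    """Return a warning string if predicted arrival fuel violates the reserve."""
    remaining, _ = predict_fuel_at_arrival(fuel_kg, legs, burn_kg_per_hr)
    if remaining < reserve_kg:
        return f"WARNING: predicted arrival fuel {remaining:.0f} kg is below reserve"
    return None
```

In current systems the returned warning would simply be displayed; in a fully-automated aircraft it would instead have to trigger an automated response.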
Subsidiary functions are also provided in this module, primarily to provide information to
the pilot or support the FMS flight performance optimizations described above. These
subsidiary functions are time-consuming [17], and are strictly done on a best-effort basis as
background processes. Quantities computed from these background processes include
predictions of nearest alternate airports, descent path generation (to determine the inflight
location to begin the initial descent from cruise), etc. In a fully-automated aircraft, these
functions would need real-time response guarantees, because no pilot could be relied upon
as a backup for critical decisions (e.g., selecting and entering a course for an alternate
airport).
Guidance
Using the flight plan and the more detailed descent path altitude reference trajectory, the
guidance module is responsible for generating the continuous, time-dependent reference
trajectory in terms of low-level aircraft state. In an aircraft, linear position and velocity are
tightly coupled to aircraft attitude, thrust, and airspeed. In fact, the FMS controls only
these attributes to achieve the desired linear position and velocity. By using the input flight
plan, an approximate dynamic model of the aircraft, and current linear position error
estimates, the guidance module computes the desired roll, pitch, airspeed, and thrust to be
achieved. These values are then sent to the low-level controllers as reference inputs.
Controllers
As discussed above, the low-level controllers will receive reference commands for roll,
pitch, airspeed, and thrust. These controllers then use aircraft-specific feedback control
laws for achieving these commands. Since aircraft dynamics are highly nonlinear, these
controllers are difficult to specify for wide ranges of reference inputs. Techniques such as
gain scheduling [16] allow local linearization of the system, which facilitates the
computation of controller parameters.
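The idea behind gain scheduling can be sketched in a few lines: controller gains are designed at several operating points, then selected or interpolated at run time according to a scheduling variable. The schedule values, the choice of airspeed as the scheduling variable, and the proportional-only control law below are hypothetical and chosen only to keep the example short.

```python
from bisect import bisect_right

# Hypothetical schedule of (airspeed in knots, proportional gain) pairs.
# Each gain is assumed to come from a local linearization at that airspeed.
GAIN_SCHEDULE = [(100.0, 2.0), (200.0, 1.2), (300.0, 0.8)]

def scheduled_gain(airspeed, schedule=GAIN_SCHEDULE):
    """Linearly interpolate the gain between the two nearest design points."""
    speeds = [s for s, _ in schedule]
    if airspeed <= speeds[0]:
        return schedule[0][1]
    if airspeed >= speeds[-1]:
        return schedule[-1][1]
    upper = bisect_right(speeds, airspeed)
    (s0, k0), (s1, k1) = schedule[upper - 1], schedule[upper]
    return k0 + (k1 - k0) * (airspeed - s0) / (s1 - s0)

def pitch_command(pitch_error, airspeed):
    """Proportional control law whose gain depends on the current airspeed."""
    return scheduled_gain(airspeed) * pitch_error
```

Outside the scheduled range the sketch simply holds the nearest gain, which hints at the problem discussed next: the controller has no valid design for operating points far from those it was scheduled for.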
Due to the nonlinear and tightly-coupled nature of the reference command attributes, only
certain combinations of state (r(t) = {roll, pitch, airspeed, thrust}) may be successfully
achieved. I have not yet encountered a careful description of how limitations on these
regions of “controllable” state space are propagated all the way from controller to flight
planner. I hypothesize that FMS designers have worked around this problem by storing
only preset flight plans that have been shown to behave acceptably during “near-normal”
flight conditions. However, I also hypothesize that the FMS will fail when “abnormal”
conditions (e.g., severe wind shear, actuator loss) result in guidance and/or controller
reference states that are not achievable.
5.1.2 The Role of the Human Cockpit Crew
The flight crew's primary tasks are to monitor instruments and aircraft performance,
communicate with ATC (Air Traffic Control), and make appropriate route changes based
on situations such as ATC directives, instrument indications (including failures), weather,
and other air traffic. In the most modern aircraft, control automation has progressed to the
extent that there are only two cockpit crew members, the pilot and the co-pilot. One person
(the pilot) supervises all flight operations, while the other (the co-pilot) typically handles
any manual flying, navigation, and communication with ATC. In this section, I describe
the tasks typically performed by each member of a two-person cockpit crew, then briefly
discuss procedure changes associated with handling emergencies.
Pilot
Prior to leaving the gate, an initial flight plan from departure point to destination is
approved by the pilot and transmitted to ATC. The FMS flight planner calculates and
displays an initial plan for a standard flight from one airport to another. However, this
program has limited capabilities (as discussed above) with respect to automatically
responding to changes in aircraft performance capabilities, unusual sensor readings (e.g.,
collision-course traffic), or even ATC commands. The pilot or co-pilot must manually
enter course changes whenever a situation warrants a major modification to the original
flight plan.
The pilot's major responsibility during all phases of flight is supervising cockpit functions
as well as "taking control" of the aircraft whenever he/she thinks it is necessary. The
theory is that if the pilot is freed from the time-consuming tasks of route calculation, aircraft
control, and communication, he/she will be better able to perform critical monitoring tasks,
thus discovering and reacting to problems as quickly as possible. This allows the pilot to
"get ahead of the airplane" -- to develop plans for potential conflicts or diversions from the
original flight plan based on dynamic changes in route and/or aircraft conditions.
Since the pilot assumes primarily management responsibilities, the co-pilot typically
performs the manual flight functions unless the pilot has taken control (e.g., during
emergencies). From takeoff roll to landing touchdown, the pilot constantly monitors the
flight instruments (e.g., altimeter, airspeed indicator, heading indicator, etc.), looking for
any anomalies in actuator inputs and responses. Additionally, the pilot compares the
aircraft flight behavior (determined visually and from instruments) with the
expected/planned behavior to make sure the cockpit crew has correctly entered the desired
commands to the FMS and/or manual controls.
The pilot is ultimately responsible for all important decisions made during flight, including
aborted takeoffs, missed approach (go-around) calls, and any emergency handling
procedures. Aborted takeoffs may occur so long as sufficient runway remains, when
problems such as a failed engine or critical instrument warning indicate that the plane
should not be flying. A missed approach occurs in many situations, including when
there is an obstacle on the runway, landing equipment such as the gear malfunctions, or
inclement weather prohibits landing. Pilots spend many hours explicitly training to handle
emergency situations when or if they arise, because experience is considered invaluable for
making the right decision during a high-pressure emergency requiring quick response.
Because of its importance, both pilot and co-pilot must perform collision avoidance tasks,
using data from ATC, automatic TCAS (Traffic Alert and Collision Avoidance System) warnings,
and visual identification of nearby aircraft and/or terrain. This task is particularly important
on approach to landing since the traffic is frequently close together and the airplane is not
too far above the ground. The pilot has the ultimate responsibility to initiate any course
changes required to avoid collisions, although he/she is often assisted by the co-pilot.
Finally, the pilot is responsible for interacting with the rest of the people in the airplane.
He/she supervises the cabin crew, advising them of times to prepare for takeoff or landing,
and any emergency procedures that may be required. The pilot also has the job of
informing and calming the passengers by telling them of various situations, landmarks, etc.
Perhaps the main difficulty with a fully-automated cockpit would be the job of calming the
passengers, especially those that had a fear of computer systems.
Co-Pilot
Perhaps the most common view of a co-pilot is based on his/her role as a “backup system”
for the pilot. I, instead, view both the pilot and co-pilot as people to take over the airplane
if the situation is not handled by the flight computers. The primary responsibilities of the
co-pilot include communicating with ATC and manual flying of the aircraft, freeing the
pilot to adequately perform the supervisory tasks discussed above.
The co-pilot generally handles all communication with ATC. Communication begins while
the plane is still at the gate. The flight plan (trajectory) is transmitted to clearance delivery,
who transmits this plan to ATC computers and alters the plan if necessary. ATC then
automatically clears a corridor of airspace for the aircraft for its entire flight, significantly
reducing the chance of mid-air collision. When the plane is ready to leave the gate, ground
control is called, and the plane is guided along taxiways to the runway. Before takeoff, the
co-pilot calls the tower and receives a clearance for takeoff as well as instructions for
climbing to join the filed flight plan. After takeoff, the co-pilot switches to appropriate
ATC enroute control centers, maintaining constant communications with ATC. On
approach to landing, the co-pilot communicates with the destination airport tower until
landing, then ground control until reaching the destination gate. Any of the enroute ATC
centers may change the aircraft course, usually to avoid bad weather or traffic. The co-pilot
enters course changes into the flight management computer, which then updates the flight
parameters such as expected fuel usage.
The co-pilot is also responsible for calling out any warning lights or instrument anomalies
as they occur, as well as calling out checklist items. This provides a backup to both pilot
and flight management system should the co-pilot notice a problem first. The co-pilot
normally performs any manual aircraft flight or FMS setting required. Again, this allows
the pilot to spend more time supervising the actions instead of becoming involved with
actually performing the tasks.
Emergency Handling
If any anomaly during flight requires significant manual flight control, the workload of
both pilot and co-pilot will increase dramatically. Decisions must be made quickly (e.g.,
where to land given no engines or severe icing), and the aircraft may rapidly transition
between flight configurations (e.g., violently maneuvering to avoid traffic). The pilot is
responsible for making the final decisions regarding emergency handling, but the co-pilot
will often offer advice. The pilot may choose to fly the plane manually during
emergencies, with the co-pilot constantly reporting aircraft status to ATC, and working to
assist the pilot wherever possible.
5.1.3 Why Fully Automate?
There is one main reason to remove pilots from the cockpits of commercial aircraft: pilot
error. NTSB (National Transportation Safety Board) accident report statistics [23] show
that pilot error is at least a contributing factor in the vast majority of aviation accidents in the
United States. Pilot error is caused by a number of factors, including inadequate cockpit
communication, lack of training, pilot’s inability to make decisions quickly under pressure,
or work overload during critical phases of flight. These factors will be magnified during an
actual inflight emergency because pressure and workload increase dramatically.
Today’s complex aircraft introduce new difficulties in combining a capable flight
management system with a human cockpit crew. Two contributing factors to pilot error
result: lack of FMS understanding and decrease in pilot proficiency. Several major
aviation accidents have been caused by the pilot’s lack of understanding of the FMS.
Human factors researchers continue work to improve FMS user interfaces, but this is a
difficult task because pilots are rarely computer or control experts. Also, in modern
commercial aircraft, the only time the pilot must manually fly the aircraft is when the FMS
is incapable of safely controlling the plane. Typically, these situations will be the most
difficult to control for the human pilot also, and since the pilot does not get as much
practice as he/she did before modern FMS existed, he/she will likely not be able to respond
as quickly as if he/she were in constant manual control of the aircraft. To address this
problem, today’s commercial pilots train extensively in simulators and occasionally turn off
the FMS during flight. However, in order for today’s pilots to accumulate as much manual
flying experience as pilots in older aircraft, the FMS would always be turned off, in which
case one might debate the utility of having such a fancy FMS at all.
Given the problems with pilot error and difficulty of maintaining pilot proficiency in
today’s aircraft, I conclude the obvious: take the pilots out of the cockpit. Human pilot
error will certainly be eliminated; however, the replacement system must be designed so
that we don’t simply transfer the pilot error to the FMS. I propose that, by using a
“perfected” version of the CIRCA system, a more complete set of navigation, guidance,
and control modules, and many years of work specifying flight knowledge and testing the
system, an FMS may be developed that will produce far fewer errors than are produced in
human-piloted aircraft today, even in emergency situations.
5.2 Current CIRCA Aircraft Model
The Aerial Combat (ACM) [25] Flight Simulator has been used for all CIRCA aircraft
domain tests to-date. ACM simulates an F-16 aircraft, using a six degree-of-freedom
nonlinear dynamic model to compute aircraft motion parameters given the complement of
actuator inputs. I selected ACM for three reasons: 1) ACM runs on any UNIX
workstation, 2) ACM is free, and 3) its source code is available. I modified ACM to
communicate via UNIX socket, and have created a knowledge base which allows CIRCA
to guide the aircraft during flight around an airport pattern, as illustrated in Figure 5-2. In
this section, I describe issues associated with the current aircraft model, including the
aircraft knowledge base and the low-level controller used by the CIRCA planner. I also
describe CIRCA’s performance for the small group of tests performed thus far.
[Figure 5-2 diagram. Recoverable labels: FIX0, FIX1, FIX2, FIX3, FIX4, Runway, Navigation Aid, final approach, and a compass rose (N, S, E, W).]
Figure 5-2. Aircraft Flight Pattern Flown during CIRCA Testing.
5.2.1 Knowledge Base Description
My initial goal for defining the CIRCA knowledge base was to define a simple set of
discrete-valued features that would allow CIRCA to guide the aircraft around the airport
pattern, and also demonstrate CIRCA’s ability to recognize and react to “anomalous”
situations. The use of a simple set of feature-value pairs illustrates the utility of combining
a low-level controller with CIRCA, because with no controller, CIRCA’s knowledge base
would have required much more feature and value detail (and still wouldn’t have flown the
aircraft even decently without a lot of work). Also, by including a simple set of features
used to model anomalies, I was able to demonstrate how the addition of CIRCA to a simple
flight controller allowed the system to better react to these problems.
Table 5-1 lists the feature types and their values present in the current CIRCA knowledge
base. With only these features, the planner is able to direct the aircraft around the pattern
(from takeoff through full-stop landing), assisted by the low-level controller, of course. In
summary, features for desired altitude (zero = ground level; positive = 5000 ft.) and
heading are used as references by the aircraft controller during flight.15 Because it is
discrete by nature, gear position is directly actuated (and subsequently sensed) by CIRCA
with no intermediate controller. The NAVAID (Navigational Aid) frequency and
Omnibearing Selector (OBS) are also controlled directly by CIRCA, using the preset
discrete values corresponding with the VOR/ILS (for frequency) and the location “corners”
illustrated in Figure 5-2.
Table 5-1. Current Aircraft Knowledge Base Features and Values.
Feature                        Values
Gear Position                  up, down
Altitude                       zero, positive
Heading                        North, South, East, West
Location                       Fix0, Fix1, Fix2, Fix3, Fix4, Fix5, Fix6
Omnibearing Selector           Fix0, Fix1, Fix2, Fix3, Fix4, Fix5, Fix6
NAVAID Frequency               VOR, ILS
Collision-Course Traffic       True, Nil
Swerving (to avoid traffic)    True, Nil
On Course                      True, Nil
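The feature set in Table 5-1 is small enough to write down directly. The following is a minimal sketch only, assuming a state is a mapping from each feature name to one discrete value; it is not CIRCA's actual representation, and the names FEATURES, is_valid_state, and state_space_size are hypothetical.

```python
# Hypothetical encoding of Table 5-1; CIRCA's internal format may differ.
FIXES = tuple(f"Fix{i}" for i in range(7))

FEATURES = {
    "gear_position": ("up", "down"),
    "altitude": ("zero", "positive"),
    "heading": ("North", "South", "East", "West"),
    "location": FIXES,
    "omnibearing_selector": FIXES,
    "navaid_frequency": ("VOR", "ILS"),
    "collision_course_traffic": ("True", "Nil"),
    "swerving": ("True", "Nil"),
    "on_course": ("True", "Nil"),
}


def is_valid_state(state):
    """Return True if state assigns every feature exactly one legal value."""
    return (set(state) == set(FEATURES)
            and all(state[name] in FEATURES[name] for name in FEATURES))


def state_space_size():
    """Upper bound on distinct feature-value combinations (many are unreachable)."""
    size = 1
    for values in FEATURES.values():
        size *= len(values)
    return size
```

Under this encoding the product of the value counts gives an upper bound on the number of distinct states the planner could ever need to distinguish; most combinations would not be reachable during flight around the pattern.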
Two types of emergencies have been simulated in CIRCA: gear failure and collision-
course traffic. The gear failure is modeled simply by the inability to transition the gear to
the “down” position before landing. Collision-course traffic is modeled very simply by a
feature representing the detection of collision-course traffic, and by an action to execute a
standard “swerving” maneuver. Of course, the swerve maneuver will cause the aircraft to
deviate from its planned course, as modeled by the “on course” feature, so a correction
action must be taken to resume course after the offending traffic has passed. These models
are very simple, but they illustrate how CIRCA can be used to plan reactions to key
emergencies that would simply be ignored by a controller blindly following a preset
reference trajectory, as described below in Section 5.2.3.
5.2.2 Aircraft Controller
I interfaced the ACM F-16 flight simulator [25] to a set of linear Proportional-Derivative
(P-D) controllers [14] to calculate actuator values that achieve the commanded reference
altitude and heading. Of course, the nonlinear dynamics of the aircraft are not even closely
modeled with my primitive controller set, but so long as the primary actuators function and
the aircraft attitude doesn’t vary significantly from level flight (especially pitch), the current
controllers perform decently.
15 The continuous time reference r(t) is generated from the discrete altitude and heading by a very simple linear function connecting the two endpoints (e.g., “zero” and “positive”, or “North” and “West”).
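The P-D law and the straight-line reference described in footnote 15 can be sketched as follows. This is an illustration under stated assumptions, not the controller code used with ACM: the function names, the gains, and the ramp duration are hypothetical, and the sketch ignores heading wrap-around (e.g., turning from North to West).

```python
def linear_reference(start, end, t, duration):
    """Straight-line reference r(t) from start to end over `duration` seconds."""
    if duration <= 0 or t >= duration:
        return end
    if t <= 0:
        return start
    return start + (end - start) * (t / duration)


def pd_command(reference, measured, measured_rate, kp, kd, reference_rate=0.0):
    """P-D law u = kp * e + kd * de/dt, with e = reference - measured."""
    error = reference - measured
    error_rate = reference_rate - measured_rate
    return kp * error + kd * error_rate


# Example: halfway through a hypothetical 100 s climb from 0 to 5000 ft.
r = linear_reference(0.0, 5000.0, 50.0, 100.0)
u = pd_command(r, measured=2400.0, measured_rate=10.0, kp=0.5, kd=0.1)
```

A separate gain pair (kp, kd) per flight phase would correspond to the per-phase gain sets mentioned in this section.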
CIRCA currently has access to three basic controllers: takeoff/climb, cruise, and final
approach/landing. In addition to the continuous-valued actuators, the controller
automatically controls the aircraft afterburner, flaps, and brakes with discrete commands
based on current aircraft state. So, during the initial takeoff/climb phase, the controller
lowers the flaps 10 degrees, then turns on the afterburner (in addition to 100% normal
throttle) and initiates PD control using the “takeoff” set of controller gains. Then, after the
aircraft achieves a “safe” (1000 ft) altitude AGL (above ground level), the afterburner shuts
down and flaps retract. When nearing the “cruise” altitude, the controller switches to gains
for the cruise flight, allowing the aircraft to follow the specified heading and altitude
commands. Then, for final approach and landing, the controller mode switches to an
“autoland” controller, using the ILS heading and glide slope offsets as feedback for
controlling heading and altitude. When the aircraft comes within five miles of the airport,
the speed brake is deployed, then flaps are extended. After touchdown, the wheel brakes
are automatically set, stopping the aircraft at runway heading.
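The discrete commands described above amount to a small rule table keyed on controller mode and aircraft state. The sketch below is one possible reading of that paragraph, not the actual controller logic: the function name, the mode strings, and the flap labels are hypothetical; it issues the speed brake and flap commands together rather than in sequence; and it does not model how the controller decides to change modes.

```python
SAFE_ALTITUDE_AGL_FT = 1000.0  # "safe" altitude given in the text
SPEED_BRAKE_RANGE_MI = 5.0     # speed brake deploys within five miles


def discrete_commands(mode, altitude_agl_ft, distance_to_airport_mi, on_ground):
    """Return discrete actuator settings for the current controller mode."""
    cmd = {"flaps": "retracted", "afterburner": False,
           "speed_brake": False, "wheel_brakes": False}
    if mode == "takeoff_climb" and altitude_agl_ft < SAFE_ALTITUDE_AGL_FT:
        # Below the safe altitude: flaps at 10 degrees and afterburner on.
        cmd["flaps"] = "takeoff_10_deg"
        cmd["afterburner"] = True
    elif mode == "autoland":
        if distance_to_airport_mi <= SPEED_BRAKE_RANGE_MI:
            cmd["speed_brake"] = True
            cmd["flaps"] = "extended"
        # Wheel brakes are set only after touchdown.
        cmd["wheel_brakes"] = on_ground
    return cmd
```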
5.2.3 CIRCA Performance during Flight
The CIRCA knowledge base and aircraft controller set were initially debugged and tested
during flight around the pattern with absolutely no anomalies. Once this task was
performed successfully, CIRCA’s ability to fly was further tested with two emergencies:
“gear fails on final approach”, and “collision-course traffic on final approach”. Using these
basic emergency situations, variations of the knowledge base allowed tests of each
algorithm to detect and handle the classes of “unplanned-for” states (as described in Section
3.2.2 and [1]), as well as tests of CIRCA’s model of probability (as described in Section
3.2.1 and [2]). In both emergency situations, CIRCA was able to notice the problem and
react appropriately, replanning for a go-around procedure when gear failed and extending
pattern legs to avoid collision-course traffic.
Recent tests [19] have used the CIRCA aircraft flight knowledge base to illustrate planner-
scheduler negotiations, using extended traffic avoidance maneuvers plus some additional
highly-improbable events (e.g., “flight into a tornado”) to overload the scheduler. Due to
the knowledge base extensions in [19], CIRCA can now avoid traffic via a standard
avoidance maneuver at any position in the pattern. I hope to continue extending CIRCA’s
capabilities in this direction, combining “standard” and “custom” maneuvers when
necessary to help CIRCA better react to inflight anomalies.
5.3 Proposed Aircraft Model and Capabilities
Current FMS are quite capable of flying aircraft in many situations. However, as
discussed in Section 5.1, “flight planning” is inflexible since it can only draw from a
limited database of plans, and the simulation (used in “performance prediction”) operates
strictly at best-effort speed. Both limitations prevent fully-automated operation, so I
regard these two modules as “weaknesses” of current FMS whose functions may be performed
better by a system such as CIRCA, as shown in Figure 5-3. In this illustration, the
Guidance, Control, and Navigation modules are independent of CIRCA (i.e., considered
part of CIRCA’s “environment”). The flight planning and performance prediction modules
have been replaced by CIRCA, which will ideally build and execute plans that can output
similar quantities, except with more flexibility and real-time guarantees.
[Figure 5-3 diagram. Recoverable block labels: Guidance; Control; Navigation; Knowledge Base; Performance Optimization; ATC; CIRCA Planner, Scheduler, Dispatcher Subsystems; CIRCA RTS; CIRCA ABS; CIRCA data conversion. Recoverable signal labels: actions; features; Controller Status; r(t); u(t); attitude, thrust sensor data; Sensor data; Nav Radio Tuning; x, ẋ reference; x, ẋ, wind.]
Figure 5-3. Integration of CIRCA with a Flight Management System.
I will not be able to take advantage of the modules from an operational FMS, so, for my
thesis research, the functionality of my system will be very simple and approximate
compared to current FMS module functionality. I plan to continue using the ACM
simulator throughout my research, building on the simple PD controller set used for past
tests. Because my research concentrates on the basic algorithms used in CIRCA for
planning and plan execution, my near-term goal with respect to the aircraft model is to add
new features as needed for testing CIRCA algorithm functionality. However, I feel it is
important to model features realistically and work toward an integrated CIRCA-FMS
system, so that my thesis work will have a better chance of being applicable to a future
FMS-like system. I believe that by keeping the CIRCA model small but realistic, I will
have a better chance to reuse parts of the model in later research.
The proposed CIRCA aircraft knowledge base will be built upon the existing model
described above. I now have a very basic knowledge base model of altitude and heading,
as well as gear and simple “locations” for flight around the pattern. In future modeling, I
intend to keep the spirit of the FMS models, describing aircraft state to the planner in terms
of altitude, heading, longitude, and latitude, and leaving all attitude calculations to the very
primitive “guidance” module (built into the controllers currently). For my thesis work, the
planner (and ABS) will specify aircraft trajectory in terms of position and constant
velocities, also leaving acceleration computations to the guidance module (which will “catch
up” with the commanded positions and velocities after periods of acceleration or
deceleration).
Because “flight around the pattern” is very restrictive (too few locations to give the planner
many choices), I propose to extend the “location” model to include normal flight between
airports separated by quite a large distance, so that multiple “legs” of the flight will be
necessary (following paths along a system of “airways”). Then, I plan to enhance both the
knowledge base and controller set so that CIRCA can handle the following anomalous
situations, described below: low fuel, cabin depressurization, complete engine failure, and
rerouting due to bad weather. Since I have not yet incorporated associated features for
these tasks, I cannot yet provide an explicit list of the feature names and values. However,
each of these anomalous situations will not require too many new features (e.g., fuel
quantity, cabin pressure, oxygen tank level, etc.). As a start, I plan to develop the model
for each anomaly independently of the others. However, in the final tests of CIRCA, I
plan to combine anomalous situations to show how CIRCA can continue to function in the
best possible manner even though multiple problems have occurred.
5.3.1 Low Fuel
Typical flight plans are built to ensure plenty of fuel will be available during the flight.
However, if either a system failure occurs (e.g., fuel leak) or the aircraft is significantly
rerouted, the system will need to be able to select a course that will not let the fuel get too
low. Fuel quantity changes between its extreme limits (full/empty) much more slowly than
a quantity such as altitude, for example. A major test of the proposed algorithm to
efficiently attach time stamps to states (Section 3.3.1) will involve combining fuel temporal
transitions with the faster-acting transitions present in the current aircraft knowledge base.
Also, as described in Section 4.4, I wish to test a new algorithm to be used during action
scoring, even with a symbolic set of planner feature values.
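To make the fuel constraint concrete, the following sketch checks whether a candidate route leaves fuel above a reserve. It is illustrative only and is not part of the proposed time-stamp or action-scoring algorithms: it assumes a constant burn rate and ground speed, and every name and number in it is hypothetical.

```python
def hours_to_empty(fuel_lb, burn_rate_lb_per_hr):
    """Time until the tanks are empty at a constant burn rate."""
    if burn_rate_lb_per_hr <= 0:
        raise ValueError("burn rate must be positive")
    return fuel_lb / burn_rate_lb_per_hr


def fuel_margin_lb(fuel_lb, burn_rate_lb_per_hr, distance_nm,
                   ground_speed_kt, reserve_lb):
    """Fuel left beyond the reserve after flying distance_nm; negative is a shortfall."""
    flight_hours = distance_nm / ground_speed_kt
    return fuel_lb - burn_rate_lb_per_hr * flight_hours - reserve_lb


def feasible_routes(routes, fuel_lb, burn_rate_lb_per_hr,
                    ground_speed_kt, reserve_lb):
    """Keep the names of (name, distance_nm) routes that do not dip into the reserve."""
    return [name for name, distance_nm in routes
            if fuel_margin_lb(fuel_lb, burn_rate_lb_per_hr, distance_nm,
                              ground_speed_kt, reserve_lb) >= 0.0]
```

A fuel leak would appear here as a higher effective burn rate, which shortens hours_to_empty and can make a previously feasible route infeasible.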
5.3.2 Cabin Depressurization16
When an aircraft cabin depressurizes at a high cruise altitude, passengers must breathe from
oxygen stored in tanks. There will be enough oxygen to support the passengers for a
reasonable amount of time, so the “transition” from oxygen tanks “Full” to “Empty” will
certainly occur more slowly than many other modeled transitions. A typical reaction to
depressurization would be to recompute a trajectory to a lower altitude, then divert to a
nearby airport if it will be safe to do so. This feature of the aircraft model will
simultaneously test several CIRCA algorithms, including the time stamp and action scoring
algorithms in Sections 3.3.1 and 4.4, as well as the ability of CIRCA to build and switch to
a contingency plan (e.g., to lower altitude if flight continues until oxygen is nearly empty)
or to dynamically replan (e.g., if state is “safe”, but a new plan is needed to divert to a
nearer airport).
5.3.3 Engine Failure17
Complete engine failure is perhaps one of the most feared emergencies in aviation. Such
failures are rarely expected (or else the plane wouldn’t be flying), and reactions must be
very quick, because a powered aircraft that has lost its engines will not be able to maintain altitude, even in best
glide configuration. As discussed in Section 5.1, current FMS continually calculate and
display the set of nearest airports, and may even compute whether the aircraft can stay aloft
long enough to reach that airport. However, the FMS stops there, neither automatically
diverting to the “best” airport nor selecting the best “off-field” site to crash-land.
16 Although I will be simulating an F-16 aircraft, I will assume the aircraft is pressurized for passengers, since that would be the case in commercial aircraft.
17 The simulated F-16 has only one engine, so complete engine failure occurs when one engine fails.
I believe the engine failure emergency will clearly illustrate the utility of having an available
planning system in conjunction with a set of prebuilt plans. Although I have not completed
development of the CIRCA knowledge base model of engine failure, I would expect the
following scenario: 1) CIRCA will include in all plans a TAP to quickly18 detect engine
failure, 2) If engine failure occurs, CIRCA will quickly switch to a plan that will set up a
best glide configuration and point the aircraft toward the nearest airport, effectively buying
time for the planner, 3) The planner will take the current state data and replan in time to
execute the plan (e.g., before the aircraft has lost so much altitude that it cannot turn
elsewhere). If the aircraft has sufficient altitude to reach the airport, or if the “best” landing
spot is straight ahead, the new plan will be identical to the executing contingency.
However, if terrain or population centers are not uniform, step 3) will allow the system to
select a flight path that will lead to a relatively desirable off-field landing site.19
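Whether the aircraft "has sufficient altitude to reach the airport" can be approximated with a still-air glide calculation: range is altitude times glide ratio. The sketch below is a simplification under stated assumptions (no wind, no altitude lost in turns, flat terrain), and the glide ratio, safety margin, and site list are hypothetical inputs rather than values from this proposal.

```python
FEET_PER_NAUTICAL_MILE = 6076.12


def glide_range_nm(altitude_agl_ft, glide_ratio):
    """Still-air, straight-line glide distance in nautical miles."""
    return altitude_agl_ft * glide_ratio / FEET_PER_NAUTICAL_MILE


def reachable_sites(altitude_agl_ft, glide_ratio, sites, margin=0.8):
    """Return names of (name, distance_nm) sites within margin * glide range, nearest first."""
    usable_nm = margin * glide_range_nm(altitude_agl_ft, glide_ratio)
    in_range = [(distance_nm, name) for name, distance_nm in sites
                if distance_nm <= usable_nm]
    return [name for _, name in sorted(in_range)]
```

If the returned list is empty, no candidate airport is within gliding distance and the planner would have to consider off-field sites instead.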
5.3.4 Rerouting due to Bad Weather
Bad weather can result in flight plan changes ranging from a simple “divert around an
isolated thunderstorm cell” to “destination and/or alternate airport closed due to
ice/snow/fog”. I will certainly be unable to add a complete weather model to the ACM
simulator, but I do plan to simulate each of these two particular weather-based situations
during my research by modifying the ACM software to report isolated thunderstorms and
airport closings. Because it is virtually impossible to predict the exact location of an
isolated thunderstorm, I believe CIRCA’s combination of contingency plan storage and
online planning will be clearly illustrated by this example. Offline, CIRCA will build
reactions (or contingency plans if scheduling is difficult) to turn away from thunderstorm
cells. Then, based on fed-back feature data describing the location and extent of the
thunderstorm cell, CIRCA will dynamically replan to divert around the storm. During a
“normal” diverting procedure, replanning will not be overly time-limited, since the aircraft
is flying away from the storm.
18 In this paragraph, I use “quickly” to describe a task that will be completed in guaranteed real-time.
19 Ideally, the contingency plan set would already contain a complete description of the best airport and off-field landing sites for all points along the trajectory. However, for a multi-thousand-mile flight, I hypothesize it will be infeasible to build and store contingency plans that account for the terrain features and population densities at all enroute positions.
Normal FMS flight plans contain a trajectory to one alternate airport should the destination
airport close [17]. However, if a large weather system results in multiple airport closings,
the FMS will not have planned a route for any other airport. To mimic FMS operation and
enhance chances of safety, CIRCA will build a set of plans to fly to the destination airport
and one nearby alternate (via contingency planning). Then, if both these airports close,
CIRCA will react by automatically entering a holding pattern if necessary (instead of
landing), then dynamically replanning to reach another open airport.
5.4 Future Work -- Flying a “Real” Airplane Safely
After the research outlined in this proposal has been completed, there are still numerous
technical issues that will need to be addressed before safe, fully-automated flight is
possible. In this section, I describe methods by which the aircraft models used in a
CIRCA-like system may be augmented, leading to better reactions and thus a safer fully-
automated system.
5.4.1 Building a Comprehensive Aircraft Knowledge Base
CIRCA will be using a very limited knowledge base during tests. To generate even near-
optimal flight plans, the knowledge base must contain information to help the planner select an
efficient and safe path at all points during a flight. This requires the aircraft to avoid
“obstacles”, either airborne or ground-based. Avoiding airborne obstacles requires the
flight planner to consider airspace restrictions (e.g., military operation areas) and air traffic
control instructions. To avoid ground-based obstacles such as mountain peaks or radio
antennas, the planner must employ geographical knowledge, including terrain elevation and
type (e.g., desert), population densities, and even “tall” building locations. “Geographical”
knowledge combined with knowledge of airport facilities will also help the planner select
the best landing sites should the aircraft need to land somewhere other than the destination
airport.
In this proposal I make claims that a CIRCA-like flight planning system will help a fully-
automated system respond accurately and quickly to inflight anomalies that may lead to
emergencies. Pilots spend a significant amount of time studying NTSB accident reports so
that, if they ever encounter a similar emergency, they may use this information to help
them react optimally and quickly. NTSB reports typically contain a description of the
situation in which the accident (or incident) occurred, the contributing factors (causes), and
actions that might have avoided the accident. I believe incorporating the full set of
situations and appropriate reactions proposed in the NTSB accident reports will be the key
to making a fully-automated aircraft “safe”, particularly with respect to responding quickly
and accurately to potentially dangerous emergency situations.
5.4.2 Building the Control System
Certainly, research groups in companies that design current FMS will have a much better
set of controllers and state estimators than I could ever hope to build independently. If a
CIRCA-like system is to ever be used on a real aircraft, researchers will need to work with
a major FMS designer to gain access to their technology.
Eventually, the aircraft control system may be composed of nonlinear and/or linear
feedback controllers which are automatically invoked by methods such as the current gain
scheduling [16] and its variants (e.g., [26]) or methods like the neural-network-based
approach described in [29]. With an advanced set of such controllers, the control system
itself will be able to detect and correct for low-level sensor or actuator anomalies.
However, such a system will try its best to follow the specified reference inputs, so the
guidance and higher-level systems must always be aware of the controller’s capabilities
based on the current system state (e.g., if the engines are out, the reference altitude rate of
change must not exceed that imposed by the “best glide” limit).
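The "best glide" example can be read as a clamp on the commanded altitude rate. The following is a minimal sketch of that idea only; the function name, the units (feet per minute), and the limit values passed in are hypothetical.

```python
def clamp_altitude_rate(commanded_fpm, engines_running,
                        best_glide_sink_fpm, max_climb_fpm):
    """Limit the reference altitude rate to what the controller can achieve.

    With engines out, the best the aircraft can do is descend at the
    best-glide sink rate, so any shallower descent (or a climb) is capped there.
    """
    if engines_running:
        upper_limit = max_climb_fpm
    else:
        upper_limit = -abs(best_glide_sink_fpm)
    return min(commanded_fpm, upper_limit)
```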
One of the advantages of state estimation (instead of using direct sensor values) is the
ability to maintain an accurate measure of system state even if some sensors fail or become
noisy. By using a redundant, comprehensive set of sensors to measure system state
(including system diagnostic measurements), the state estimator will be able to provide
accurate values of aircraft parameters, or if not, will be able to detect faulty estimates and
react with some combination of controller parameter changes and the transmission of faulty
state parameters to the higher-level planning/plan execution system.
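As a much simpler stand-in for a real state estimator, the sketch below fuses redundant readings of a single quantity by median voting and flags outliers. It only illustrates the "detect faulty values and report them upward" idea; the function name and tolerance are hypothetical, and a fielded system would more likely use a model-based estimator.

```python
from statistics import median


def fuse_redundant(readings, tolerance):
    """Return (estimate, faulty_indices) for redundant readings of one quantity.

    Readings farther than `tolerance` from the median are flagged as faulty
    and excluded from the averaged estimate. If every reading is flagged,
    the estimate is None so that higher-level systems can react.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    center = median(readings)
    faulty = [i for i, value in enumerate(readings)
              if abs(value - center) > tolerance]
    healthy = [value for i, value in enumerate(readings) if i not in faulty]
    if not healthy:
        return None, faulty
    return sum(healthy) / len(healthy), faulty
```

For example, three altimeter readings of 5000, 5010, and 7200 ft with a 50 ft tolerance yield an estimate of 5005 ft and flag the third sensor.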
5.4.3 Incorporating the System in “Real” Aircraft
Flying is a difficult endeavor because of both system complexity and the potentially
catastrophic consequences of reacting too slowly or incorrectly. Before pilots can be taken
out of commercial aircraft, extensive tests will be required. The key capability introduced
by a CIRCA-like system is the ability for the system to detect and respond “appropriately”
to anomalous situations, both small problems and major emergencies. Because it would be
infeasible to prove that the fully-automated system would react properly in absolutely all
situations, extensive testing is perhaps the only way to gradually gain trust in the system.
I would imagine three main phases to system testing. First, the “fully-automated” FMS
would be connected to a simulator that had the ability to realistically simulate a large group
of emergency situations. Next, the full-automation capabilities would be assessed against
pilot capabilities by running the new computations in parallel with the standard FMS
routines (augmented by pilot commands). These new FMS computations
will not interfere with the standard FMS, so if they do not operate correctly, the flight will
not be compromised in any way. By comparing the automated and pilot-commanded
responses, the fully-automated system may be better debugged. Finally, the “fully-
automated FMS” may be put into service, but a pilot will still have the ability to revert to
manual control of the aircraft.20 If/when the fully-automated FMS has demonstrated the
capability to reliably operate without pilot intervention and has gained the trust of the FAA,
it will be time to think about taking the human pilot out of the cockpit.
20 Ideally, this system would make commercial aviation more safe, because one has two separate “systems”, FMS and pilot, that can handle both regular and anomalous-situation flight. Of course, pilots will need to be trained to understand the operation of the fully-automated FMS, and many user interface issues will arise. Otherwise, the new system could compromise safety, not improve it.
=====================================================
CHAPTER 6
SUMMARY
=====================================================
I have proposed research to develop a system that can simultaneously consider issues from
the AI planning, real-time, and control systems fields, focusing on the problem of
achieving safe, fully-automated control of a traditionally piloted vehicle. Incorporating
real-time constraints into such a system necessitates the careful consideration of time during
planning, predictable execution characteristics for all system processes, and explicit
scheduling of critical actions and control loops to guarantee meeting deadlines. Interfacing
a planner and controller requires that the planner contain knowledge describing controller
capabilities and limitations, and that a common language exist for efficient communication
between the two systems.
To address these problems, I will work in the context of CIRCA, the Cooperative
Intelligent Real-time Control Architecture, which was explicitly designed to address issues
involved with planning for an environment requiring real-time response guarantees.
Originally, CIRCA combined a planner, scheduler, and real-time plan executor such that it
could build and schedule plans that were guaranteed to meet critical response deadlines. In
previous work, CIRCA always assumed it could build each plan to maintain safety
indefinitely, allowing the planner to deliberate as long as it needed. This is an unrealistic
assumption in many domains, so I propose augmentations to CIRCA which will allow it to
limit planning deliberation time while achieving the best quality plans possible. The new
version of CIRCA will include planning, scheduling, and real-time plan execution
subsystems as before, but also will include new Dispatching and Abstraction modules.
The Dispatching Subsystem will allow the planner to build and store plans offline, helping
CIRCA achieve faster response when a new plan is required. The Abstraction Subsystem
will contain the functions required to translate between CIRCA commands and the language
of the controllers and state estimators used for each domain.
Since it is unrealistic to assume complete and correct knowledge, I have augmented CIRCA
to detect and react to important unplanned-for situations that may arise, including deadend,
removed (low-probability), and imminent-failure states. To date, CIRCA has relied on
“coincidental” real-time planner response to these states, but this is not adequate for
time-critical domains. In this proposal, I have described a method to allow predictably fast
responses to important subclasses of unplanned-for states using prebuilt reaction plans
stored in CIRCA’s new RTS plan cache or Dispatcher Subsystem. For other, less time-
critical unplanned-for states, I have proposed online CIRCA planning using algorithms to
limit planner deliberation time.
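The tiered response described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CIRCA's implementation: the names `select_response`, `toy_replan`, `RTS_CACHE`, and `DISPATCHER_STORE`, the string-valued plans, and the example state classes are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Plan = str  # stand-in for a real CIRCA plan structure (assumption)


def select_response(
    state_class: str,
    rts_cache: Dict[str, Plan],
    dispatcher_store: Dict[str, Plan],
    replan: Callable[[str, float], Plan],
    time_available: float,
) -> Tuple[str, Plan]:
    """Return (source, plan), trying the fastest plan source first."""
    # Most time-critical classes: a reaction plan is already in the RTS cache.
    if state_class in rts_cache:
        return ("rts_cache", rts_cache[state_class])
    # Next: a plan built offline and held by the Dispatcher.
    if state_class in dispatcher_store:
        return ("dispatcher", dispatcher_store[state_class])
    # Otherwise: online planning with a bound on deliberation time.
    return ("replanning", replan(state_class, time_available))


def toy_replan(state_class: str, time_limit: float) -> Plan:
    # Hypothetical planner stub; a real planner would search within time_limit.
    return f"plan-for-{state_class}-built-within-{time_limit:.1f}s"


# Hypothetical contents of the two plan stores.
RTS_CACHE = {"engine_failure": "glide-to-nearest-runway"}
DISPATCHER_STORE = {"bad_weather": "divert-around-weather"}
```

In this sketch a lookup for "engine_failure" is served from the cache, "bad_weather" from the Dispatcher store, and any other class falls through to time-limited replanning.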
As a first step to limiting planner deliberation time, I have built an approximate model of
probability into CIRCA. This model allows the removal of improbable states from
consideration when necessary and directs CIRCA to plan using a best-first search strategy
based on state probability. This preliminary work on the uses of state probability has led
me to the development of a model that explicitly trades off planning speed with accuracy
during planning, allowing a design-to-time approach to limiting deliberation time. Because
the parameters used for the design-to-time calculations will be imprecise, I have proposed
combining this approach with an anytime policy to guarantee that the planner will stop its
deliberation before its deadline passes. The planner will expand the state space in best-first
order based on a flexible utility function which combines state probability, time horizon,
and proximity to failure. In this manner, when interrupted by the anytime monitor, CIRCA
will be confident that the planner has expanded the “best” states it had time to consider.
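As a rough illustration of best-first expansion under an anytime interrupt, consider the Python sketch below. It is a minimal example under assumptions of my own: the `State` fields, the additive form and weights of `utility`, and the use of an expansion budget in place of a wall-clock deadline are all hypothetical, chosen only to make the idea concrete.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class State:
    name: str
    probability: float     # approximate probability of reaching the state
    time_horizon: float    # estimated time until the state could occur
    failure_distance: int  # transitions separating the state from failure


def utility(s: State, w_prob: float = 1.0, w_time: float = 1.0,
            w_fail: float = 1.0) -> float:
    # Hypothetical weighting: likely, imminent, near-failure states score highest.
    return (w_prob * s.probability
            + w_time / (1.0 + s.time_horizon)
            + w_fail / (1.0 + s.failure_distance))


def anytime_best_first(initial: State,
                       successors: Callable[[State], List[State]],
                       deadline_reached: Callable[[], bool]) -> List[str]:
    """Expand states in decreasing utility until interrupted or exhausted."""
    tie = itertools.count()  # tie-breaker so the heap never compares States
    frontier = [(-utility(initial), next(tie), initial)]
    seen = {initial.name}
    expanded: List[str] = []
    while frontier and not deadline_reached():
        _, _, state = heapq.heappop(frontier)
        expanded.append(state.name)
        for nxt in successors(state):
            if nxt.name not in seen:
                seen.add(nxt.name)
                heapq.heappush(frontier, (-utility(nxt), next(tie), nxt))
    return expanded


def expansion_budget(n: int) -> Callable[[], bool]:
    # Stand-in for the anytime monitor: allows exactly n expansions.
    remaining = [n]

    def reached() -> bool:
        remaining[0] -= 1
        return remaining[0] < 0
    return reached


# Toy state space (hypothetical values).
START = State("start", 1.0, 0.0, 5)
GRAPH: Dict[str, List[State]] = {
    "start": [State("a", 0.9, 1.0, 4), State("b", 0.1, 1.0, 4),
              State("c", 0.5, 1.0, 1)],
}
ORDER = anytime_best_first(START, lambda s: GRAPH.get(s.name, []),
                           expansion_budget(3))
```

With a budget of three expansions, the improbable state "b" is the one left unexpanded when the monitor interrupts, which is the behavior the paragraph above asks for.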
I have proposed an architecture which combines basic planning, real-time, and control
systems methods, and have argued that this is the best approach for achieving fully-
automated control of a complex system. By considering how each of these three fields
addresses a typical problem, I have identified standard inputs and outputs from each type of
system, and constructed a basic interconnected system which includes the modules from
each. I have described how the proposed version of CIRCA maps to this interconnected
system, and then described how the interconnected system may function in the context of
CIRCA. Plan caching, scheduling, and planner deliberation time limiting address the
problem of imposing real-time constraints in planning and plan execution. Control
engineers must typically be very careful about specifying time and resource requirements
for their systems, so, for my thesis research, I assume associated real-time constraints have
been addressed and handled prior to CIRCA execution.
The interface between a planner and a controller has been described in terms of the inputs
and outputs of typical planning and control systems. To connect the two, planned actions
executed by CIRCA’s RTS will include directives that control the reference trajectory input
to the controllers, and feature feedback to CIRCA from the controllers will include values
derived from the state estimators. CIRCA’s Abstraction Subsystem (ABS) will contain
functions to build abstract feature/value pairs from sensor and state estimator data. The
ABS will also contain the functions required to translate high-level CIRCA trajectory
commands (or actions), expressed as discrete straight-line position and velocity vectors, into
the continuous, dynamically-feasible reference functions to be used by the controller.
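The two directions of translation performed by the ABS can be sketched as follows. This is a hedged illustration only: the feature names, thresholds, and dictionary keys in `abstract_features` are hypothetical, and `hermite_reference` swaps in ordinary cubic Hermite interpolation as one simple way to obtain a reference that matches position and velocity at both ends of a segment. It does not check dynamic feasibility (for example, acceleration limits), which a real implementation would need to do.

```python
from typing import Callable, Dict, Sequence, Tuple


def abstract_features(estimate: Dict[str, float]) -> Dict[str, str]:
    """Map continuous state-estimator values to discrete feature/value pairs."""
    # Thresholds are placeholders, not values from any aircraft model.
    return {
        "ALTITUDE": "LOW" if estimate["altitude_ft"] < 1000.0 else "SAFE",
        "FUEL": "LOW" if estimate["fuel_fraction"] < 0.2 else "OK",
    }


def hermite_reference(
    p0: Sequence[float], v0: Sequence[float],
    p1: Sequence[float], v1: Sequence[float],
    duration: float,
) -> Callable[[float], Tuple[float, ...]]:
    """Build a continuous position reference from two position/velocity pairs."""
    def ref(t: float) -> Tuple[float, ...]:
        s = min(max(t / duration, 0.0), 1.0)  # normalized time in [0, 1]
        # Cubic Hermite basis functions.
        h00 = 2 * s**3 - 3 * s**2 + 1
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        return tuple(
            h00 * a + h10 * duration * b + h01 * c + h11 * duration * d
            for a, b, c, d in zip(p0, v0, p1, v1)
        )
    return ref


# Example: one segment between two planner-level waypoints (made-up numbers).
REF = hermite_reference((0.0, 0.0, 1000.0), (50.0, 0.0, 0.0),
                        (3000.0, 500.0, 1500.0), (50.0, 10.0, 0.0), 60.0)
```

The planner would issue only the discrete endpoints; the controller would then sample `REF(t)` at its own loop rate.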
My long-term research goal is to apply this research in CIRCA to the problem of achieving
safe, fully-automated aircraft flight. For my thesis, I have proposed tests using an aircraft
simulator that will demonstrate how CIRCA can help a fully-automated aircraft achieve its
primary goal of remaining safe (i.e., not crashing), even in the presence of system failures
and environmental anomalies. Because I will not be able to develop a comprehensive
aircraft model during my thesis research, I hope to continue CIRCA and aircraft model
development past my thesis research, eventually implementing the system in a carefully-
monitored “real” aircraft. I have proposed future work necessary to allow the automated
aircraft to incorporate knowledge regarding “unexpected” situations and proper responses
to these situations from the large databases of NTSB accident reports. When this work is
complete, I predict that the fully-automated aircraft will be better “trained” than human
pilots, and thus that the safety of the fully-automated aircraft will also surpass that of a
human-piloted aircraft. At that point, I will be able to argue much more strongly for “taking
the pilots out of the cockpit”, although by then I may also be very old.
Table 6-1 summarizes the tasks I hope to complete before graduating. The first column
shows a list of tasks to be accomplished, in the order they are to be tackled. As advertised
in the introduction, I will not be promising completion dates, because I have always
underestimated the time required in previous scheduling attempts. Instead, I provide a final
column describing how I will know that each task is sufficiently complete for my thesis.
Table 6-1. Proposed Research Task Summary.
Task Description | Non-specific Completion Date (0-9) | Task is done when:
-----------------------------------------------------------------------------
Classify "unhandled" states; detect and react to important classes of them | x | (done)
Build initial state probability model | x | (done)
Interface CIRCA to ACM flight simulator | x | (done)
Build initial CIRCA Abstraction module | x x | Abstraction code split from CIRCA RTS
Implement CIRCA Dispatcher/RTS Cache | x | Plans are stored and fetched as dictated by planner
Complete/implement planner time bounding algorithm (Section 3.3.2) | x | Planner computes time limit, design-to-time parameters, and imposes anytime limit on planning
Test CIRCA’s ability to respond in a timely fashion using the appropriate combination of the RTS cache, Dispatcher, and time-limited Planner | x | Aircraft switches appropriately between plans for Engine Failure (RTS Cache and Replanning if time), Bad Weather (Dispatcher then Replanning), and Depressurization (Replanning only)
Develop/implement primitive ABSTRIPS-like subgoaling in planner | x | ABSTRIPS-like code implemented (not a major research item)
Test subgoaling with aircraft simulator | x | Desired “waypoint” subgoals are developed for normal flight and anomalies (i.e., when replanning for unhandled states)
Complete/implement state timestamp algorithm | x | Algorithm based on that in Section 3.3.1 has been implemented
Complete/implement time-based numerical model into action scoring and probability computations (Section 4.4) | x | Action scoring utility implemented & probability model acceptably modified to handle numerical features
Test new time stamp, action scoring, and probability algorithms | x | Aircraft uses new algorithms to respond appropriately to low fuel, cabin depressurization, engine failure
Perform final tests of “complete” CIRCA in flight simulation | x | CIRCA successfully controls the aircraft for any modeled combination of traffic, gear, fuel, depressurization, engine failure, and bad weather emergencies
Write thesis | x x x x | I am called "Dr."
I feel my thesis research will contribute most significantly to the “real-time AI”
community. Other researchers have addressed issues associated with time-bounded
planning or time-bounded plan execution, but few have addressed the two simultaneously.
CIRCA already schedules plans to meet the real-time execution deadlines computed during
planning. To improve CIRCA’s ability to react quickly and accurately in complex
domains, I have proposed a combination of online and offline planning, caching critical
responses in advance, and employing an algorithm to compute and impose planner
deliberation time limits. Whenever a design-to-time or anytime algorithm is used to limit
planning, one basic question arises: “What happens if the planner doesn’t even have
enough time to compute an approximate plan?” I have directly addressed this issue with the
CIRCA plan cache, which will contain plans to handle the states requiring fastest response
times (e.g., members of the “imminent failure” set). In this fashion, CIRCA actively
increases available deliberation time, minimizing the chance that available time will expire
before CIRCA can create at least a minimal plan.
I believe it is always important to simultaneously consider the theoretical and practical
implications of system design. I have approached my research from both sides, working to
develop a realistic model of the fully-automated flight problem, and also considering the
more theoretical issues that must be resolved to achieve the computational accuracy and
efficiency needed to fully automate any complex dynamic system. By carefully
studying the operation of current flight management systems (designed primarily by control
engineers) while developing “better, faster” planning and plan execution systems, I feel I
will be able to help bridge the gap between control and AI planning researchers, who rarely
collaborate because they don’t seem to understand each other (except in ATL, of course).
=====================================================
CHAPTER 7
REFERENCES
=====================================================
[1] E. M. Atkins, E. H. Durfee, and K. G. Shin, "Detecting and Reacting to Unplanned-
for World States," Proceedings of AAAI Fall Symposium on Plan Execution: Problems
and Issues, pp. 1-7, November 1996.
[2] E. M. Atkins, E. H. Durfee, and K. G. Shin, "Plan Development in CIRCA using
Local Probabilistic Models," Uncertainty in Artificial Intelligence: Proceedings of the
Twelfth Conference, pp. 49-56, August 1996.
[3] C. Boutilier and R. Dearden, “Using Abstractions for Decision-Theoretic Planning
with Time Constraints,” Proceedings of the Twelfth National Conference on Artificial
Intelligence, pp. 1016-1022, 1994.
[4] D. J. Brudnicki and D. B. Kirk, “Trajectory Modeling for Automated En Route Air
Traffic Control (AERA),” Proceedings of the American Control Conference, pp. 3425-
3429, June 1995.
[5] A. R. Cassandra, L. P. Kaelbling, and M. L. Littman, "Acting Optimally in Partially
Observable Stochastic Domains," Proceedings of the Twelfth National Conference on
Artificial Intelligence, 1994.
[6] T. L. Dean, “Decision Theoretic Planning and Markov Decision Processes,” a tutorial
presented at the Summer Institute on Probability and Artificial Intelligence, Corvallis,
Oregon, 1994. (Found at http://www.cs.brown.edu/people/tld/ )
[7] T. L. Dean, L. P. Kaelbling, J. Kirman, and A. Nicholson, “Planning with Deadlines
in Stochastic Domains,” Proceedings of AAAI, pp. 574-579, July 1993.
[8] R. E. Fikes and N. J. Nilsson, “STRIPS: a new approach to the application of
theorem proving to problem solving,” Artificial Intelligence, vol. 2, no. 3-4, pp. 189-208,
1971.
[9] A. J. Garvey and V. R. Lesser, “Design-to-time real-time scheduling,” IEEE
Transactions on Systems, Man and Cybernetics, vol. 23, no. 6, pp. 1491-1502, 1993.
[10] M. L. Ginsberg, "Universal Planning: An (Almost) Universally Bad Idea," AI
Magazine, vol. 10, no. 4, 1989.
[11] F. F. Ingrand and M. P. Georgeff, "Managing Deliberation and Reasoning in Real-
Time AI Systems," in Proc. Workshop on Innovative Approaches to Planning, Scheduling
and Control, pp. 284-291, November 1990.
[12] E. Horvitz and M. Barry, “Display of Information for Time-Critical Decision
Making,” Proceedings of UAI-95, August 1995.
[13] C. M. Krishna and K. G. Shin, Real-Time Systems, McGraw-Hill, 1996.
[14] B. C. Kuo, Automatic Control Systems, sixth edition, Prentice-Hall, Englewood
Cliffs, New Jersey, 1991.
[15] N. K. Kushmerick, S. Hanks, and D. Weld, “An Algorithm for Probabilistic Least-
Commitment Planning,” Proc. of AAAI, pp. 1073-1078, July 1994.
[16] D. A. Lawrence and W. J. Rugh, “Gain Scheduling Dynamic Linear Controllers for a
Nonlinear Plant,” Automatica, vol. 31, no. 3, pp. 381-390, March 1995.
[17] S. Liden, “The Evolution of Flight Management Systems,” Proceedings of the 1994
IEEE/AIAA Thirteenth Digital Avionics Systems Conference, IEEE, pp. 157-169, 1995.
[18] M. L. Littman, T. L. Dean, and L. P. Kaelbling, “On the Complexity of Solving
Markov Decision Problems,” Proceedings of UAI-95, August 1995.
[19] C. B. McVey, “Development of Feedback for Real-Time Scheduling and Planning in
CIRCA,” Directed Study Report, University of Michigan, December 1996.
[20] D. J. Musliner, E.H. Durfee, and K.G. Shin, "World Modeling for the Dynamic
Construction of Real-Time Control Plans", Artificial Intelligence, vol. 74, no. 1, pp. 83-
127, 1995.
[21] D. J. Musliner, “Scheduling Issues Arising from Automated Real-Time System
Design,” University of Maryland Technical Report CS-TR-3364, UMIACS-TR-94-118,
1994.
[22] D. J. Musliner, “CIRCA: The Cooperative Intelligent Real-Time Control
Architecture,” Ph.D. Thesis, The University of Michigan, Ann Arbor, MI, 1993.
[23] NTSB/ARC-94/02, Annual Review of Aircraft Accident Data: U.S. Air Carrier
Operations Calendar Year 1992, National Transportation Safety Board, June 1994.
[24] J. R. Quinlan, "Induction of Decision Trees," Machine Learning, vol. 1, pp. 81-106,
1986.
[25] R. Rainey, ACM: The Aerial Combat Simulation for X11. February 1994.
[26] O. R. Reynolds, H. Pachter, and C. H. Houpis, “Full Envelope Flight Control
System Design using Quantitative Feedback Theory,” Journal of Guidance, Control, and
Dynamics, vol. 19, no. 1, pp. 23-29, January-February 1996.
[27] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice-
Hall, Englewood Cliffs, New Jersey, 1995.
[28] E. D. Sacerdoti, “Planning in a Hierarchy of Abstraction Spaces,” Artificial
Intelligence, vol. 5, no. 2, pp. 115-135, 1974.
[29] R. M. Sanner and J. J. E. Slotine, “Function Approximation, 'Neural' Networks,
and Adaptive Nonlinear Control,” Proceedings of the IEEE Conference on Control
Applications, vol. 2, pp. 1225-1232, 1994.
[30] M. J. Schoppers, "Universal Plans for Reactive Robots in Unpredictable
Environments," in Proc. Int'l Joint Conf. on Artificial Intelligence, pp. 1039-1046, 1987.
[31] J. M. Schreur, “B737 Flight Management Computer Flight Plan Trajectory
Computation and Analysis,” Proceedings of the American Control Conference, pp. 3419-
3429, June 1995.
[32] R. A. Slattery, “Terminal Area Trajectory Synthesis for Air Traffic Control
Automation,” Proceedings of the American Control Conference, pp. 1206-1210, June
1995.
[33] J. Tash and S. Russell, “Control Strategies for a Stochastic Planner,” Proceedings of
AAAI, vol. 2, pp. 1079-1085, 1994.
[34] D. Tilden, “GPS and Air Traffic Control: Start with a Clean Sheet of Paper,”
Proceedings of ION GPS, vol. 1, pp. 909-911, 1994.
[35] S. Zilberstein, "Real-Time Robot Deliberation by Compilation and Monitoring of
Anytime Algorithms," AAAI Conference, pp. 799-809, 1994.