

■ Automated planning technology has become mature enough to be useful in applications that range from game playing to control of space vehicles. In this article, Dana Nau discusses where automated-planning research has been, where it is likely to go, where he thinks it should go, and some major challenges in getting there. The article is an updated version of Nau’s invited talk at AAAI-05 in Pittsburgh, Pennsylvania.

In ordinary English, plans can be many different kinds of things, such as project plans, pension plans, urban plans, and floor plans.

In automated-planning research, the word refers specifically to plans of action:

representations of future behavior ... usually a set of actions, with temporal and other constraints on them, for execution by some agent or agents.1

One motivation for automated-planning research is theoretical: planning is an important component of rational behavior—so if one objective of artificial intelligence is to grasp the computational aspects of intelligence, then certainly planning plays a critical role. Another motivation is very practical: plans are needed in many different fields of human endeavor, and in some cases it is desirable to create these plans automatically. In this regard, automated-planning research has recently achieved several notable successes such as the Mars Rovers, software to plan sheet-metal bending operations, and Bridge Baron. The Mars Rovers (see figure 1) were controlled by planning and scheduling software that was developed jointly by NASA’s Jet Propulsion Laboratory and Ames Research Center (Estlin et al. 2003). Software to plan sheet-metal bending operations (Gupta et al. 1998) is bundled with Amada Corporation’s sheet-metal bending machines such as the one shown in figure 2. Finally, software to plan declarer play in the game of bridge helped Bridge Baron to win the 1997 world championship of computer bridge (Smith, Nau, and Throop 1998).

The purpose of this article is to summarize the current status of automated-planning research, and some important trends and future directions. The next section includes a conceptual model for automated planning, classifies planning systems into several different types, and compares their capabilities and limitations; and the Trends section discusses directions and trends.

Conceptual Model for Planning

A conceptual model is a simple theoretical device for describing the main elements of a problem. It may fail to address several of the practical details but still can be very useful for getting a basic understanding of the problem. In this article, I’ll use a conceptual model for planning that includes three primary parts (see figures 4a and 4b, which are discussed in the following sections): a state-transition system, which is a formal model of the real-world system for which we want to create plans; a controller, which performs actions that change the state of the system; and a planner, which produces the plans or policies that drive the controller.

Articles

WINTER 2007. Copyright © 2007, American Association for Artificial Intelligence. All rights reserved. ISSN 0738-4602

Current Trends in Automated Planning

Dana S. Nau

AI Magazine Volume 28 Number 4 (2007) (© AAAI)

State-Transition Systems

Formally, a state-transition system (also called a discrete-event system) is a 4-tuple Σ = (S, A, E, γ), where

S = {s0, s1, s2, …} is a set of states;

A = {a1, a2, …} is a set of actions, that is, state transitions whose occurrence is controlled by the plan executor;

E = {e1, e2, …} is a set of events, that is, state transitions whose occurrence is not controlled by the plan executor;

γ : S × (A ∪ E) → 2^S is a state-transition function.
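The 4-tuple defined above maps naturally onto ordinary data structures. Here is a minimal sketch in Python; the particular states, actions, and events are invented for illustration and are not from the article:

```python
# Sketch of a state-transition system Sigma = (S, A, E, gamma).
S = {"s0", "s1", "s2"}            # states
A = {"move1", "move2"}            # actions: controlled by the plan executor
E = {"breakdown"}                 # events: not controlled by the executor

# gamma maps (state, action-or-event) to the SET of possible next states,
# matching the signature gamma : S x (A ∪ E) -> 2^S.
gamma = {
    ("s0", "move1"): {"s1"},
    ("s1", "move2"): {"s2"},
    ("s1", "breakdown"): {"s0"},
}

def applicable(s, a):
    """An action a is applicable in state s iff gamma(s, a) is nonempty."""
    return bool(gamma.get((s, a), set()))
```

Representing γ as a set-valued mapping keeps nondeterminism available: a deterministic system is simply the special case where every value is a singleton.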

A state-transition system may be represented by a directed graph whose nodes are the states in S (for example, see figure 5). If s′ ∈ γ(s, e), where e ∈ A ∪ E is an action or event, then the graph contains a state transition (that is, an arc) from s to s′ that is labeled with the action or event e.2

If a is an action and γ(s, a) is not empty, then action a is applicable to state s: if the plan executor executes a in state s, this will take the system to some state in γ(s, a).

If e is an event and γ(s, e) is not empty, then e may possibly occur when the system is in state s. This event corresponds to the internal dynamics of the system, and cannot be chosen or triggered by the plan executor. Its occurrence in state s will bring the system to some state in γ(s, e).

Given a state-transition system Σ, the purpose of planning is to find which actions to apply to which states in order to achieve some objective, when starting from some given situation. A plan is a structure that gives the appropriate actions. The objective can be specified in


Figure 1. One of the Mars Rovers.

several different ways. The simplest specification consists of a goal state sg or a set of goal states Sg. For example, if the objective in figure 5 is to have the container loaded onto the robot cart, then the set of goal states is Sg = {s4, s5}. In this case, the objective is achieved by any sequence of state transitions that ends at one of the goal states. More generally, the objective might be to get the system into certain states, to keep the system away from certain other states, to optimize some utility function, or to perform some collection of tasks.

Planners

The planner’s input is a planning problem, which includes a description of the system Σ, an initial situation, and some objective. For example, in figure 5, a planning problem P might consist of a description of Σ, the initial state s0, and a single goal state s5.

The planner’s output is a plan or policy that solves the planning problem. A plan is a sequence of actions such as

⟨take, move1, load, move2⟩.

A policy is a partial function from states into actions, such as

{(s0, take), (s1, move1), (s3, load), (s4, move2)}.3

The aforementioned plan and policy both solve the planning problem P. Either of them, if executed starting at the initial state s0, will take Σ through the sequence of states ⟨s1, s3, s4, s5⟩.4
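The distinction between a plan and a policy can be made concrete in a few lines of code. This is a toy sketch assuming a deterministic transition function; the state and action names follow the figure-5 example in the text, but the encoding itself is invented:

```python
# Deterministic successor relation for this toy: gamma[(state, action)] = next state.
gamma = {
    ("s0", "take"): "s1",
    ("s1", "move1"): "s3",
    ("s3", "load"): "s4",
    ("s4", "move2"): "s5",
}

def run_plan(s, plan):
    """Execute a fixed action sequence; return the states visited."""
    states = []
    for a in plan:
        s = gamma[(s, a)]
        states.append(s)
    return states

def run_policy(s, policy):
    """Follow a state-to-action mapping until it is undefined."""
    states = []
    while s in policy:
        s = gamma[(s, policy[s])]
        states.append(s)
    return states

plan = ["take", "move1", "load", "move2"]
policy = {"s0": "take", "s1": "move1", "s3": "load", "s4": "move2"}
```

In a deterministic system the two produce the same trajectory; the policy's advantage appears when execution can land in unexpected states, since it prescribes an action for every state it covers rather than a fixed sequence.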

In general, the planner will produce actions that are described at an abstract level. Hence it may be impossible to perform these actions without first deciding some of the details. In many planning problems, some of these details include what resources to use and what time to do the action.

What Resources to Use. Exactly what is meant by a resource depends on how the problem is specified. For example, if Σ contained more than one robot, then one approach would be to require the robot’s name as part of the action (for example, move1(robot1) and move1(robot2)), and another approach would be to consider the robot to be a resource whose identity will be determined later.

What Time to Do the Action. For example, in


Figure 2. A Sheet-Metal Bending Machine.

(Figure used with permission of Amada Corporation).

order to load the container onto the robot, we might want to start moving the crane before the robot arrives at location1, but we cannot complete the load operation until after the robot has reached location1 and has stopped moving.

In such cases, one approach is to have a separate program called a scheduler that sits in between the planner and the controller (see figure 4c), whose purpose is to determine those details. Another approach is to integrate the scheduling function directly into the planner. The latter approach can substantially increase the complexity of the planner, but on complex problems it can be much more efficient than having a separate scheduler.

Controllers

The controller’s input consists of plans (or schedules, if the system includes a scheduler) and observations about the current state of the system. The controller’s output consists of actions to be performed in the state-transition system.

In figure 4, notice that the controller is online. As it performs its actions, it receives observations, each observation being a collection of sensor inputs giving information about Σ’s current state. The observations can be modeled as an observation function η : S → O that maps S into some discrete set of possible observations. Thus, the input to the controller is the observation o = η(s), where s is the current state.

If η is a one-to-one function, then from each observation o we can deduce exactly what state Σ is in. In this case we say that the observations provide complete information. For example, in figure 5, if there were a collection of sensors that always provided the exact locations of the


Figure 3. A Hierarchical Plan for a Finesse in Bridge.

robot and the container, then these sensors would provide complete information—or, at least, complete information for the level of abstraction used in the figure.

If η is not a one-to-one function, then the best we can deduce from an observation o is that Σ is in one of the states in the set η⁻¹(o) ⊆ S, and in this case we say that the observations provide incomplete information about Σ’s current state. For example, in figure 5, if we had a sensor that told us the location of the robot but not the location of the container, this sensor would provide incomplete information.
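The complete-versus-incomplete distinction reduces to whether η is injective, which is easy to check mechanically. A small sketch (the states and observation values are invented for illustration):

```python
# An observation function eta as a dict from states to observations.
eta = {"s0": "robot-at-loc1", "s1": "robot-at-loc2", "s2": "robot-at-loc2"}

def preimage(obs):
    """eta^{-1}(obs): the set of states consistent with an observation."""
    return {s for s, o in eta.items() if o == obs}

# Observations give complete information iff every preimage is a singleton.
complete = all(len(preimage(o)) == 1 for o in set(eta.values()))
```

Here "robot-at-loc2" is ambiguous between s1 and s2, so `complete` is false: the sensor reports the robot's location but not enough to pin down the full state.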

Offline and Online Planning

The planner usually works offline, that is, it receives no feedback about Σ’s current state. Instead, it relies on a formal description of the system, together with an initial state for the planning problem and the required objective. It is not concerned with the actual state of the system at the time that the planning occurs but instead with what states the system may be in when the plan is executing.

Most of the time, there are differences between Σ and the physical system it represents—for example, the state-transition system in figure 5 does not include the details of the low-level control signals that the controller will need to transmit to the crane and the robot. As a consequence, the controller and the plan must be robust enough to cope with differences between Σ and the real world. If the differences from what is expected in the plan become too great for the controller to handle, then more complex control mechanisms will be required than what we have described so far. For example, the planner (and scheduler, if there is one) may need feedback about the controller’s execution status (see the dashed lines in the figure), so that planning and acting can be interleaved. Interleaving planning and acting will require the planner and scheduler to incorporate mechanisms for supervision, revision, and regeneration of plans and schedules.

Types of Planners

Automated planning systems can be classified into the following categories, based on whether—and in what way—they can be configured to work in different planning domains: domain-specific planners, domain-independent planners, and domain-configurable planners.

Domain-specific planners are planning systems that are tailor-made for use in a given planning domain and are unlikely to work in other domains unless major modifications are made to the planning system. This class


Figure 4. Simple Conceptual Models for (a) Offline Planning, (b) Online Planning, and (c) Planning with a Separate Scheduler.

includes most of the planners that have been deployed in practical applications, such as the ones depicted in figures 1–3.

In domain-independent planning systems, the sole input to the planner is a description of a planning problem to solve, and the planning engine is general enough to work in any planning domain that satisfies some set of simplifying assumptions. The primary limitation of this approach is that it isn’t feasible to develop a domain-independent planner that works efficiently in every possible planning domain: for example, one probably wouldn’t want to use the same planning system to play bridge, bend sheet metal, or control a Mars Rover. Hence, in order to develop efficient planning algorithms, it has been necessary to impose a set of simplifying assumptions that are too restrictive to include most practical planning applications.

Domain-configurable planners are planning systems in which the planning engine is domain-independent but the input to the planner includes domain-specific knowledge to constrain the planner’s search so that the planner searches only a small part of the search space. Examples of such planners include HTN planners such as O-Plan (Tate, Drabble, and Kirby 1994), SIPE-2 (Wilkins 1988), and SHOP2 (Nau et al. 2003), in which the domain-specific knowledge is a collection of methods for decomposing tasks into subtasks, and control-rule planners such as TLPlan (Bacchus and Kabanza 2000) and TALplanner (Kvarnström and Doherty 2001), in which the domain-specific knowledge is a set of rules for how to remove nodes from the search space.5

The next two sections are overviews of domain-independent and domain-configurable planners. This article does not include a similar overview of domain-specific planners, since each of these planners depends on the specific details of the problem domain for which it was constructed.

Domain-Independent Planning

For nearly the entire time that automated planning has existed, it has been dominated by research on domain-independent planning. Because of the immense difficulty of devising a domain-independent planner that would work well in all planning domains, most research has focused on classical planning domains, that is, domains that satisfy the following set of restrictive assumptions:

Assumption A0 (Finite Σ). The system Σ has a finite set of states.

Assumption A1 (Fully Observable Σ). The system Σ is fully observable, that is, one has complete knowledge about the state of Σ; in this case the


Figure 5. A State-Transition System for a Simple Domain Involving a Crane and a Robot for Transporting Containers.

observation function η is the identity function.

Assumption A2 (Deterministic Σ). The system Σ is deterministic, that is, for every state s and event or action u, |γ(s, u)| ≤ 1. If an action is applicable to a state, its application brings a deterministic system to a single other state. Similarly for the occurrence of a possible event.

Assumption A3 (Static Σ). The system Σ is static, that is, the set of events E is empty. Σ has no internal dynamics; it stays in the same state until the controller applies some action.6

Assumption A4 (Attainment Goals). The only kind of goal is an attainment goal, which is specified as an explicit goal state or a set of goal states Sg. The objective is to find any sequence of state transitions that ends at one of the goal states. This assumption excludes, for example, states to be avoided, constraints on state trajectories, and utility functions.

Assumption A5 (Sequential Plans). A solution plan to a planning problem is a linearly ordered finite sequence of actions.

Assumption A6 (Implicit Time). Actions and events have no duration; they are instantaneous state transitions. This assumption is embedded in the state-transition model, which does not represent time explicitly.

Assumption A7 (Offline Planning). The planner is not concerned with any change that may occur in Σ while it is planning; it plans for the given initial and goal states regardless of the current dynamics, if any.

In summary, classical planning requires complete knowledge about a deterministic, static, finite system with restricted goals and implicit time. Here planning reduces to the following problem:

Given Σ = (S, A, γ), an initial state s0 and a subset of goal states Sg, find a sequence of actions corresponding to a sequence of state transitions (s0, s1, …, sk) such that s1 ∈ γ(s0, a1), s2 ∈ γ(s1, a2), …, sk ∈ γ(sk−1, ak), and sk ∈ Sg.

Classical planning may appear trivial: planning is simply searching for a path in a graph, which is a well-understood problem. Indeed, if we are given the graph Σ explicitly then there is not much more to say about planning for this restricted case. However, it can be shown (Ghallab, Nau, and Traverso 2004) that even in very simple problems, the number of states in Σ can be many orders of magnitude greater than the number of particles in the universe! Thus it is impossible in any practical sense to list all of Σ’s states explicitly. This establishes the need for powerful implicit representations that can describe useful subsets of S in a way that both is compact and can easily be searched.
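When the graph Σ is small enough to be given explicitly, the reduction to path search is literal. A breadth-first sketch (the toy graph below is invented for illustration, loosely following the crane-and-robot example):

```python
from collections import deque

def bfs_plan(gamma, s0, goals):
    """Shortest action sequence from s0 to any state in goals.
    gamma: dict mapping state -> list of (action, next_state) pairs."""
    frontier = deque([(s0, [])])
    visited = {s0}
    while frontier:
        s, plan = frontier.popleft()
        if s in goals:
            return plan
        for a, t in gamma.get(s, []):
            if t not in visited:
                visited.add(t)
                frontier.append((t, plan + [a]))
    return None  # no goal state reachable

gamma = {"s0": [("take", "s1")], "s1": [("move1", "s3")],
         "s3": [("load", "s4")], "s4": [("move2", "s5")]}
```

The catch, as the text notes, is that real problems never hand us `gamma` as an explicit dict; the state set is astronomically large, so everything interesting happens in how states and transitions are represented implicitly.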

The simplest representation for classical planning is a set-theoretic one: a state s is represented as a collection of propositions, the set of goal states Sg is represented by specifying a collection of propositions that all states in Sg must satisfy, and an action a is represented by giving three lists of propositions: preconditions to be met in a state s for an action a to be applicable in s, propositions to assert, and propositions to retract from s in order to get the resulting state γ(s, a). A plan is any sequence of actions, and the plan solves the planning problem if, starting at s0, the sequence of actions is executable, producing a sequence of states whose final state is in Sg.7
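The set-theoretic representation translates almost directly into code: a state is a set of propositions, and an action carries precondition, add, and delete sets. A minimal sketch (the proposition and action names are invented for illustration):

```python
# An action as three sets of propositions, per the set-theoretic representation.
def applicable(state, action):
    """a is applicable in s iff its preconditions all hold in s."""
    return action["pre"] <= state

def apply(state, action):
    """gamma(s, a): retract the delete list, then assert the add list."""
    return (state - action["del"]) | action["add"]

load = {"pre": {"at-robot-loc1", "container-at-loc1"},
        "add": {"container-on-robot"},
        "del": {"container-at-loc1"}}

s = {"at-robot-loc1", "container-at-loc1"}
```

Checking whether a plan solves a problem is then a fold: apply each action in order, failing if a precondition is unmet, and test whether the goal propositions are a subset of the final state.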

A more expressive representation is the classical representation: starting with a function-free first-order language L, a state s is a collection of ground atoms, and the set of goal states Sg is represented by an existentially closed collection of atoms that all states must satisfy. An operator is represented by giving two lists of ground or unground literals: preconditions and effects. An action is a ground instance of an operator. A plan is any sequence of actions, and the plan solves the planning problem if, starting at s0, the sequence of actions is executable, producing a sequence of states whose final state satisfies Sg. The de facto standard for classical planning is to use some variant of this representation.

Classical Planning Algorithms

In the following paragraphs, I provide brief summaries of some of the best-known techniques for classical planning: plan-space planning, planning graphs, state-space planning, and translation into other problems.

Plan-Space Planning. In plan-space planning, the basic idea is to plan for a set of goals {g1, …, gk} by planning for each of the individual goals more-or-less separately, but maintaining various bookkeeping information to detect and resolve interactions among the plans for the individual goals. For example, consider a simple domain called the Dock Worker Robots domain (Ghallab, Nau, and Traverso 2004), which includes piles of containers, robots that can carry containers to different locations, and cranes that can put containers onto robots or take them off of robots. Suppose there are several piles of containers as shown in state s0 of figure 6, and the objective is to rearrange them as shown in state sg. Then a plan-space planner such as UCPOP (Penberthy and Weld 1992) will produce a partially ordered plan like the one shown in the figure.

Planning Graphs. A planning graph is a structure such as the one shown in figure 7. For each n, level n includes every action a such that at level n − 1, a’s preconditions are satisfied and do not violate certain kinds of mutual-exclusion constraints. The literals at level n include the literals at level n − 1, plus all effects of all actions at level n. Thus the planning graph represents a relaxed version of the planning problem in which several actions can appear simultaneously even if they conflict with each other. The basic GraphPlan algorithm is as follows:

For n = 1, 2, ... until a solution is found,

Create a planning graph of n levels.

Do a backwards state-space search from the goal to try to find a solution plan, but restrict the search to include only the actions in the planning graph.

The planning graph can be computed relatively quickly (that is, in a polynomial amount of time), and the restriction that the backward search must operate within the planning graph dramatically improves the efficiency of the backward search. As a result, GraphPlan runs much faster than plan-space planning algorithms. Researchers have created a large number of planning algorithms based on GraphPlan, including IPP, CGP, DGP, LGP, PGP, SGP, TGP, and others.8
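The level-expansion step described above can be sketched in a few lines. This is the relaxed view only: mutual-exclusion bookkeeping is omitted, and the action names are invented for illustration:

```python
# Expand the literal levels of a (relaxed) planning graph.
# Each action is a (name, preconditions, effects) triple of frozensets/sets.
def expand_levels(literals0, actions, n):
    """Return literal levels 0..n; each level contains the previous one."""
    levels = [set(literals0)]
    for _ in range(n):
        lits = set(levels[-1])
        for name, pre, eff in actions:
            if pre <= levels[-1]:   # applicable to the previous level's literals
                lits |= eff         # add all of the action's effects
        levels.append(lits)
    return levels

actions = [("take(c)", {"clear(c)"}, {"holding(c)"}),
           ("put(c,p2)", {"holding(c)"}, {"on(c,p2)"})]
levels = expand_levels({"clear(c)"}, actions, 2)
```

Because levels only grow and there are finitely many literals, expansion reaches a fixpoint in polynomial time; the hard part of GraphPlan is the mutex-aware backward search, which this sketch leaves out.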

State-Space Planning. Although state-space search algorithms are very well known, it was not until a few years ago that they began to receive much attention from classical planning researchers, because nobody knew how to come up with a good heuristic function to guide the search. The breakthrough came when it was realized that heuristic values could be computed relatively quickly by extracting them from relaxed solutions (such as the planning graphs discussed earlier). This has led to planning algorithms such as HSP (Bonet and Geffner 1999) and FastForward (Hoffmann and Nebel 2001).
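One common relaxation, in the spirit of the heuristics used by HSP and FastForward, is to ignore delete lists and count how many expansion levels it takes for the goals to become reachable. A toy sketch (this is a generic delete-relaxation estimate, not the exact heuristic of either planner; proposition names are invented):

```python
def h_relaxed(state, goals, actions, max_levels=50):
    """Estimate goal distance by level-by-level reachability, ignoring deletes.
    actions: list of (preconditions, add_effects) pairs of sets."""
    reached = set(state)
    for level in range(max_levels + 1):
        if goals <= reached:
            return level
        new = set(reached)
        for pre, add in actions:      # delete lists deliberately ignored
            if pre <= reached:
                new |= add
        if new == reached:            # fixpoint: goals unreachable even relaxed
            return float("inf")
        reached = new
    return float("inf")

actions = [({"a"}, {"b"}), ({"b"}, {"c"})]
```

Used inside a forward state-space search (greedy or A*-style), such estimates are cheap to compute per node and informative enough to make heuristic search competitive, which was the breakthrough described above.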

Translation Into Other Problems. Here, the basic idea consists of three steps. First, translate the planning problem into another kind of combinatorial problem—such as satisfiability or integer programming—for which efficient problem solvers already exist. Second, use a satisfiability solver or integer-programming solver to solve the translated problem. Third, take the solution found by the problem solver and translate it into a plan. This approach has led to planners such as Satplan (Kautz and Selman 1992).
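To make the three-step idea concrete, here is a deliberately tiny one-step encoding with a brute-force solver standing in for a real SAT solver. All variable names and clauses are invented for illustration; a real translation would generate clauses per time step for preconditions, effects, frame axioms, and exclusions:

```python
from itertools import product

# Variables: p0 = "at loc1 at time 0", m0 = "move executed at time 0",
# q1 = "at loc2 at time 1". Clauses are CNF: lists of (variable, polarity).
variables = ["p0", "m0", "q1"]
clauses = [
    [("p0", True)],                    # initial state holds at time 0
    [("m0", False), ("p0", True)],     # m0 -> p0      (precondition axiom)
    [("m0", False), ("q1", True)],     # m0 -> q1      (effect axiom)
    [("q1", False), ("m0", True)],     # q1 -> m0      (frame/explanation axiom)
    [("q1", True)],                    # goal holds at time 1
]

def solve(clauses):
    """Brute-force model search, standing in for a real SAT solver."""
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))
        if all(any(model[v] == pol for v, pol in c) for c in clauses):
            return model
    return None
```

Step three, plan extraction, reads the action variables out of the model: any satisfying assignment here sets `m0` true, so the extracted plan is the single move action.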

Limitations of Classical Planning

For nearly the entire time that automated planning has existed, it has been dominated by research on classical planning. In fact, for many years the term domain-independent planning system was used almost synonymously with classical planning, as if there were no limitations on what kind of planning domains could be represented as classical planning domains. But since classical planning requires all of the restrictive assumptions in the Domain-Independent Planning section, it actually is restricted to a very narrow class of planning domains that exclude most problems of practical interest. For example, Mars exploration (figure 1) and sheet-metal bending (figure 2) satisfy none of the assumptions in the


Figure 6. During the Operation of a Plan-Space Planner, Each Node of the Search Space Is a Partial Plan like the One Shown Here.

On the left-hand side is the initial state s0, on the right-hand side is the goal state sg, and in the middle is a collection of actions and constraints. Planning proceeds by introducing more and more constraints or actions, until the plan contains no more flaws (that is, it is guaranteed to be executable and to produce the goal).

Domain-Independent Planning section, and the game of bridge (figure 3) satisfies only A0, A2, and A6.

On the other hand, several of the concepts developed in classical planning have been quite useful in developing planners for nonclassical domains. This will be discussed further in the Directions for Growth section.

Domain-Configurable Planners

Domain-specific and domain-configurable planners make use of domain-specific knowledge to constrain the search to a small part of the search space. As an example of why this might be useful, consider the task of traveling from the University of Maryland in College Park, Maryland, to the LAAS research center in Toulouse, France. We may have a number of actions for traveling: walking, riding a bicycle, roller skating, skiing, driving, taking a taxi, taking a bus, taking a train, flying, and so forth. We may also have a large number of locations among which one may travel: for example, all of the cities in the world. Before finding a solution, a domain-independent planner might first construct a huge number of nonsensical plans, such as the following one:

Walk from College Park to Baltimore, then bicycle to Philadelphia, then take a taxi to Pittsburgh, then fly to Chicago, ….

In contrast, anyone who has ever had much practical experience in traveling knows that there are only a few reasonable options to consider. For traveling from one location to another, we would like the planner to concentrate only on those options. To do this, the planner needs some domain-specific information about how to create plans.

In a domain-specific planner, the domain-specific information may be encoded into the planning engine itself. Such a planner can be quite efficient in creating plans for the target domain but will not be usable in any other planning domain. If one wants to create a planner for another domain—for example, planning the movements of a robot cart—one will need to build an entirely new planner.

In a domain-configurable planner, the planning engine is domain independent, but the input to the planner includes a domain description, that is, a collection of domain-specific


Figure 7. Part of a Planning Graph.

The figure omits several details (for example, mutual-exclusion relationships). The 0th level contains all literals that are true in the initial state s0. For each i, level i + 1 includes all actions that are applicable to the literals (or some subset of them) at level i and all of these actions’ effects.

knowledge written in a language that is understandable to the planning engine. Thus, the planning engine can be reconfigured to work in another problem domain by giving it a new domain description.

The existing domain-configurable planners can be divided into two types: hierarchical task network (HTN) planners and control-rule planners. These planners are discussed in the following two sections.

HTN Planners

In a hierarchical task network (HTN) planner, the planner’s objective is described not as a set of goal states but instead as a collection of tasks to perform.9 Planning proceeds by decomposing tasks into subtasks, subtasks into subsubtasks, and so forth in a recursive manner until the planner reaches primitive tasks that can be performed using actions similar to the actions used in a classical planning system. To guide the decomposition process, the planner uses a collection of methods that give ways of decomposing tasks into subtasks.
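The recursive decomposition described above can be sketched as a small SHOP-style planner. This toy version simplifies real HTN planners in two labeled ways: the state is never updated by primitive actions, and methods are just applicability tests paired with subtask lists; all task and method names are invented, loosely echoing the travel example:

```python
def plan_htn(state, tasks, methods, operators):
    """Decompose tasks left to right; return a primitive plan or None."""
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in operators:                        # primitive task: keep it
        tail = plan_htn(state, rest, methods, operators)
        return None if tail is None else [head] + tail
    for cond, subtasks in methods.get(head, []): # try each method in order
        if cond(state):
            result = plan_htn(state, subtasks + rest, methods, operators)
            if result is not None:
                return result
    return None                                  # no method decomposes this task

operators = {"get-taxi", "ride", "pay-driver", "buy-ticket", "fly"}
methods = {
    "travel": [
        (lambda s: s["distance"] <= 100, ["get-taxi", "ride", "pay-driver"]),
        (lambda s: s["distance"] > 100,
         ["buy-ticket", "travel-to-airport", "fly"]),
    ],
    "travel-to-airport": [(lambda s: True, ["get-taxi", "ride", "pay-driver"])],
}
```

The method preconditions play the role described in the text: for a long distance the taxi method is inapplicable, so the planner commits to the air-travel decomposition and recurses on its subtasks.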

As an illustration, figure 8 shows two methods for the task of traveling from one location to another: air travel and taxi travel. Traveling by air involves the subtasks of purchasing a plane ticket, traveling to the local airport, flying to an airport close to our destination, and traveling from there to our destination. Traveling by taxi involves the subtasks of calling a taxi, riding in it to the final destination, and paying the driver. The preconditions specify that the air-travel method is applicable only for long distances and the taxi-travel method is applicable only for short distances. Now, consider again the task of traveling from the University of Maryland to LAAS. Since this is a long distance, the taxi-travel method is not applicable, so we must choose the air-travel method. As shown in figure 9, this decomposes the task into the following subtasks: (1) purchase a ticket from IAD (Washington Dulles) airport to TLS (Toulouse Blagnac), (2) travel from the University of Maryland to IAD, (3) fly from IAD to TLS, and (4) travel from TLS to LAAS.

For the subtasks of traveling from the University of Maryland to IAD and traveling from TLS to LAAS, we can use the taxi-travel method to produce additional subtasks as shown in figure 9.
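The decomposition process just described can be sketched in a few lines of Python. This is a hypothetical, minimal forward decomposer in the spirit of SHOP-style HTN planning, not the code of any planner mentioned in this article; the `LONG` and `AIRPORT` tables are invented data standing in for the preconditions and the airport function a(x) of figure 8.

```python
# Minimal HTN-style decomposition (hypothetical sketch, not a real planner).
# A task is a tuple; a method maps a task to a list of subtasks, or returns
# None if its precondition fails. Tasks with no methods are primitive.

LONG = {("UMD", "IAD"): False, ("UMD", "LAAS"): True, ("TLS", "LAAS"): False}
AIRPORT = {"UMD": "IAD", "LAAS": "TLS"}   # a(x): nearest airport (assumed data)

def taxi_travel(task):
    _, x, y = task
    if LONG.get((x, y), False):           # Precond: short-distance(x, y)
        return None
    return [("get-taxi",), ("ride", x, y), ("pay-driver",)]

def air_travel(task):
    _, x, y = task
    if not LONG.get((x, y), False):       # Precond: long-distance(x, y)
        return None
    ax, ay = AIRPORT[x], AIRPORT[y]
    return [("get-ticket", ax, ay), ("travel", x, ax),
            ("fly", ax, ay), ("travel", ay, y)]

def get_ticket(task):
    _, ax, ay = task
    return [("find-flights", ax, ay), ("select-flight", ax, ay), ("buy-ticket",)]

METHODS = {"travel": [air_travel, taxi_travel], "get-ticket": [get_ticket]}

def decompose(tasks):
    """Recursively decompose tasks, backtracking over alternative methods."""
    if not tasks:
        return []
    first, rest = tasks[0], tasks[1:]
    if first[0] not in METHODS:           # primitive task: part of the plan
        tail = decompose(rest)
        return None if tail is None else [first] + tail
    for method in METHODS[first[0]]:      # try each method in turn
        subtasks = method(first)
        if subtasks is not None:
            plan = decompose(subtasks + rest)
            if plan is not None:
                return plan
    return None                           # no method worked: backtrack

plan = decompose([("travel", "UMD", "LAAS")])
```

Calling `decompose` on travel(UMD, LAAS) selects the air-travel method (long distance), expands get-ticket into its three subtasks, and expands the two ground trips with the taxi-travel method, yielding a fully primitive plan.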

HTN-planning research has been much more application-oriented than most other AI-planning research. Domain-configurable systems such as O-Plan (Tate, Drabble, and Kirby 1994), SIPE-2 (Wilkins 1988), and SHOP2 (Nau et al. 2003) have been used in a variety of applications, and domain-specific HTN planning systems have been built for several application domains (for example, Smith, Nau, and Throop [1998]).

Control-Rule Planners

In a control-rule planner, the domain-specific information is a set of rules describing conditions under which the current node can be pruned from the search space. In most cases, the planner does a forward search, starting from the initial state, and the control rules are written in some form of temporal logic. Don't read too much into the name temporal logic, because the logical formalisms used in these planners provide only a simple representation of time, as a sequence of states of

Articles

52 AI MAGAZINE

Method: taxi-travel(x, y)
  Task: travel(x, y)
  Precond: short-distance(x, y)
  Subtasks: get-taxi; ride(x, y); pay-driver

Method: air-travel(x, y)
  Task: travel(x, y)
  Precond: long-distance(x, y)
  Subtasks: get-ticket(a(x), a(y)); travel(x, a(x)); fly(a(x), a(y)); travel(a(y), y)

Method: get-ticket(x, y)
  Task: get-ticket(x, y)
  Subtasks: find-flights(x, y); select-flight(x, y, f); buy-ticket(f)

Figure 8. HTN Methods for a Simple Travel-Planning Domain.

In this example, the subtasks are to be done in the order that they appear.

the world: s0 = the initial state, s1 = γ(s0, a1), s2 = γ(s1, a2), …, where a1, a2, … are the actions of a plan.

As an example, the planning problem of figure 5 is in a simple planning domain called the Dock Worker Robots domain (Ghallab, Nau, and Traverso 2004), which includes piles of containers, robots that can carry containers to different locations, and cranes that can put containers onto robots or take them off of robots. In this domain, suppose that the only kind of goal that we have is to put various containers into various piles. Then the following control rule may be useful:

Don't ever pick up a container from the top of any pile p unless at least one of the containers in p needs to be in another pile.

Here is a way to write this rule in the temporal logic used by TLPlan:

□ [ top(p, x) ∧ ¬∃[y: in(y, p)] ∃[q: GOAL(in(y, q))] (q ≠ p) ⇒ ○ (¬∃[k: holding(k, x)]) ]

What this rule says, literally, is the following: in every state of the world, if x is at the top of p and there are no y and q such that (1) y is in p, (2) there's a goal saying that y should be in q, and (3) q ≠ p, then in the next state, no crane is holding x.

If a planner such as TLPlan reaches a state that does not satisfy this rule, it will backtrack and try a different direction in its search space.
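The effect of such a rule can be illustrated with a small Python sketch. This is a hypothetical rendering of the English rule above as a pruning test for a forward search, not TLPlan's actual syntax or machinery; the state representation (a dictionary from pile names to lists of containers) is invented for the example.

```python
# Hypothetical sketch of a control rule used as a pruning test in forward
# search. State: pile name -> list of containers (top of pile = last item).
# Goals: container -> pile where it must end up.

def rule_allows_pickup(state, goals, pile):
    """Allow picking up the top of `pile` only if some container in `pile`
    needs to be in a different pile (the control rule from the text)."""
    return any(goals.get(c) not in (None, pile) for c in state[pile])

def candidate_pickups(state, goals):
    """A forward search would expand only pickups that survive the rule."""
    return [p for p, containers in state.items()
            if containers and rule_allows_pickup(state, goals, p)]

state = {"p1": ["c1", "c2"],    # c2 sits on top of c1
         "p2": ["c3"]}
goals = {"c1": "p2"}            # only c1 needs to move (to pile p2)

pickups = candidate_pickups(state, goals)
```

Here picking up from p1 is allowed (c1 must eventually leave p1), but every pickup from p2 is pruned: nothing in p2 needs to be elsewhere, so that entire branch of the search space is cut off before it is explored.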

Comparisons

The three types of planners — domain-independent, domain-specific, and domain-configurable — compare with each other in the following respects (see table 1): (1) effort to configure the planner for a new domain, (2) performance in a given domain, and (3) coverage across many domains.

Effort to Configure the Planner for a New Domain. Domain-specific planners require the most effort to configure, because one must build an entire new planner for each new domain—an effort that may be quite substantial. Domain-independent planners (provided, of course, that the new domain satisfies the restrictions of classical planning) require the least effort, because one needs to write only definitions of the basic actions in the domain. Domain-configurable planners are somewhere in between: one needs to write a domain description but not an entire planner.

Performance in a Given Domain. A domain-independent planner will typically have the worst level of performance because it will not take advantage of any special properties of the domain. A domain-specific planner, provided that one makes the effort to write a good one, will potentially have the highest level of performance, because one can encode domain-specific problem-solving techniques directly


WINTER 2007 53

[Figure 9 shows a task-decomposition tree for the task travel(UMD, LAAS). The air-travel method decomposes it into get-ticket, travel to the departure airport, fly, and travel from the arrival airport to LAAS; the taxi-travel method further decomposes the two ground trips into get-taxi, ride, and pay-driver, and get-ticket decomposes into find-flights, select-flight, and buy-ticket. One branch of the tree attempts the decomposition via BWI and backtracks; the successful decomposition goes via IAD.]

Figure 9. Backtracking Search to Find a Plan for Traveling from UMD to LAAS.

The downward arrows indicate task decompositions; the upward ones indicate backtracking.

into the planner. A sufficiently capable domain-configurable planner should have nearly the same level of performance, because it should be possible to encode the same domain-specific problem-solving techniques into the domain description that one might encode into a domain-specific planner.10

Coverage across Many Domains. A domain-specific planner will typically work in only one planning domain, hence will have the least coverage. One might think that domain-independent and domain-configurable planners would have roughly the same coverage—but in practice, domain-configurable planners have greater coverage. This is due partly to efficiency and partly to expressive power.

As an example, let us consider the series of biennial International Planning Competitions, which were held in 1998, 2000, 2002, 2004, and 2006. All of the competitions included domain-independent planners; and in addition, the 2000 and 2002 competitions included domain-configurable planners. In the 2000 and 2002 competitions, the domain-configurable planners solved the most problems, solved them the fastest, usually found better solutions, and worked in many nonclassical planning domains that were beyond the scope of the domain-independent planners.

A Cultural Bias

If domain-configurable planners perform better and have more coverage than domain-independent ones, then why were there no domain-configurable planners in the 2004 and 2006 International Planning Competitions? One reason is that it is hard to enter them in the competition, because you must write all of the domain knowledge yourself. This is too much trouble except to make a point. The authors of TLPlan, TALplanner, and SHOP2 felt they had already made their point by demonstrating the high performance of their planners in the 2002 competition, hence they didn't feel motivated to enter their planners again.

So why not revise the International Planning Competitions to include tracks in which the necessary domain descriptions are provided to the contestants? There are two reasons. The first is that unlike classical planning, in which PDDL11 is the standard language for representing planning domains, there is no standard domain-description representation for domain-configurable planners. The second is that there is a cultural bias against the idea. For example, when Drew McDermott, in an invited talk at the 2005 International Conference on Automated Planning and Scheduling (ICAPS-05), proposed adding an HTN-planning track to the International Planning Competition, several audience members objected that the use of domain-specific knowledge in a planner somehow constitutes "cheating."

Whenever I've discussed this bias with researchers in fields such as operations research, control theory, and engineering, they generally have found it puzzling. A typical reaction has been, "Why would anyone not want to use the knowledge they have about a problem they're trying to solve?"

Historically, there is a very good reason for the bias: it was necessary and useful for the development of automated planning as a research field. The intended focus of the field is planning as an aspect of intelligent behavior, and it would have been quite difficult to develop this focus if the field had been tied too closely to any particular application area or set of application areas.

While the bias has been very useful historically, I would argue that it is not so useful any more: the field has matured, and the bias is too restrictive. Application domains in which humans want to do planning and scheduling typically have the following characteristics: a dynamically changing world; multiple agents (both cooperative and adversarial); imperfect and uncertain information about the world; the need to consult external information sources (sensors, databases, human users) to get information about the world; time durations,


           Up-Front Human Effort    Performance           Coverage
Highest    Domain-Specific          Domain-Specific       Domain-Configurable
           Domain-Configurable      Domain-Configurable   Domain-Independent
Lowest     Domain-Independent       Domain-Independent    Domain-Specific

Table 1. Relative Comparisons of the Three Types of Planners.

time constraints, overlapping and asynchronous actions; and numeric computations involving resources, probabilities, and geometric and spatial relationships. Classical planning excludes all of these characteristics.

Trends

Fortunately, automated-planning research is moving away from the restrictions of classical planning. As an example, consider the evolution of the International Planning Competitions: The 1998 competition concentrated exclusively on classical planning. The 2000 competition concentrated primarily on classical planning, but one of its tracks (one of the versions of the Miconic-10 elevator domain [Koehler and Schuster 2000]) included some nonclassical elements. The 2002 competition added some elementary notions of time durations and resources. The 2004 competition added inference rules and derived effects, plus a new track for planning in probabilistic domains. The 2006 competition added soft goals, trajectory constraints, preferences, plan metrics, and constraints expressed in temporal logic.

Another reason for optimism is the successful use of automated planning and scheduling algorithms in high-profile projects such as the Remote Agent (on Deep Space 1) (Muscettola et al. 1998) and the Mars Rovers (Estlin et al. 2003). Successes such as these create excitement about building planners that work in the real world; and applications such as the Mars Rovers provide opportunities for synergy between theory and applications: a better understanding of real-world planning leads to better theories, and better theories lead to better real-world planners.

Finally, automated-planning research has produced some very powerful techniques for reducing the size of the search space, and these techniques can be generalized to work in nonclassical domains. Examples include partial-order planning, HTN planning, and planning in nondeterministic and probabilistic domains.

Partial-Order Planning. The planning algorithms used in the Remote Agent and the Mars Rovers are based on the plan-space planning technique described in the Classical Planning Algorithms subsection. Some of the primary extensions are ways to reason about time, durations, and resources.

HTN Planning. As I discussed earlier, HTN planners have been used in a variety of application domains. Although most HTN planners have been influenced heavily by concepts from classical planning, they incorporate capabilities that go in various ways beyond the restrictions of classical planning.

Planning in Nondeterministic and Probabilistic Domains. In probabilistic planning domains such as Markov Decision Processes (Boutilier, Dean, and Hanks 1996), the actions have multiple possible outcomes, with probabilities for each outcome (see figure 10a). Nondeterministic planning domains (Cimatti et al. 2003) are similar except that no probabilities are attached to the outcomes (see figure 10b).
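To make the probabilistic model concrete, here is a sketch of value iteration, the textbook dynamic-programming algorithm for MDPs, on a tiny domain modeled loosely on the robot-and-flat-tire example of figure 10. The state names, reward values, and discount factor are all invented for the illustration; they are not taken from any system described in this article.

```python
# Value iteration on a tiny MDP resembling figure 10 (states, rewards, and
# the discount factor are invented for illustration).
GAMMA = 0.9

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "at-loc-1": {"move-to-loc-2": [(0.999, "at-loc-2", 100.0),
                                   (0.001, "flat-tire", -10.0)]},
    "at-loc-2": {},    # desired outcome: absorbing
    "flat-tire": {},   # undesired outcome: absorbing
}

def value_iteration(transitions, gamma, eps=1e-6):
    """Compute state values by repeatedly applying the Bellman backup."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, acts in transitions.items():
            if not acts:                 # absorbing state: value stays 0
                continue
            best = max(sum(p * (r + gamma * V[t]) for p, t, r in outcomes)
                       for outcomes in acts.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

V = value_iteration(transitions, GAMMA)
```

The expected value of the move action from the initial state is 0.999 × 100 + 0.001 × (−10) = 99.89, which is what the iteration converges to. In a nondeterministic domain, where the 0.999 and 0.001 are unavailable, this kind of expectation cannot be computed, which is why nondeterministic planning uses different solution concepts (weak, strong, and strong cyclic plans).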

A series of recent papers has shown how to extend domain-configurable planning techniques to work in nondeterministic (Kuter et al. 2005) and probabilistic (Kuter and Nau 2005) planning domains. In comparison with previous algorithms for such domains, these new algorithms exhibit substantial performance advantages—advantages analogous to the ones for domain-configurable algorithms in classical planning domains (see the Domain-Configurable Planners section).

Directions for Growth

In my view, some of the more important directions for growth in the near future include planning in multiagent environments, reasoning about time, dynamic external information, acquiring domain knowledge, and cross-pollination with other fields. I'll discuss each of these in greater detail in the following subsections.

Planning in Multiagent Environments

Automated planning research has traditionally assumed that the planner is a monolithic program that solves the problem by itself. But in real-world applications, the planner is generally part of a larger system in which there are other agents, either human or automated or both. When these agents interact with the planner, it is important for the planner to recognize what those agents are trying to accomplish, in order to generate an appropriate response. Examples of such situations include mixed-initiative and embedded planning, assisted cognition, customer service hotlines, and computer games.

Reasoning about Time

Classical planning uses a trivial model of time, consisting of a linear sequence of instantaneous states s0, s1, s2, …; and several temporal logics do the same thing. A more comprehensive model of time would include (1) time durations for actions, overlapping actions, and actions whose durations depend on the conditions under which they are executed; (2) resource assignments and integrated planning/scheduling; (3) continuous change (for


example, vehicle movement); and (4) temporally extended goals (for example, conditions that must be maintained over time).

A healthy amount of growth is already occurring in this direction: for example, various forms of temporal planning were included in the last three International Planning Competitions. But several things remain to be done: for example, the PDDL language used in the competitions does not yet allow actions whose duration depends on the conditions under which they are executed.

Dynamic Environments. In most automated-planning research, the information available is assumed to be static, and the planner starts with all of the information it needs. In real-world planning, planners may need to acquire information from an information source, such as a web service, during planning and execution. This raises questions such as: What information to look for? Where to get it? How to deal with lag time and information volatility? What if the query for information causes changes in the world? If the planner does not have enough information to infer all of the possible outcomes of the planned actions, or if the plans must be generated in real time, then it may not be feasible to generate the entire plan in advance. Instead, it may

be necessary to interleave planning and plan execution.
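The interleaving of planning and execution can be sketched as a simple agent loop: plan the next action against the current (possibly incomplete) information, execute it, observe what actually happened, and plan again. The sketch below is a hypothetical toy, with the planner, executor, and sensor model supplied as functions; it is not drawn from any particular system.

```python
# Hypothetical sketch of interleaved planning and execution: instead of
# building the whole plan up front, the agent plans one action at a time,
# executes it, and observes the actual outcome before planning further.

def interleaved_agent(initial_state, goal_test, plan_next, execute):
    """plan_next(state) -> action or None; execute(state, action) -> the new
    state as actually observed (it may differ from what the planner expected)."""
    state, trace = initial_state, []
    while not goal_test(state):
        action = plan_next(state)
        if action is None:
            raise RuntimeError("no applicable action")
        state = execute(state, action)   # observe the real outcome
        trace.append(action)
    return trace

# Toy usage: walk toward position 5, but execution sometimes undershoots,
# so the number of steps cannot be known in advance.
outcomes = iter([1, 2, 3, 3, 4, 5])      # "sensor" readings after each step
trace = interleaved_agent(
    0,
    goal_test=lambda s: s >= 5,
    plan_next=lambda s: "step",
    execute=lambda s, a: next(outcomes),
)
```

Because the fourth step undershoots (the agent observes 3 twice), an up-front five-step plan would have failed; the interleaved loop simply replans from the observed state and takes six steps instead.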

Some of these questions, I believe, are sufficiently formalizable that a new track could be developed for them in the International Planning Competitions.

Acquiring Domain Knowledge. At many points in this article, I have emphasized the benefits of using domain knowledge during planning, but this leaves the problem of how to acquire this domain knowledge—whether through machine learning, human input, or some combination of the two. This is one of the least appreciated problems for automated-planning research; but in my view it is one of the most important: if we had good ways of acquiring domain knowledge, we could make planners hundreds of times more useful for real-world problems.

Researchers are starting to realize the importance of this problem. For example, at ICAPS-05 there was an informal "Knowledge Engineering Competition" that was more of an exposition than a competition: the participants exhibited GUIs for creating knowledge bases and ways for planners to learn domain knowledge automatically.

A promising source of planning knowledge is the immense collection of data that is now


[Figure 10: a robot at location 1 executes the action "move to location 2"; the desired outcome (the robot reaches location 2) has probability 0.999, and the undesired outcome (a flat tire) has probability 0.001.]

Figure 10. A Simple Example of an Action with a Probabilistic Outcome.

If the robot tries to move to location 2, there is a 99.9 percent chance that it will do so successfully and a 0.1 percent chance it will get a flat tire. An action with a nondeterministic outcome would be similar except that the probabilities would be omitted.

available on the web, which is likely to include information about plans in many different contexts. Can we data-mine these plans from the web?

Cross-Pollination with Other Fields. Various kinds of planning are studied in many different fields, including computer games, game theory, operations research, economics, psychology, sociology, political science, industrial engineering, systems science, and control theory. Some of the approaches and techniques developed in these fields have much in common. But it is difficult to tell what the relationships are because the research groups are often nearly disjoint, with different terminology, assumptions, and ideas of what's important. Provided that the appropriate bridges can be made among these fields, there is tremendous potential for cross-pollination. Markov decision processes and computational cultural dynamics are two examples.

Markov decision processes (MDPs) are used in operations research, control theory, and several different subfields of AI, including automated planning and reinforcement learning. The MDP models used in automated-planning research typically assume the rewards and probabilities are known, but in reinforcement learning (and often in OR and control theory) they are unknown. MDPs in automated-planning research generally have finitely many states with no good continuous approximations, hence use discrete optimization—but the MDP models used in OR and control theory include features such as infinitely many states, continuous sets of states, and actions, costs, and rewards that are differentiable functions, hence use linear and nonlinear optimization techniques. Many important problems are hybrids of these differing MDP models, and it would be interesting to combine and extend the techniques from the various fields.

We have instituted a new laboratory at the University of Maryland called the Laboratory for Computational Cultural Dynamics. It brings together faculty in computer science, political science, psychology, criminology, systems engineering, linguistics, and business. The objective is to develop the theory and algorithms needed for tools to support decision making in cultural contexts, to help understand how and why decision makers in various cultures make decisions. The potential benefits of such research include more effective cross-cultural interactions, better governance when different cultures are involved, recovery from conflicts and disasters, and improving quality of life in developing countries.

Conclusion

Automated-planning research has made great strides in recent years. Historically, the field was limited by its focus on classical planning—but the scope of the field is broadening to include a variety of issues that are important for planning in the real world. As a consequence, automated-planning techniques are finding increased use in practical settings ranging from space exploration to automated manufacturing. Some of the most important areas for future growth of the field include reasoning about other agents, temporal planning, planning in dynamic environments, acquiring domain knowledge, and the potential for cross-pollination with other fields.

Notes

1. Austin Tate (MIT Encyclopedia of the Cognitive Sciences, 1999).

2. Here we are using a Markov game model of a state-transition function, which assumes that actions and events cannot occur at the same time. To include cases where actions and events could occur simultaneously, we would need γ : S × A × E → 2^S.

3. Policies have traditionally been defined to be total functions rather than partial functions. But it is not always necessary to know what to do in every state of S, because the policy may prevent the system from ever reaching some of the states.

4. Whether to use a plan, a policy, or a more general structure, such as a conditional plan or an execution structure, depends on what kind of planning problem we are trying to solve. In the above example, a plan and a policy work equally well—but more generally, there are some policies that cannot be expressed as plans (for example, in environments where some of the actions have nondeterministic outcomes) and some plans that cannot be expressed as policies (for example, if we come to the same state twice and want to do something different the second time).

5. Note that for an HTN planner or control-rule planner to be domain-configurable, the planning engine must be domain-independent. For example, the version of Bridge Baron mentioned earlier was not domain-configurable because its HTN planning engine was specifically tailored for the game of bridge.

6. Technically, the name of this assumption is inaccurate, because the plan is intended precisely to change the state of the system. What the name means is that the system remains static unless controlled transitions take place.

7. This has also been called STRIPS-style representation, after an early planning system that used a similar representation scheme.

8. Anyone who wants to write a successor of GraphPlan may want to do so soon, before the supply of three-letter acronyms runs out!

9. In some HTN planners, goals can be specified as



tasks such as "achieve(g)," where g is the goal. But as shown in figure 8, tasks can also represent activities that do not correspond to goals in the classical sense.

10. By analogy, in writing a computer system, one can get the highest level of performance by writing assembly code—but one can get nearly the same level of performance with much less effort by writing in a high-level language.

11. See Drew McDermott’s web page, cs-www.cs.yale.edu/homes/dvm.

References

Bacchus, F., and Kabanza, F. 2000. Using Temporal Logics to Express Search Control Knowledge for Planning. Artificial Intelligence 116(1–2): 123–191.

Bonet, B., and Geffner, H. 1999. Planning as Heuristic Search: New Results. In Proceedings of the European Conference on Planning (ECP). Berlin: Springer-Verlag.

Boutilier, C.; Dean, T. L.; and Hanks, S. 1996. Planning under Uncertainty: Structural Assumptions and Computational Leverage. In New Directions in AI Planning, ed. M. Ghallab and A. Milani, 157–171. Amsterdam: IOS Press.

Cimatti, A.; Pistore, M.; Roveri, M.; and Traverso, P. 2003. Weak, Strong, and Strong Cyclic Planning via Symbolic Model Checking. Artificial Intelligence 147(1–2): 35–84.

Estlin, T.; Castaño, R.; Anderson, B.; Gaines, D.; Fisher, F.; and Judd, M. 2003. Learning and Planning for Mars Rover Science. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI). San Francisco: Morgan Kaufmann Publishers.

Ghallab, M.; Nau, D.; and Traverso, P. 2004. Automated Planning: Theory and Practice. San Francisco: Morgan Kaufmann Publishers.

Gupta, S. K.; Bourne, D. A.; Kim, K.; and Krishnan, S. S. 1998. Automated Process Planning for Sheet Metal Bending Operations. Journal of Manufacturing Systems 17(5): 338–360.

Hoffmann, J., and Nebel, B. 2001. The FF Planning System: Fast Plan Generation through Heuristic Search. Journal of Artificial Intelligence Research 14: 253–302.

Kautz, H., and Selman, B. 1992. Planning as Satisfiability. In Proceedings of the Tenth European Conference on Artificial Intelligence (ECAI), 359–363. Chichester, UK: John Wiley and Sons.

Koehler, J., and Schuster, K. 2000. Elevator Control as a Planning Problem. In Proceedings of the Fifth International Conference on AI Planning Systems (AIPS), 331–338. Menlo Park, CA: AAAI Press.

Kuter, U., and Nau, D. 2005. Using Domain-Configurable Search Control for Probabilistic Planning. In Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI). Menlo Park, CA: AAAI Press.

Kuter, U.; Nau, D.; Pistore, M.; and Traverso, P. 2005. A Hierarchical Task-Network Planner Based on Symbolic Model Checking. In Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), 300–309. Menlo Park, CA: AAAI Press.

Kvarnström, J., and Doherty, P. 2001. TALplanner: A Temporal Logic Based Forward Chaining Planner. Annals of Mathematics and Artificial Intelligence 30(1–4): 119–169.

Muscettola, N.; Nayak, P. P.; Pell, B.; and Williams, B. C. 1998. Remote Agent: To Boldly Go Where No AI System Has Gone Before. Artificial Intelligence 103(1–2): 5–47.

Nau, D.; Au, T.-C.; Ilghami, O.; Kuter, U.; Murdock, J. W.; Wu, D.; and Yaman, F. 2003. SHOP2: An HTN Planning System. Journal of Artificial Intelligence Research 20(December): 379–404.

Penberthy, J. S., and Weld, D. 1992. UCPOP: A Sound, Complete, Partial-Order Planner for ADL. In Proceedings of the Third International Conference on Knowledge Representation and Reasoning (KR-92). San Francisco: Morgan Kaufmann Publishers.

Smith, S. J. J.; Nau, D. S.; and Throop, T. 1998. Computer Bridge: A Big Win for AI Planning. AI Magazine 19(2): 93–105.

Tate, A.; Drabble, B.; and Kirby, R. 1994. O-Plan2: An Architecture for Command, Planning, and Control. San Mateo, CA: Morgan Kaufmann Publishers.

Wilkins, D. E. 1988. Practical Planning: Extending the Classical AI Planning Paradigm. San Mateo, CA: Morgan Kaufmann Publishers.

Dana Nau is an AAAI Fellow, a professor of both computer science and systems research at the University of Maryland, and the director of the university's Laboratory for Computational Cultural Dynamics. He received his Ph.D. from Duke University in 1979, where he was an NSF graduate fellow. He coauthored the automated-planning algorithms that enabled Bridge Baron to win the 1997 world computer bridge championship. His SHOP2 planning system has been used in hundreds of projects worldwide and won an award in the 2002 International Planning Competition. He has more than 300 publications and is coauthor of Automated Planning: Theory and Practice, the first comprehensive textbook on automated planning.
