cse 574 automated planning

CSE 574

Automated Planning

Daniel S. Weld

Agent Control: The Problem

Environment

Percepts Actions

What action next?

What’s Wrong with this Picture?

Motivating Domain #1

Motivating Domain #2

Applications (Current & Potential)

Scheduling problems with action choices as well as resource handling requirements Problems in supply chain management HSTS (Hubble Space Telescope scheduler) Workflow management

Autonomous agents RAX/PS (The NASA Deep Space planning agent)

Software module integrators VICAR (JPL image enhancing system); CELWARE (CELCorp) Test case generation (Pittsburgh)

Interactive decision support Monitoring subgoal interactions

Optimum AIV system Plan-based interfaces

E.g. NLP to database interfaces Plan recognition

Web-service composition

Lots of activity...Significant scale-up in

the last 4-5 years Before we could

synthesize about 5-6 action plans in minutes

Now, we can synthesize optimal 100-action durative plans in minutes

Further scale-up with domain-specific control

Significant strides in our understanding Rich connections between

planning and CSP(SAT) OR (ILP)

Vanishing separation between planning & Scheduling

New ideas for heuristic control of planners

Wide array of approaches for customizing planners with domain-specific knowledge

Today’s Agenda

AdministriviaApproaches to agent controlSimplifying assumptionsSTRIPS – the easiest caseOverview of methods

Administrivia

Class Goals Learn about planning Practice critical reading Gain experience as reviewers Gain experience leading discussion (PC Mtg) Extensible projects

Experience[ Quals / Generals ][ AAAI / ICAPS Papers ]

Grading

20% paper summaries 25% organization of your

class 20% class participation 35% project

Paper Summary ProcessOne [ two ] papers / classPost reviews to class b-board

One-line summary The most important ideas in the paper, and why The one or two largest flaws in the paper Identify two important, open research questions on

the topic, and why they matterStimulate class discussion Encourage detailed reading

Process for Student Class-LeadsYou pick an areaRead papers ahead of time (I provide list)Decide on best paper for class to read We meet to discuss strategy

T- (1 week) for an hour

We meet to discuss +/-

Project

Not too big But potential for quals or publication

Individual or groupLots of software available for quick startMore soon…

Administrivia

Add yourself to mailing listAnything else?

Agent-Control Approaches

Reactive Control Set of situation-action rules E.g.

1) if dog-is-behind-methen run-forward

2) if food-is-nearthen eat

Planning Reason about effect of combinations of actions “Planning ahead” Avoiding “painting oneself into a corner”

Ways to make “plans”

Generative Planning Reason from first principles (knowledge of actions) to

generate plan Requires formal model of actions

Case-Based Planning Retrieve old plan which worked for similar problem Revise retrieved plan for this problem

Reinforcement Learning Act randomly, noticing effects Learn reward, action models, policy

Generative Planning

Input Description of (initial state of) world (in some KR) Description of goal (in some KR) Description of available actions (in some KR)

Output Controller

E.g. Sequence of actionsE.g. Plan with loops and conditionalsE.g. Policy = f: states -> actions

Input Representation

Description of initial state of world E.g., Set of propositions: ((block a) (block b) (block c) (on-table a) (on-

table b) (clear a) (clear b) (clear c) (arm-empty))Description of goal (i.e. set of desired worlds)

E.g., Logical conjunction Any world that satisfies the conjunction is a goal (and (on a b) (on b c)))

Description of available actions

Simplifying Assumptions

Environment

Percepts Actions

What action next?

Static vs.

Dynamic

Fully Observable vs.

Partially Observable

Deterministic vs.

Stochastic

Instantaneous vs.

Durative

Full vs. Partial satisfaction

Perfectvs.

Classical Planning

EnvironmentStatic

Fully Observable Deterministic Instantaneous

Perfect

I = initial state G = goal state Oi(prec) (effects)

[ I ] Oi Oj Ok Om[ G ]

Real Class Focus

Durative Actions Simultaneous actions, events, deadline goals

Planning Under Uncertainty Modeling a robot’s (or softbot’s) sensors

[ I ] Oi

Static Deterministic Observable InstantaneousPropositional

“Classical Planning”

DynamicR

Durative

Continuous

Stochastic

rleaved

Policie

olicie

PartiallyObservable

rleaved

olicie

Broad Aims & Biases

AIM: We will concentrate on planning in deterministic, quasi-static and fully observable worlds Will start with “classical” domains;

but discuss handling durative actions and numeric constraints, as well as

replanning

Neo-Classical Planning

BIAS: To the extent possible, we shall shun brand-names and concentrate on unifying themes

Better understanding of existing plannersNormalized comparisons between plannersEvaluation of trade-offs provided by various design choices

Better understanding of inter-connectionsHybrid planners using multiple refinementsExplication of the connections between planning,CSP, SAT and ILP

Today’s Agenda

AdministriviaApproaches to agent controlSimplifying assumptionsSTRIPS – the easiest caseOverview of methods

Why care about “classical” Planning?

Most advances seen first in classical planning Many stabilized environments ~satisfy classical assumptions

It is possible to handle minor assumption violations through replanning and execution monitoring

“ This form of solution has the advantage of relying on widely-used (and often very efficient) classical planning technology” Boutilier, 2000

Techniques developed for classical planning often shed light on effective ways of handling non-classical planning worlds Most of the efficient techniques for handling non-classical scenarios

based on classical ideas/advances

How Represent Actions?Simplifying assumptions

Atomic time Agent is omniscient (no sensing necessary). Agent is sole cause of change Actions have deterministic effects

STRIPS representation World = set of true propositions Actions:

Precondition: (conjunction of literals)Effects (conjunction of literals)

north11 north12

W0 W2W1

STRIPS ActionsAction =function from world-stateworld-statePrecondition says where function definedEffects say how to change set of propositions

north11

north11precond: (and (agent-at 1 1)

(agent-facing north))

effect: (and (agent-at 1 2)

(not (agent-at 1 1)))

Note: str

ips doesn

allow deri

ved effec

you must b

e complet

Action Schemata

(:operator pick-up :parameters ((block ?ob1)) :precondition (and (clear ?ob1)

(on-table ?ob1) (arm-empty))

:effect (and (not (clear ?ob1)) (not (on-table ?ob1))

(not (arm-empty)) (holding ?ob1)))

Instead of defining: pickup-A and pickup-B and …

Define a schema:Note: strips doesn’t

allow derived effects;

you must be complete!}

Planning as Search

Initial State

Goal State

World states

Actions

The state satisfying the complete description of the initial conds

Any state satisfying the goal propositions

Forward-Chaining World-Space Search

InitialState Goal

Backward-Chaining Search Thru Space of Partial World-States

Problem: Many possible goal states are equally acceptable.

From which one does one search?

Initial State is completely defined

Plan Space

Forward chaining thru world-states

Backward chaining thru world-states

“Causal Link” Planning

Initial State

Goal State

Partially specified plans

Adding + deleting actions or constraints (e.g. <) to plan

The empty plan(Actually two dummy actions…)

A plan which when simulated achieves the goalNeed efficient way to evaluate quality (percentage ofpreconditions satisfied) of partial plan …Hence causal link datastructures

Plan-Space Search

pick-from-table(C)

pick-from-table(B)

pick-from-table(C)put-on(C,B)

How represent plans?How test if plan is a solution?

Planning as Search 3Graphplan

Phase 1 - Graph Expansion Necessary (insufficient) conditions for plan

existence Local consistency of plan-as-CSP

Phase 2 - Solution Extraction Variables

action execution at a time point

Constraints goals, subgoals achievedno side-effects between actions

Planning Graph

PropositionInit State

ActionTime 1

PropositionTime 1

ActionTime 2

Constructing the planning graph…

Initial proposition layer Just the initial conditions

Action layer i If all of an action’s preconds are in i-1 Then add action to layer I

Proposition layer i+1 For each action at layer i Add all its effects at layer i+1

Mutual Exclusion

Actions A,B exclusive (at a level) if A deletes B’s precond, or B deletes A’s precond, or A & B have inconsistent preconds

Propositions P,Q inconsistent (at a level) if all ways to achieve P exclude all ways to

achieve Q

Graphplan

Create level 0 in planning graphLoop

If goal contents of highest level (nonmutex)

Then search graph for solutionIf find a solution then return and

terminate

Else Extend graph one more level

A kind of double search: forward direction checks necessary

(but insufficient) conditions for a solution, ...

Backward search verifies...

Searching for a Solution

For each goal G at time t For each action A making G true @t

If A isn’t mutex with a previously chosen action, select it

If no actions work, backup to last G (breadth first search)

Recurse on preconditions of actions selected, t-1

PropositionInit State

ActionTime 1

PropositionTime 1

ActionTime 2

Dinner Date

Initial Conditions: (:and (cleanHands) (quiet))

Goal: (:and (noGarbage) (dinner) (present))

Actions:(:operator carry :precondition

:effect (:and (noGarbage) (:not (cleanHands)))(:operator dolly :precondition

:effect (:and (noGarbage) (:not (quiet)))(:operator cook :precondition (cleanHands)

:effect (dinner))(:operator wrap :precondition (quiet)

:effect (present))

Planning Graph noGarb

cleanH

dinner

present

cleanH

0 Prop 1 Action 2 Prop 3 Action 4 Prop

Are there any exclusions? noGarb

cleanH

dinner

present

cleanH

Do we have a solution? noGarb

cleanH

dinner

present

cleanH

Extend the Planning Graph noGarb

cleanH

dinner

present

cleanH

noGarb

cleanH

dinner

present

One (of 4) possibilities noGarb

cleanH

dinner

present

cleanH

noGarb

cleanH

dinner

present

cse 574 automated planning

Documents

574-074_spanish06deappguide_lowres (3)

scan0002 - dhaka university of engineering & technology...

(203)574 -8038 (203)574 8039

planning as satisfiability cse 574 april 8, 2003 dan weld

data-link layer and management protocols for...

continuous time and resource uncertainty cse 574 lecture...

computational learning theory -...

time & location sensitive messaging protocol for automated...

convocatoria 2020 residencias cundinamarca museo … ·...

al mumtaz 574

applied automated theorem proving cse 291 instructor: sorin...

automated whitebox fuzz testing - internet society€¦ ·...

cse 574 extracting, managing & personalizing web...

574...2号 574

6/14/2015 8:20 pm1 cse 574 extracting, managing &...

regulament 574 - 1972_1

duits cse gl en tl duits vbo-mavo-d - examenblad ·...

l'informador 574

cse 574 – artificial intelligence ii (nlp) ee 517

la crÓnica 574