
Post on 11-Jan-2016

Planning as Satisfiability

CS672

Outline

0. Overview of Planning

1. Modeling and Solving Planning Problems as SAT - SATPLAN

2. Improved Encodings using Graph Analysis - BLACKBOX

3. Improved Encodings using Compiled Control Knowledge

Overview of Planning

Find a sequence of operators that transform an initial state to a goal state

State = complete truth assignment to a set of variables (fluents)

Goal = partial truth assignment (set of states)

Action = a partial function State → State

• specified by three sets of variables: precondition, add list, delete list
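The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a state is represented by the set of fluents that are true (everything else is false), a goal lists only fluents that must be true, and the names `Action`, `apply`, and `satisfies` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    precondition: frozenset
    add_list: frozenset
    delete_list: frozenset

def apply(state, action):
    """Return the successor state, or None where the partial function is undefined."""
    if not action.precondition <= state:
        return None  # precondition does not hold in this state
    return (state - action.delete_list) | action.add_list

def satisfies(state, goal):
    """A goal is a partial assignment: it holds in every state that contains it."""
    return goal <= state

# Hypothetical one-plane example
fly = Action("fly(NY,LA)",
             precondition=frozenset({"at(NY)"}),
             add_list=frozenset({"at(LA)"}),
             delete_list=frozenset({"at(NY)"}))
start = frozenset({"at(NY)"})
after = apply(start, fly)
```

Applying `fly` a second time returns `None`, because its precondition no longer holds.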

Abundance of Negative Complexity Results

I. Domain-independent planning: PSPACE-complete

(Chapman 1987; Bylander 1991; Backstrom 1993)

II. Domain-dependent planning: NP-complete (Chenoweth 1991; Gupta and Nau 1992)

III. Approximate planning: NP-complete (Selman 1994)

Planning as Inference

• Planning as first-order theorem proving (Green 1969)

computationally infeasible

• STRIPS (Fikes & Nilsson 1971)

very hard

• Partial-order planning (modal truth criteria) (Tate 1977, Chapman 1985, McAllester 1991, Smith & Peot 1993)

can be more efficient, but still hard (Minton, Bresina, & Drummond 1994)

• SATPLAN: planning as propositional reasoning

Part 1: Modeling and Solving Planning Problems as SAT

SAT Encodings

Planning Problem -> Propositional CNF by axiom schemas

Discrete time, modeled by integers

• state predicates: indexed by time at which they hold

• action predicates: indexed by time at which action begins

• each action takes 1 time step

• many actions may occur at the same step
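One way to picture the time-indexed predicates is as a table that gives each ground atom, together with its time step, a positive integer, as SAT solvers commonly expect. The sketch below is an assumption about how such a table might look; the class `VarMap` and its methods are hypothetical names, not part of SATPLAN.

```python
class VarMap:
    """Assigns a positive integer to each time-indexed ground atom."""

    def __init__(self):
        self.ids = {}
        self.atoms = []  # atoms[k - 1] is the atom with id k

    def var(self, name, args, t):
        """Return the id of predicate name(args) at time t, creating it if needed."""
        key = (name, tuple(args), t)
        if key not in self.ids:
            self.atoms.append(key)
            self.ids[key] = len(self.atoms)
        return self.ids[key]

    def atom(self, literal):
        """Map a (possibly negated) literal back to its atom."""
        return self.atoms[abs(literal) - 1]

vm = VarMap()
at_ny_0 = vm.var("at", ["NY"], 0)          # state predicate: holds at time 0
fly_0 = vm.var("fly", ["NY", "LA"], 0)     # action predicate: begins at time 0
at_la_1 = vm.var("at", ["LA"], 1)          # effect holds one step later
```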

Encoding Conventions

• Actions imply preconditions and effects

fly(x,y,i) => at(x,i) & route(x,y) & at(y,i+1)

• Conflicting actions cannot occur at same time (A deletes a precondition of B)

fly(x,y,i) & y≠z => ¬fly(x,z,i)

• If something changes, an action must have caused it (Explanatory Frame Axioms)

at(x,i) & ¬at(x,i+1) => ∃y . route(x,y) & fly(x,y,i)

• Initial and final states hold

at(NY,0) & ... & at(LA,9) & ...
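As a rough illustration, the four conventions on this slide can be instantiated into CNF clauses for a tiny one-plane domain. The sketch assumes that route(x,y) is static (so it is resolved at instantiation time), encodes only the schemas listed here, and uses a hypothetical function name `encode` with string literals where a leading `-` marks negation.

```python
from itertools import permutations

def neg(atom):
    return "-" + atom

def encode(cities, routes, start, goal, horizon):
    """Return CNF clauses (lists of string literals) for the flight example."""
    # Initial and final states hold
    clauses = [[f"at({start},0)"], [f"at({goal},{horizon})"]]
    for i in range(horizon):
        for x, y in permutations(cities, 2):
            fly = f"fly({x},{y},{i})"
            if (x, y) not in routes:
                clauses.append([neg(fly)])  # route(x,y) is false, so fly is impossible
                continue
            # Actions imply preconditions and effects
            clauses.append([neg(fly), f"at({x},{i})"])
            clauses.append([neg(fly), f"at({y},{i + 1})"])
            # Conflicting actions: fly(x,y,i) & y != z => not fly(x,z,i)
            for z in cities:
                if (x, z) in routes and y < z:  # y < z emits each pair once
                    clauses.append([neg(fly), neg(f"fly({x},{z},{i})")])
        for x in cities:
            # Explanatory frame axiom: at(x,i) & not at(x,i+1) => some fly(x,y,i)
            leavers = [f"fly({x},{y},{i})" for y in cities if (x, y) in routes]
            clauses.append([neg(f"at({x},{i})"), f"at({x},{i + 1})"] + leavers)
    return clauses

clauses = encode(["LA", "NY"], {("NY", "LA")}, "NY", "LA", horizon=1)
```

A complete encoding would also need delete effects and the symmetric frame axiom; they are left out because the slide does not list them.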

Modeling Tricks

Can often dramatically reduce size of problem by modeling techniques

move(x,y,z,i) requires n^4 vars

pickup(x,y,i), putdown(x,z,i) requires 2n^3 vars

State-based encodings: eliminate all action variables (“compile away”)

at(x,i) & ¬at(x,i+1) => ∃y . route(x,y) & at(y,i+1)

at(x,i) & x≠y => ¬at(y,i)
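The variable counts quoted for move versus pickup/putdown can be checked with simple arithmetic. The sketch assumes n objects and, to match the n^4 and 2n^3 figures, a number of time steps on the order of n; the function names are hypothetical.

```python
def action_vars_move(n_objects, n_steps):
    # move(x, y, z, i): three object arguments plus a time index
    return n_objects ** 3 * n_steps

def action_vars_split(n_objects, n_steps):
    # pickup(x, y, i) and putdown(x, z, i): two predicates, two object arguments each
    return 2 * n_objects ** 2 * n_steps

n = 18  # the largest blocks-world size mentioned later in the deck
print(action_vars_move(n, n), action_vars_split(n, n))
```

Splitting the three-argument action into two two-argument actions removes one factor of n, which is where the reduction comes from.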

Solution to a Planning Problem

A solution is specified by any model (satisfying truth assignment) of the conjunction of the axioms describing the initial state, goal state, and operators

Easy to convert back to a STRIPS-style plan
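Converting a model back to a plan amounts to collecting the action atoms that are true and grouping them by time step. The following is a minimal sketch, assuming atoms are written as strings such as `fly(NY,LA,0)` with the time index last; `ACTION_NAMES` and `extract_plan` are hypothetical names.

```python
import re

ACTION_NAMES = {"fly"}  # assumption: which predicates denote actions

def extract_plan(model):
    """model maps atoms to True/False; returns one list of actions per time step."""
    steps = {}
    for atom, value in model.items():
        match = re.fullmatch(r"(\w+)\((.*),(\d+)\)", atom)
        if value and match and match.group(1) in ACTION_NAMES:
            name, args, t = match.group(1), match.group(2), int(match.group(3))
            steps.setdefault(t, []).append(f"{name}({args})")
    # Actions sharing a time step form one parallel step of the plan
    return [sorted(steps[t]) for t in sorted(steps)]

model = {"at(NY,0)": True, "fly(NY,LA,0)": True, "at(LA,1)": True,
         "fly(LA,SF,1)": True, "fly(LA,NY,1)": False}
plan = extract_plan(model)
```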

SATPLAN

[Diagram: SATPLAN pipeline. A problem description and a plan length are instantiated from axiom schemas into propositional clauses; SAT engine(s) find a satisfying model; the model is interpreted through the mapping to yield a plan]

SAT Algorithms

Systematic Search

• DP (Davis, Putnam, Logemann, Loveland) - backtrack search + unit propagation

• satz (Chu Min Li) - variable selection by forward checking: max unit props

• relsat (Bayardo) - dependency directed backtracking: add new clauses at dead-ends
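The basic systematic procedure, backtrack search plus unit propagation, might be sketched as follows. This is a teaching-sized version, not the code of any of the solvers named above: clauses are lists of nonzero integers (negative means negated), and the branching rule is deliberately naive.

```python
def unit_propagate(clauses, assignment):
    """Simplify clauses under assignment, repeatedly asserting unit clauses.
    Returns (clauses, assignment), or (None, None) on a conflict."""
    assignment = dict(assignment)
    while True:
        simplified, unit = [], None
        for clause in clauses:
            if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                continue  # clause already satisfied
            rest = [l for l in clause if abs(l) not in assignment]
            if not rest:
                return None, None  # every literal is false
            if len(rest) == 1:
                unit = rest[0]
            simplified.append(rest)
        if unit is None:
            return simplified, assignment
        assignment[abs(unit)] = unit > 0
        clauses = simplified

def dpll(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    clauses, assignment = unit_propagate(clauses, assignment or {})
    if clauses is None:
        return None
    if not clauses:
        return assignment
    var = abs(clauses[0][0])  # naive choice; satz and relsat choose far more carefully
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

cnf = [[1, 2], [-1, 3], [-2, -3], [-1, -2]]
model = dpll(cnf)
```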

Local Search

• Inspired by the Min-Conflicts algorithm (Adorf, Johnston, Minton, Philips, & Laird)

• GSAT (Selman), Walksat (Selman, Kautz & Cohen) - greedy local search + noise to escape minima
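The idea of greedy local search with noise can be illustrated with a simplified Walksat-style loop. This sketch is an assumption-laden simplification (it scores a flip by the total number of unsatisfied clauses rather than by break counts), and the parameter names `noise`, `max_flips`, and `seed` are illustrative.

```python
import random

def walksat(clauses, n_vars, max_flips=10000, noise=0.5, seed=0):
    """Return a satisfying assignment (dict var -> bool), or None if none was found."""
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(l)] == (l > 0) for l in clause)

    def unsat_after_flip(var):
        assign[var] = not assign[var]
        count = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]  # undo the trial flip
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)  # focus on one violated clause
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # noise step: random variable from the clause
        else:
            var = min((abs(l) for l in clause), key=unsat_after_flip)  # greedy step
        assign[var] = not assign[var]
    return assign if all(satisfied(c) for c in clauses) else None

cnf = [[1, 2], [-1, 3], [-2, -3], [-1, -2]]
model = walksat(cnf, n_vars=3)
```

Unlike the systematic procedures, a `None` result here says nothing about unsatisfiability; it only means the flip budget ran out.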

Planning Benchmark Test Set

Extension of Graphplan test set

blocks world - up to 18 blocks, 10^19 states

logistics - complex, highly-parallel transportation domain.

Logistics.d:

• 2,165 possible actions per time slot

• 10^16 legal configurations (2^2000 states)

• optimal solution contains 74 distinct actions over 14 time slots

Problems of this size never previously handled by state-space planning systems

Scaling Up Logistics Planning

[Figure: log solution time (0.01 to 10000) on rocket.a, rocket.b, log.b, log.a, log.c, and log.d for Graphplan, DP, DP/Satz, and Walksat]

Randomized Restarts

Solution: randomize the systematic solver

• Add noise to the heuristic branching (variable choice) function

• Cutoff and restart search after a fixed number of backtracks

In practice: rapid restarts with low cutoff can dramatically improve performance

(Gomes 1996; Gomes, Kautz, and Selman 1997, 1998)
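The two ingredients above, a randomized branching choice and a backtrack cutoff followed by a restart, can be combined in a small sketch. This is not the procedure from the cited papers, only an illustration under assumptions: clauses are lists of integers without empty clauses, and `cutoff`, `max_restarts`, and `seed` are hypothetical parameter names.

```python
import random

class Cutoff(Exception):
    """Raised when a run exceeds its backtrack budget."""

def simplify(clauses, lit):
    """Assert lit: drop satisfied clauses, shorten the rest. None on a conflict."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None
        out.append(reduced)
    return out

def search(clauses, model, rng, budget):
    if not clauses:
        return model
    # Randomized branching: a random literal from a random shortest clause
    shortest = min(len(c) for c in clauses)
    lit = rng.choice(rng.choice([c for c in clauses if len(c) == shortest]))
    # A unit clause forces its literal; otherwise try both polarities in random order
    choices = [lit] if shortest == 1 else rng.sample([lit, -lit], 2)
    for choice in choices:
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = search(reduced, model + [choice], rng, budget)
            if result is not None:
                return result
        budget[0] -= 1  # this branch failed: count one backtrack
        if budget[0] < 0:
            raise Cutoff
    return None

def solve_with_restarts(clauses, cutoff=20, max_restarts=100, seed=0):
    """Return a list of true literals, or None if a run proves unsatisfiability."""
    rng = random.Random(seed)
    for _ in range(max_restarts):
        try:
            return search(clauses, [], rng, [cutoff])
        except Cutoff:
            continue  # restart from scratch with fresh random choices
    raise RuntimeError("no answer within the restart limit")

cnf = [[1, 2], [-1, 3], [-2, -3], [-1, -2]]
model = solve_with_restarts(cnf)
```

Each run is still a complete search, so a run that finishes without hitting the cutoff gives a definite answer; the restarts only guard against runs that wander into an unlucky part of the search tree.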

Increased Predictability

[Figure: log solution time (0.01 to 10000) on rocket.a, rocket.b, log.b, log.a, log.c, and log.d for Satz and Satz/Rand]

What SATPLAN Shows

General propositional theorem provers can compete with state of the art specialized planning systems

• New, highly tuned variations of DP are surprisingly powerful

– result of sharing ideas and code in large SAT/CSP research community

– specialized engines can catch up, but by then new general techniques have emerged

• Radically new stochastic approaches to SAT can provide very low exponential scaling

– 2+ orders of magnitude speedup on hard benchmark problems

Why SATPLAN Works

More flexible than forward or backward chaining

• Systematic: most unit propagation at most highly constrained states

• Stochastic: iterative repair

Randomized algorithms less likely to get trapped along bad paths

Part 2: Improved Encodings by Graph Analysis: The BLACKBOX Planner

Graphplan

Planning as graph search (Blum & Furst 1995)

Set new paradigm for planning

Like SATPLAN...

• Two phases: instantiation of propositional structure, followed by search

Unlike SATPLAN...

• Interleaves instantiation and pruning of plan graph

• Employs specialized search engine

Graphplan - better instantiation

SATPLAN - better search

Graph Pruning

Graphplan instantiates in a forward direction, pruning unreachable nodes

• conflicting actions are mutex

• if all actions that add two facts are mutex, the facts are mutex

• if the preconditions for an action are mutex, the action is unreachable!

In logical terms: limited application of negative binary propagation

• given: ¬P V ¬Q, P V R V S V ...

• infer: ¬Q V R V S V ...
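The mutex rules on this slide can be sketched for a single level of the plan graph. The code assumes that every listed fact has at least one action adding it (no-ops included) and that action mutexes are already known; `fact_mutexes` and `reachable` are hypothetical names.

```python
from itertools import combinations

def fact_mutexes(adders, action_mutex):
    """adders: fact -> set of actions that add it at this level.
    action_mutex: set of frozenset({a, b}) pairs. Returns mutex fact pairs."""
    result = set()
    for p, q in combinations(sorted(adders), 2):
        # Facts are mutex if every way of adding both uses two mutex actions
        if all(a != b and frozenset({a, b}) in action_mutex
               for a in adders[p] for b in adders[q]):
            result.add(frozenset({p, q}))
    return result

def reachable(preconditions, fact_mutex):
    """An action is pruned as unreachable if two of its preconditions are mutex."""
    return not any(frozenset(pair) in fact_mutex
                   for pair in combinations(preconditions, 2))

# Hypothetical level: fly(NY,LA) deletes at(NY), so it conflicts with the no-op
adders = {"at(NY)": {"noop-at(NY)"}, "at(LA)": {"fly(NY,LA)"}}
action_mutex = {frozenset({"noop-at(NY)", "fly(NY,LA)"})}
mutex = fact_mutexes(adders, action_mutex)
```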

The Plan Graph

[Figure: plan graph with alternating layers of facts and actions; edges mark preconditions, add effects, delete effects, and mutually exclusive pairs]

Translation of Plan Graph

Fact => Act1 V Act2

Act1 => Pre1 & Pre2

¬Act1 V ¬Act2

[Figure: plan graph fragment with nodes Pre1, Pre2, Act1, Act2, and Fact]
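Read as clause schemas, the translation says three things: a fact implies the disjunction of the actions that add it, an action implies each of its preconditions, and two mutex actions cannot both be true. A minimal sketch of that reading follows; the function name `translate` and the string-literal format are assumptions.

```python
def translate(adders, preconditions, action_mutex):
    """Turn one plan-graph fragment into clauses; '-' marks a negated literal."""
    clauses = []
    for fact, acts in adders.items():
        clauses.append(["-" + fact] + sorted(acts))   # Fact => Act1 V Act2
    for act, pres in preconditions.items():
        for pre in sorted(pres):
            clauses.append(["-" + act, pre])          # Act => Pre, one clause each
    for a, b in sorted(tuple(sorted(pair)) for pair in action_mutex):
        clauses.append(["-" + a, "-" + b])            # not both mutex actions
    return clauses

clauses = translate(
    adders={"Fact": {"Act1", "Act2"}},
    preconditions={"Act1": {"Pre1", "Pre2"}},
    action_mutex={frozenset({"Act1", "Act2"})},
)
```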

General Limited Inference

Generated wff can be further simplified by consistency propagation techniques

Compact (Crawford & Auton 1996)

• unit propagation: is Wff inconsistent by resolution against unit clauses?

O(n)

• failed literal rule: is Wff + { P } inconsistent by unit propagation?

O(n^2)

• binary failed literal rule: is Wff + { P V Q } inconsistent by unit propagation?

O(n^3)

Complements domain specific limited inference

Discovers hidden local structure!
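The first two rules can be sketched directly. This is a plain, unoptimized illustration rather than the Compact implementation: unit propagation here rescans the clause list instead of using the indexing needed for the O(n) bound, and the function names are hypothetical.

```python
def unit_propagate(clauses):
    """Return the set of literals forced by unit propagation, or None if inconsistent."""
    forced = set()
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in forced for l in clause):
                continue  # clause already satisfied
            open_lits = [l for l in clause if -l not in forced]
            if not open_lits:
                return None  # all literals falsified
            if len(open_lits) == 1:
                forced.add(open_lits[0])
                changed = True
    return forced

def failed_literals(clauses):
    """Literals P for which Wff + {P} is refuted by unit propagation.
    Each one lets the simplifier add the unit clause for the opposite literal."""
    variables = sorted({abs(l) for c in clauses for l in c})
    return [l for v in variables for l in (v, -v)
            if unit_propagate(clauses + [[l]]) is None]

wff = [[-1, 2], [-1, -2], [1, 3]]
failed = failed_literals(wff)
```

Here unit propagation alone sets nothing in `wff`, while the failed literal rule discovers that variable 1 must be false and variable 3 must be true.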

General Limited Inference

Percent of variables set by each technique:

Problem   Vars    unit prop   failed lit   binary failed
bw.a      2452    10%         100%         100%
bw.b      6358    5%          43%          99%
bw.c      19158   2%          33%          99%
log.a     2709    2%          36%          45%
log.b     3287    2%          24%          30%
log.c     4197    2%          23%          27%
log.d     6151    1%          25%          33%

Blackbox

[Diagram: Blackbox pipeline. Components shown: STRIPS input, plan graph with mutex computation, translator, CNF, simplifier, CNF, general stochastic / systematic SAT engines, solution]

Blackbox Results

[Figure: log-scale axis from 0.01 to 10000 on rocket.a, rocket.b, log.a, log.b, log.c, and log.d for Graphplan, BB-walksat, BB-rand-sys, and Handcoded-walksat]

Applicability

When is the BlackBox approach not a good idea?

• when domain too large for propositional planning approaches

• when long sequential plans are needed

• when solution time dominated by reachability analysis (plan-graph generation), not extraction

• when optimal or near optimal planning not necessary

Part 3: Improved Encodings: Compiling Control Knowledge

Kinds of Control Knowledge

About domain itself

• a truck is only in one location

About good plans

• do not remove a package from its destination location

About how to search

• plan air routes before land routes

Expressing Knowledge

Such information is traditionally incorporated in the planning algorithm itself

– or in a special programming language

Instead: use additional declarative axioms

– (Bacchus 1995; Kautz 1998; Chen, Kautz, & Selman 1999)

• Problem instance: operator axioms + initial and goal axioms + control axioms

• Control knowledge = constraints on search and solution spaces

• Independent of any search engine strategy

Axiomatic Control Knowledge

State Invariant: A truck is at only one location

at(truck,loc1,i) & loc1 ≠ loc2 => ¬at(truck,loc2,i)

Optimality: Do not return a package to a location

at(pkg,loc,i) & ¬at(pkg,loc,i+1) & i<j => ¬at(pkg,loc,j)

Simplifying Assumption: Once a truck is loaded, it should immediately move

¬in(pkg,truck,i) & in(pkg,truck,i+1) & at(truck,loc,i+1) => ¬at(truck,loc,i+2)
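A control axiom of this kind is instantiated like any other axiom schema. As an illustration, the state invariant (a truck is at only one location) becomes one binary clause per truck, pair of distinct locations, and time step. The function name and the string-literal format below are assumptions made for the sketch.

```python
from itertools import combinations

def truck_location_invariant(trucks, locations, horizon):
    """Clauses for: at(truck,loc1,i) & loc1 != loc2 => not at(truck,loc2,i)."""
    return [[f"-at({t},{a},{i})", f"-at({t},{b},{i})"]
            for t in trucks
            for a, b in combinations(locations, 2)  # each unordered pair once
            for i in range(horizon + 1)]

clauses = truck_location_invariant(["truck1"], ["depot", "airport"], horizon=1)
```

Because the result is just more clauses, it can be appended to the problem encoding without changing the SAT engine.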

Adding Control Kx to SATPLAN

[Diagram: problem specification axioms and control knowledge axioms are instantiated into clauses, which pass through a SAT simplifier to yield a SAT “core” that is handed to the SAT engine]

As control knowledge increases, Core shrinks!

Tradeoffs of Control Knowledge

If the planning domain is inherently intractable, how can any amount of control knowledge make planning tractable?

• by reducing solution quality

• optimal planning - NP-Hard

• non-optimal - (maybe) Polynomial

Issue: speed / quality tradeoff

Case study: Control Knowledge in TLPLAN and BlackBox

• TLPLAN (Bacchus 1996): simple forward-chaining search with strong control rules

TLPlan

Temporal Logic Control Formula

Temporal Logic for Control

I. Rules involve only static information

II. Rules depend on the current state

III. Rules depend on the current state and require dynamic user-defined predicates

( at(obj1, loc1) => at(obj1, loc1) )

Category I Control Rules

Do NOT unload an object from an airplane unless the object is at its goal destination

[Figure: initial and goal states for an object a over the locations SFO, ORL, and NYC]
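Because this rule depends only on static information (the goal), it can be applied by filtering ground actions before any search. The sketch below assumes actions are tuples of the form (name, object, airplane, location) and that the action is called `unload-airplane`; both are hypothetical conventions, as is the goal used in the example.

```python
def prune_unloads(actions, goal_location):
    """Drop unload actions whose location is not the object's goal destination.
    goal_location maps object -> destination; objects without a goal are never unloaded."""
    return [a for a in actions
            if not (a[0] == "unload-airplane" and goal_location.get(a[1]) != a[3])]

actions = [("unload-airplane", "a", "plane1", "ORL"),
           ("unload-airplane", "a", "plane1", "NYC"),
           ("load-airplane", "a", "plane1", "SFO")]
kept = prune_unloads(actions, goal_location={"a": "NYC"})
```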

Pruning the Planning Graph: Category I Rules

[Figure: plan graph with alternating layers of facts and actions]

Effect of Graph Pruning

[Figure: number of nodes (0 to 10000) on log-a through log-d, original versus pruned plan graph]

Category II Control Rules

Do NOT move an airplane if there is an object in the airplane that needs to be unloaded at that location.

[Figure: object a and the locations SFO, ORL, and NYC]

Control by Adding Constraints

Control Rules

(in(pkg,p) & at(p,ORL)) => next(at(p,ORL))

Constraints Clauses

(¬x_i V ¬y_i V y_{i+1})

[Diagram: control rules are translated into constraint clauses and combined with the planning formula]

Blackbox with Control Knowledge (Logistics domain)

[Figure: time (sec), log scale from 1 to 10000, on log-a through log-e for blackbox, blackbox(I), and blackbox(I&II)]

Comparison between Blackbox and TLPlan (Parallel Plan Length)

[Figure: parallel plan length (0 to 35) on log-c, log-d, log-e, log-1, and log-2 for TLPlan and Blackbox]

Comparison between Blackbox and TLPlan (Running Time)

[Figure: time (sec), 0 to 80, on log-a through log-e for TLPlan and Blackbox(I&II)]

Comparison

TLPlan (without Control): intractable

TLPlan (with Control): fastest, but limited parallelism

Blackbox (without Control): slower, high parallelism

Blackbox (with Control): faster, high parallelism

Summary

Easy to encode domain-specific knowledge in the planning as satisfiability framework

• Key to order-of-magnitude scaling

• Propositional logic, temporal logic, ...

• Can be applied before/after SAT encoding

Can control time / quality tradeoff

• Power of underlying SAT engines gives option of finding higher quality solutions

Heuristics are independent from the SAT engine

• Can use same axioms for radically different problem solvers

How to Generate Control Kx

Introspection

• Try to capture “obvious” inferences that are hard to deduce

EBL (Minton, Kambhampati)

• Generalize trace of previous problem solving

Static analysis (Smith, Etzioni, Knoblock, Peot)

• Analyze operators

Inductive Logic Programming (Huang, Selman, Kautz)

• Find rules that hold for a set of previous high-quality solution plans

Conclusions

• Propositional approaches to Open-Loop planning using general SAT engines are highly competitive with specialized planning algorithms

• Synergy with Plan Graph approaches

• Can effectively employ purely declarative control knowledge

• Biggest limitation: domains where number of objects is too large to instantiate
