
© Daniel S. Weld 1

Logistics

• PS3
    Project
    Additional problem
• Reading
    Planning & CSP for “today”
    SATplan paper review for Wed


PSet 2

Student(s) = s is a student
Takes(s, c, q) = s takes c during quarter q
IsQtr(q) = q is a quarter
Passes(s, c) = s passes course c
Higher(h, g) = grade h is higher than g
IsGradeIn(g, c) = g is a grade in course c

Every student who takes French passes it

Needs time argument

Needs person & time args

∀ s, q [Student(s) ∧ IsQtr(q) ∧ Takes(s, French, q)] ⇒ Passes(s, French)

But what if Joe takes French in the fall, fails, and then doesn’t take French in the winter?
The antecedent is false… but P ⇒ Q = ¬P ∨ Q
So the formula is true?!?

What’s the fix??


PSet 2

Higher(h, g) = grade h is higher than g
GradeOf(g, s, c, q) = s has grade g in course c during q

The best score in Greek is always better than the best score in French

∀ g, q [∃ s GradeOf(g, s, French, q)] ⇒ [∃ t, h GradeOf(h, t, Greek, q) ∧ Higher(h, g)]


573 Topics

• Agency
• Problem Spaces
• Search
• Knowledge Representation
• Planning
• Uncertainty
• MDPs
• Supervised Learning
• Reinforcement Learning


Immediate Outline

• Constraint satisfaction
    Defn – factoring state spaces
    Backtracking policies
    Variable-ordering heuristics & preprocessing
• The planning problem
• Searching world states
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty


Constraint Satisfaction

• Kind of search in which
    States are factored into sets of variables
    Search = assigning values to these variables
    Structure of space is encoded with constraints
• Backtracking-style algorithms work
    E.g. DFS for SAT (i.e. DPLL)
• But other techniques add speed
    Propagation
    Variable ordering
    Preprocessing


Chinese Constraint Network

[Figure: constraint network for ordering a meal. Variables: Soup, Appetizer, Chicken Dish, Pork Dish, Vegetable, Seafood, Rice. Constraints: Total Cost < $30; Must be Hot & Sour; No Peanuts (twice); Not Chow Mein; Not Both Spicy.]


CSPs in the real world

• Scheduling Space Shuttle Repair
• Airport gate assignments
• Transportation Planning
• Supply-chain management
• Computer Configuration
• Diagnosis
• UI Optimization
• Etc...

[Figure: Adapting to Device Characteristics]


Binary Constraint Network

• Set of n variables: x1 … xn
• Value domains for each variable: D1 … Dn
• Set of binary constraints (also “relations”): Rij ⊆ Di × Dj
    Specifies which value pairs (xi, xj) are consistent

Example (map coloring):
• Variable for each country
• Each domain = 4 colors
• Rij enforces ≠


Binary Constraint Network

Partial assignment of values = tuple of pairs
    {...(x, a)…} means variable x gets value a...

Tuple = consistent if all constraints satisfied
Tuple = full solution if consistent + has all vars

Tuple {(xi, ai) … (xj, aj)} = consistent w/ a set of vars {xm … xn}
    iff ∃ am … an such that {(xi, ai)…(xj, aj), (xm, am)…(xn, an)} = consistent


N Queens

• Variables = board columns
• Domain values = rows
• Rij = {(ai, aj) : (ai ≠ aj) ∧ (|i-j| ≠ |ai-aj|)}
    e.g. R12 = {(1,3), (1,4), (2,4), (3,1), (4,1), (4,2)}

[Figure: board with three queens placed]

• {(x1, 2), (x2, 4), (x3, 1)} consistent with (x4)
• Shorthand: “{2, 4, 1} consistent with x4”
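The N-queens relation and the "consistent with" test can be sketched in Python as follows. This is a minimal illustration, not code from the lecture; the function names `queens_relation`, `consistent`, and `consistent_with` are hypothetical, and only the single-variable case of "consistent with a set of vars" is covered.

```python
from itertools import product


def queens_relation(i, j, n):
    """R_ij: the allowed (a_i, a_j) row pairs for queens in columns i and j."""
    return {(a, b) for a, b in product(range(1, n + 1), repeat=2)
            if a != b and abs(i - j) != abs(a - b)}


def consistent(assignment):
    """assignment maps column -> row; every assigned pair must satisfy R_ij."""
    cols = sorted(assignment)
    return all(assignment[i] != assignment[j]
               and abs(i - j) != abs(assignment[i] - assignment[j])
               for k, i in enumerate(cols) for j in cols[k + 1:])


def consistent_with(assignment, var, n):
    """True iff some row for column `var` extends the tuple consistently."""
    return any(consistent({**assignment, var: row}) for row in range(1, n + 1))
```

For the 4-queens example on the slide, `queens_relation(1, 2, 4)` reproduces R12, and `consistent_with({1: 2, 2: 4, 3: 1}, 4, 4)` holds because row 3 is still free for x4.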


CSP as a search problem?

• What are states? (nodes in graph)

• What are the operators? (arcs between nodes)

• Initial state?

• Goal test?

[Figure: board with three queens placed]


Chronological Backtracking (BT)
(e.g., depth first search)

[Figure: 6-queens board with queens and numeric labels 1 to 6]

Consistency check performed in the order in which vars were instantiated
If c-check fails, try next value of current var
If no more values, backtrack to most recent var
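A minimal Python sketch of chronological backtracking on N-queens, assuming columns are instantiated in the fixed order x1, x2, ... and rows are tried in increasing order. The helper `ok` and the `stats` counter are hypothetical names added for illustration.

```python
def ok(i, a, j, b):
    """Binary constraint R_ij: different rows and not on a shared diagonal."""
    return a != b and abs(i - j) != abs(a - b)


def backtrack(n, assignment=None, stats=None):
    """Return (rows, stats) for the first solution found, or None.

    assignment[k] is the row of the queen in column k + 1.
    """
    assignment = [] if assignment is None else assignment
    stats = {"checks": 0} if stats is None else stats
    col = len(assignment) + 1
    if col > n:
        return list(assignment), stats
    for row in range(1, n + 1):
        consistent = True
        # c-check against earlier vars, in the order they were instantiated
        for prev_col, prev_row in enumerate(assignment, start=1):
            stats["checks"] += 1
            if not ok(prev_col, prev_row, col, row):
                consistent = False
                break  # c-check failed: try next value of current var
        if consistent:
            assignment.append(row)
            result = backtrack(n, assignment, stats)
            if result is not None:
                return result
            assignment.pop()
    return None  # no more values: backtrack to most recent var
```

Calling `backtrack(4)` returns the solution `[2, 4, 1, 3]` together with the number of consistency checks performed; `backtrack(3)` returns `None`.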


Backjumping (BJ)

• Similar to BT, but more efficient when no consistent instantiation can be found for the current var
• Instead of backtracking to most recent var…
    BJ reverts to deepest var which was c-checked against the current var

BJ discovers (2, 5, 3, 6) inconsistent with x6
No sense trying other values of x5

[Figure: 6-queens board with five queens placed]

Conflict-Directed Backjumping (CBJ)

• More sophisticated backjumping behavior
• Each variable has conflict set CS
    Set of vars that failed c-checks w/ current val
    Update this set on every failed c-check
• When no more values to try for xi
    Backtrack to deepest var, xd, in CS(xi)
    And update CS(xd) := CS(xd) ∪ CS(xi) − {xd}

CBJ discovers (2, 5, 3) inconsistent with {x5, x6}

[Figure: 6-queens board, columns x1 … x6, with numeric labels on the squares]
CS(x5) = {1, 2, 3}
CS(x6) = {1, 2, 3, 5}
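The conflict-set bookkeeping can be sketched recursively in Python. This is one possible formulation under stated assumptions (fixed variable order x1 … xn, and only the first failed c-check per value is recorded); the names `cbj`, `solve`, and `first_conflict` are hypothetical.

```python
def ok(i, a, j, b):
    """Binary constraint R_ij for N-queens."""
    return a != b and abs(i - j) != abs(a - b)


def cbj(n):
    """Conflict-directed backjumping; returns {column: row} or None."""
    assignment = {}

    def first_conflict(col, row):
        # c-check earlier vars in instantiation order; report the first failure
        for prev in sorted(assignment):
            if not ok(prev, assignment[prev], col, row):
                return prev
        return None

    def solve(col):
        """Return (solution, None) on success or (None, conflict_set)."""
        if col > n:
            return dict(assignment), None
        conflict_set = set()
        for row in range(1, n + 1):
            culprit = first_conflict(col, row)
            if culprit is not None:
                conflict_set.add(culprit)  # update CS on every failed c-check
                continue
            assignment[col] = row
            solution, deeper = solve(col + 1)
            del assignment[col]
            if solution is not None:
                return solution, None
            if col not in deeper:
                # This var is not in the failing var's CS: jump over it.
                return None, deeper
            conflict_set |= deeper - {col}  # CS(xd) := CS(xd) ∪ CS(xi) − {xd}
        return None, conflict_set

    solution, _ = solve(1)
    return solution
```

Returning the conflict set to the caller means each ancestor decides for itself whether it is the deepest variable in the set, which is how the jump skips intermediate variables without an explicit stack.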


BT vs. BJ vs. CBJ



Forward Checking (FC)

• Perform consistency check forward
• Whenever a var is assigned a value
    Prune inconsistent values from as-yet unvisited variables
    Backtrack if domain of any var ever collapses

[Figure: 6-queens board with queens placed]

FC only visits consistent nodes, but not all such nodes
    Skips (2, 5, 3, 4), which CBJ visits
But FC can’t detect that (2, 5, 3) is inconsistent with {x5, x6}
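A minimal Python sketch of forward checking on N-queens, assuming the fixed column order x1, x2, ... The names `forward_check` and `solve` are hypothetical; domains are copied rather than restored on backtrack, which keeps the sketch short at some cost in memory.

```python
def ok(i, a, j, b):
    """Binary constraint R_ij for N-queens."""
    return a != b and abs(i - j) != abs(a - b)


def forward_check(n):
    """Return a list of rows (index k holds column k + 1), or None."""

    def solve(col, domains, assignment):
        if col > n:
            return assignment
        for row in sorted(domains[col]):
            # Prune values of as-yet unvisited columns that clash with (col, row).
            pruned = {c: {r for r in domains[c] if ok(col, row, c, r)}
                      for c in range(col + 1, n + 1)}
            if all(pruned.values()):  # an empty set means a domain collapsed
                result = solve(col + 1, {**domains, **pruned}, assignment + [row])
                if result is not None:
                    return result
        return None

    return solve(1, {c: set(range(1, n + 1)) for c in range(1, n + 1)}, [])
```

Because every value left in a domain has already been checked against all earlier assignments, no backward consistency check is needed when a value is chosen.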


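Number of Nodes Explored

[Figure: algorithms ordered from more to fewer nodes explored. Groups shown as equal: BT = BM; BJ = BMJ = BMJ2; CBJ = BM-CBJ = BM-CBJ2. Also shown: FC, FC-CBJ.]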


Number of Consistency Checks

[Figure: algorithms ordered from more to fewer consistency checks: BT, BJ, CBJ, BM, BMJ, BMJ2, BM-CBJ, BM-CBJ2, FC, FC-CBJ.]


Dynamic variable ordering

• In the N-queens examples we assumed
    First x1 then x2 then ...
• But this order is not required
    Any order is ok with respect to completeness
    A good order leads to huge speedup
• A good heuristic (MRV): choose variable w/ minimum remaining values
• This is easy if one is doing FC
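A sketch of forward checking with the MRV heuristic, assuming ties are broken by the lowest column number (an arbitrary choice made here for determinism). The name `fc_mrv` is hypothetical. It shows why MRV is easy under FC: the pruned domains are already at hand, so picking the smallest one is a single `min`.

```python
def ok(i, a, j, b):
    """Binary constraint R_ij for N-queens."""
    return a != b and abs(i - j) != abs(a - b)


def fc_mrv(n):
    """Forward checking with dynamic variable ordering; returns {col: row} or None."""

    def solve(domains, assignment):
        if len(assignment) == n:
            return assignment
        # MRV: choose the unassigned column with the fewest remaining rows.
        col = min((c for c in domains if c not in assignment),
                  key=lambda c: (len(domains[c]), c))
        for row in sorted(domains[col]):
            pruned = {c: {r for r in domains[c] if ok(col, row, c, r)}
                      for c in domains if c not in assignment and c != col}
            if all(pruned.values()):  # no future domain collapsed
                result = solve({**domains, **pruned}, {**assignment, col: row})
                if result is not None:
                    return result
        return None

    return solve({c: set(range(1, n + 1)) for c in range(1, n + 1)}, {})
```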


DVO MRV => WOW!!

Algo         17 Queens   21 Queens   27 Queens
FC-CBJmrv         1959        2572        5602
FC-CBJ           67090      114612      737008
FC               67329      115120     7448781
CBJ             428645      949128
BJ              436340      972065
BT              485597     1156015


Preprocessing Strategies

• Even FC-CBJ is O(b^d) time worst case
• Sometimes useful to preprocess
    Before doing exponential search, spend polynomial time to achieve local consistency
• Arc consistency
    Consider all pairs of vars
    Can values be eliminated from a domain, à la FC?
    Propagate
    O(d^2) time where d = number of vars
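One standard way to achieve arc consistency is the AC-3 procedure, sketched below. The slides do not name a specific algorithm, so treating AC-3 as the intended one is an assumption; `ac3`, `allowed`, and the example constraint `less_than_chain` are hypothetical names.

```python
from collections import deque


def ac3(domains, allowed):
    """Make every arc consistent.

    domains: var -> set of values; allowed(x, a, y, b) -> bool is the
    binary constraint between x=a and y=b. Returns pruned domains, or
    None if some domain collapses.
    """
    domains = {v: set(d) for v, d in domains.items()}
    queue = deque((x, y) for x in domains for y in domains if x != y)
    while queue:
        x, y = queue.popleft()
        # Values of x with no supporting value in y can be eliminated.
        unsupported = {a for a in domains[x]
                       if not any(allowed(x, a, y, b) for b in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return None
            # Propagate: arcs pointing at x must be rechecked.
            queue.extend((z, x) for z in domains if z not in (x, y))
    return domains


ORDER = {"x": 0, "y": 1, "z": 2}


def less_than_chain(u, a, v, b):
    """Example constraint x < y < z; non-adjacent pairs are unconstrained."""
    if abs(ORDER[u] - ORDER[v]) != 1:
        return True
    return a < b if ORDER[u] < ORDER[v] else a > b
```

With all three domains equal to {1, 2, 3}, `ac3` under `less_than_chain` narrows them to x = {1}, y = {2}, z = {3} without any search.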


Constraint Satisfaction Recap

• CSP = Factoring a state space
• Chronological Backtracking (BT)
• Backjumping (BJ)
• Conflict-Directed Backjumping (CBJ)
• Forward checking (FC)
• Dynamic variable ordering heuristics
• Preprocessing Strategies


Immediate Outline

• The planning problem
• Searching world states
• Constraint satisfaction
• Graphplan
• SATplan
• Reachability analysis & heuristics



Ways to make “plans”

Generative Planning
    Reason from first principles (knowledge of actions)
    Requires formal model of actions

Case-Based Planning
    Retrieve old plan which worked on similar problem
    Revise retrieved plan for this problem

Reinforcement Learning
    Act “randomly”, noticing effects
    Learn reward, action models, policy


Generative Planning

Input
    Description of (initial state of) world (in some KR)
    Description of goal (in some KR)
    Description of available actions (in some KR)

Output
    Controller
        E.g. Sequence of actions
        E.g. Plan with loops and conditionals
        E.g. Policy = f: states -> actions


Input Representation

• Description of initial state of world
    E.g., set of propositions:
    ((block a) (block b) (block c) (on-table a) (on-table b)
     (clear a) (clear b) (clear c) (arm-empty))

• Description of goal: i.e. set of worlds or ??
    E.g., logical conjunction
    Any world satisfying conjunction is a goal
    (and (on a b) (on b c))

• Description of available actions


Simplifying Assumptions

[Figure: agent and Environment exchanging Percepts and Actions; the agent asks “What action next?”]

• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Perfect vs. Noisy


Classical Planning

Environment: Static, Fully Observable, Deterministic, Instantaneous, Full, Perfect

I = initial state    G = goal state    Oi: (prec) → (effects)

[ I ] → Oi → Oj → Ok → Om → [ G ]


Static, Deterministic, Observable, Instantaneous, Propositional = “Classical Planning”

[Figure: relaxations of the classical assumptions, each labeled with the techniques that address it]
• Dynamic: Replanning / Situated Plans
• Durative: Temporal Reasoning
• Continuous: Numeric Constraint Reasoning (LP/ILP)
• Stochastic: Contingent/Conformant Plans, Interleaved Execution; MDP Policies; POMDP Policies
• Partially Observable: Contingent/Conformant Plans, Interleaved Execution; Semi-MDP Policies


Today’s Hot Research Areas

• Durative Actions
    Simultaneous actions, events, deadline goals
• Planning Under Uncertainty
    Modeling sensors; searching belief states

[Figure: plan from [ I ] through actions Oi, Oj, Ok, with a “?” branch to alternative actions Oa, Ob, Oc]


Representing Actions

• Situation Calculus
• STRIPS
• PDDL
• UWL
• Dynamic Bayesian Networks


How to Represent Actions?

• Simplifying assumptions
    Atomic time
    Agent is omniscient (no sensing necessary)
    Agent is sole cause of change
    Actions have deterministic effects
• STRIPS representation
    World = set of true propositions
    Actions:
        • Precondition: (conjunction of literals)
        • Effects: (conjunction of literals)

[Figure: world states W0, W1, W2 linked by actions north11 (W0 → W1) and north12 (W1 → W2)]


STRIPS Actions

• Action = function: worldState → worldState
• Precondition says where function is defined
• Effects say how to change set of propositions

[Figure: north11 maps world state W0 to W1]

north11
    precond: (and (agent-at 1 1)
                  (agent-facing north))
    effect:  (and (agent-at 1 2)
                  (not (agent-at 1 1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
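The view of an action as a partial function on world states can be sketched in Python. This is a minimal illustration that assumes positive preconditions only and splits the effect into add and delete sets; the `Action` type and `apply` function are hypothetical names.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

Prop = Tuple  # a ground proposition, e.g. ("agent-at", 1, 1)


class Action(NamedTuple):
    name: str
    precond: FrozenSet[Prop]  # propositions that must hold beforehand
    add: FrozenSet[Prop]      # positive effects
    delete: FrozenSet[Prop]   # negated effects, i.e. (not p)


def apply(state: FrozenSet[Prop], action: Action) -> Optional[FrozenSet[Prop]]:
    """Return the successor world state, or None where the function is undefined."""
    if not action.precond <= state:
        return None
    return (state - action.delete) | action.add


north11 = Action(
    name="north11",
    precond=frozenset({("agent-at", 1, 1), ("agent-facing", "north")}),
    add=frozenset({("agent-at", 1, 2)}),
    delete=frozenset({("agent-at", 1, 1)}),
)
```

Applying `north11` to a state containing `(agent-at 1 1)` and `(agent-facing north)` replaces the first proposition with `(agent-at 1 2)` and leaves the second untouched; applying it again returns `None` because the precondition no longer holds.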


Action Schemata

• Instead of defining: pickup-A and pickup-B and …
• Define a schema:

(:operator pick-up
    :parameters ((block ?ob1))
    :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
    :effect (and (not (clear ?ob1)) (not (on-table ?ob1))
                 (not (arm-empty)) (holding ?ob1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
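Grounding a schema means substituting each object for the parameter to recover the individual actions. A minimal sketch for the pick-up schema follows; the dictionary layout and the name `ground_pick_up` are hypothetical, and the block names a, b, c are taken from the earlier initial-state example.

```python
def ground_pick_up(block):
    """Instantiate the pick-up schema with ?ob1 bound to `block`."""
    precond = {("clear", block), ("on-table", block), ("arm-empty",)}
    return {
        "name": f"pick-up-{block}",
        "precond": precond,
        "add": {("holding", block)},
        # Every precondition of pick-up is negated by its effect.
        "delete": set(precond),
    }


# One schema yields one ground action per block.
actions = [ground_pick_up(b) for b in ("a", "b", "c")]
```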


Immediate Outline

• Constraint satisfaction
• The planning problem
• Searching world states
    Regression
    Heuristics
• Graphplan
• SATplan
• Reachability analysis & heuristics



Planning as Search

• Nodes: World states

• Arcs: Actions

• Initial State: The state satisfying the complete description of the initial conds

• Goal State: Any state satisfying the goal propositions
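The mapping above can be sketched as a breadth-first search over world states. This is an illustration under assumptions: actions are ground tuples of (name, precond, add, delete), and the definition of `NORTH12` is inferred by analogy with north11 rather than given in the slides. The name `forward_search` is hypothetical.

```python
from collections import deque


def forward_search(init, goal, actions):
    """Breadth-first search; returns a list of action names or None."""
    start = frozenset(init)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # any state satisfying the goal propositions
            return plan
        for name, precond, add, delete in actions:
            if precond <= state:  # arc exists only where the action is defined
                successor = (state - delete) | add
                if successor not in seen:
                    seen.add(successor)
                    frontier.append((successor, plan + [name]))
    return None


FACING = ("agent-facing", "north")
NORTH11 = ("north11", frozenset({("agent-at", 1, 1), FACING}),
           frozenset({("agent-at", 1, 2)}), frozenset({("agent-at", 1, 1)}))
# Assumed by analogy with north11; not defined in the slides.
NORTH12 = ("north12", frozenset({("agent-at", 1, 2), FACING}),
           frozenset({("agent-at", 1, 3)}), frozenset({("agent-at", 1, 2)}))
```

Starting from a state with `(agent-at 1 1)` and `(agent-facing north)` and the goal `(agent-at 1 3)`, the search returns the two-step plan north11, north12.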


Forward-Chaining World-Space Search

[Figure: blocks A, B, C; search proceeds forward from the Initial State to the Goal State]


Backward-Chaining Search Thru Space of Partial World-States

• Problem: Many possible goal states are equally acceptable.

• From which one does one search?

[Figure: blocks A to E; several candidate goal states, while the Initial State is completely defined]


Regression

• Regressing a goal, G, thru an action, A
• Yields the weakest precondition G’
    Such that: if G’ is true before A is executed
    G is guaranteed to be true afterwards

[Figure: G’ → A (precond, effect) → G; G’ and G each represent a set of world states]


Regression Example

pick-up
    :parameters ((block ?ob1))
    :precondition (and (clear ?ob1) (on-table ?ob1) (arm-empty))
    :effect (and (not (clear ?ob1)) (not (on-table ?ob1))
                 (not (arm-empty)) (holding ?ob1))

[Figure: G’ → A (precond, effect) → G, with A = pick-up applied to block C]

G  = (and (holding C) (on A B))
G’ = (and (clear C) (on-table C) (arm-empty) (on A B))
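For STRIPS actions with conjunctive goals, regression reduces to set operations. The sketch below assumes positive goal literals and an effect split into add and delete sets; it also rejects actions that contribute nothing to the goal, which is a relevance filter added here rather than part of the definition. The name `regress` is hypothetical.

```python
def regress(goal, precond, add, delete):
    """Regress a conjunctive goal through an action.

    Returns the weakest precondition G' as a set of propositions,
    or None if the action cannot be the last step toward the goal.
    """
    if goal & delete:
        return None  # the action would destroy part of the goal
    if not goal & add:
        return None  # the action achieves nothing in the goal (relevance filter)
    # Whatever the action does not achieve must already hold, plus its precondition.
    return (goal - add) | precond


# pick-up with ?ob1 bound to block C
PRECOND = {("clear", "C"), ("on-table", "C"), ("arm-empty",)}
ADD = {("holding", "C")}
DELETE = {("clear", "C"), ("on-table", "C"), ("arm-empty",)}
```

Regressing the goal (holding C) and (on A B) through this action keeps (on A B), which the action does not touch, and replaces (holding C) with the three preconditions of pick-up.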

Disjunctive preconditions


Conditional Effects


Regressing Conditional Effects

[Figure: G’ → A (precond, effect) → G, with the briefcase moved from bank to home]

G  = (and (at keys home) (at paycheck bank))
G’ = (and (at briefcase bank) (in keys briefcase)
          (not (in paycheck briefcase)) (at paycheck bank))
