© Daniel S. Weld 1
Logistics
• PS3
  - Project
  - Additional problem
• Reading
  - Planning & CSP for “today”
  - SATplan paper review for Wed
© Daniel S. Weld 2
PSet 2
Student(s) = s is a student
Takes(s, c, q) = s takes c during quarter q
IsQtr(q) = q is a quarter
Passes(s, c) = s passes course c
Higher(h, g) = grade h is higher than g
IsGradeIn(g, c) = g is a grade in course c
Every student who takes French passes it
Needs time argument
Needs person & time args
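$\forall s, q\; [\mathrm{Student}(s) \land \mathrm{IsQtr}(q) \land \mathrm{Takes}(s, \mathrm{French}, q)] \Rightarrow \mathrm{Passes}(s, \mathrm{French})$
But what if Joe takes French in the fall, fails, and then doesn’t take French in the winter?
The antecedent is false… but $(P \Rightarrow Q) = (\lnot P \lor Q)$
So the formula is true?!?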
What’s the fix??
© Daniel S. Weld 3
PSet 2
Higher(h, g) = grade h is higher than g
GradeOf(g, s, c, q) = s has grade g in course c during q
The best score in Greek is always better than the best score in French
$\forall g, q\; [\exists s\; \mathrm{GradeOf}(g, s, \mathrm{French}, q)] \Rightarrow [\exists t, h\; \mathrm{GradeOf}(h, t, \mathrm{Greek}, q) \land \mathrm{Higher}(h, g)]$
© Daniel S. Weld 4
573 Topics
• Agency
• Problem Spaces
• Search
• Knowledge Representation
• Planning
• Uncertainty
• MDPs
• Supervised Learning
• Reinforcement Learning
© Daniel S. Weld 5
Immediate Outline
• Constraint satisfaction
  - Definition: factoring state spaces
  - Backtracking policies
  - Variable-ordering heuristics & preprocessing
• The planning problem
• Searching world states
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty
© Daniel S. Weld 6
Constraint Satisfaction
• Kind of search in which
  - States are factored into sets of variables
  - Search = assigning values to these variables
  - Structure of space is encoded with constraints
• Backtracking-style algorithms work
  - E.g. DFS for SAT (i.e. DPLL)
• But other techniques add speed
  - Propagation
  - Variable ordering
  - Preprocessing
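The slide names DPLL as depth-first search for SAT. The following is a minimal sketch of that idea, not the lecture's own code. It assumes clauses are lists of nonzero integers in the DIMACS style (a negative number is a negated variable); the function name `dpll` and this encoding are illustrative choices.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying {var: bool} dict, or None if unsatisfiable.

    clauses: list of lists of nonzero ints (assumed DIMACS-style literals).
    Variables that never needed a value may be absent from the result.
    """
    if assignment is None:
        assignment = {}

    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue  # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None  # every literal is false: dead end, backtrack
        simplified.append(rest)
    if not simplified:
        return assignment  # all clauses satisfied

    # Propagation: a unit clause forces the value of its variable.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})

    # Depth-first branching on the first unassigned variable.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

Calling `dpll([[1, 2], [-1, 2], [-2, 3]])` returns some satisfying assignment, while `dpll([[1], [-1]])` returns `None`.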
© Daniel S. Weld 7
Chinese Constraint Network
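[Figure: constraint network for ordering a Chinese dinner. Variables: Soup, Chicken Dish, Vegetable, Rice, Seafood, Pork Dish, Appetizer. Constraints: Total Cost < $30; Must be Hot & Sour; No Peanuts (twice); Not Chow Mein; Not Both Spicy.]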
© Daniel S. Weld 8
CSPs in the real world
• Scheduling Space Shuttle Repair
• Airport gate assignments
• Transportation Planning
• Supply-chain management
• Computer Configuration
• Diagnosis
• UI Optimization
• Etc...
[Figure: Adapting to Device Characteristics]
© Daniel S. Weld 10
Binary Constraint Network
• Set of n variables: $x_1 \ldots x_n$
• Value domains for each variable: $D_1 \ldots D_n$
• Set of binary constraints (also “relations”): $R_{ij} \subseteq D_i \times D_j$
  - Specifies which value pairs $(x_i, x_j)$ are consistent
Example (map coloring):
• A variable for each country
• Each domain = 4 colors
• $R_{ij}$ enforces $\neq$
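To make the definition concrete, here is a small sketch of a binary constraint network for map coloring. The four-country map, the country names, and the helper `allowed` are all hypothetical (the map from the slide is not recoverable from the transcript). It also assumes the constraint is "adjacent countries get different colors".

```python
from itertools import product

# Hypothetical map: A, B, C are mutually adjacent; D touches only C.
variables = ["A", "B", "C", "D"]
colors = ["red", "green", "blue", "yellow"]
domains = {v: list(colors) for v in variables}
neighbors = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

# R[(i, j)] is the set of allowed value pairs, a subset of D_i x D_j.
R = {
    (i, j): {(a, b) for a, b in product(domains[i], domains[j]) if a != b}
    for i, j in neighbors
}

def allowed(i, a, j, b):
    """True if x_i = a together with x_j = b violates no constraint."""
    if (i, j) in R:
        return (a, b) in R[(i, j)]
    if (j, i) in R:
        return (b, a) in R[(j, i)]
    return True  # no constraint between i and j
```

Each relation here holds 12 of the 16 possible color pairs, and a pair of non-adjacent countries such as A and D is unconstrained.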
© Daniel S. Weld 11
Binary Constraint Network
Partial assignment of values = tuple of pairs
  {…(x, a)…} means variable x gets value a…
Tuple = consistent if all constraints satisfied
Tuple = full solution if consistent + has all vars
Tuple $\{(x_i, a_i) \ldots (x_j, a_j)\}$ = consistent w/ a set of vars $\{x_m \ldots x_n\}$
  iff $\exists a_m \ldots a_n$ such that $\{(x_i, a_i) \ldots (x_j, a_j), (x_m, a_m) \ldots (x_n, a_n)\}$ = consistent
© Daniel S. Weld 12
N Queens
• Variables = board columns
• Domain values = rows
• $R_{ij} = \{(a_i, a_j) : (a_i \neq a_j) \land (|i-j| \neq |a_i-a_j|)\}$
  - e.g. $R_{12} = \{(1,3), (1,4), (2,4), (3,1), (4,1), (4,2)\}$
[Figure: board with three queens placed]
• {(x1, 2), (x2, 4), (x3, 1)} consistent with (x4)
• Shorthand: “{2, 4, 1} consistent with x4”
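The relation and the "consistent with" test can be checked mechanically. This is a sketch assuming a 4x4 board, which is what the listed $R_{12}$ implies; the helper names `ok`, `relation`, `consistent` and `consistent_with` are illustrative.

```python
from itertools import product

N = 4  # board size assumed from the R12 example

def ok(i, ai, j, aj):
    """Queens in columns i and j (rows ai and aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def relation(i, j, n=N):
    """All row pairs (ai, aj) allowed by R_ij."""
    rows = range(1, n + 1)
    return sorted((ai, aj) for ai, aj in product(rows, repeat=2) if ok(i, ai, j, aj))

def consistent(assignment):
    """assignment maps column -> row; every pair must satisfy its constraint."""
    cols = sorted(assignment)
    return all(
        ok(i, assignment[i], j, assignment[j])
        for k, i in enumerate(cols)
        for j in cols[k + 1:]
    )

def consistent_with(assignment, new_vars, n=N):
    """True if some values for new_vars extend assignment consistently."""
    for values in product(range(1, n + 1), repeat=len(new_vars)):
        extended = {**assignment, **dict(zip(new_vars, values))}
        if consistent(extended):
            return True
    return False
```

Here `relation(1, 2)` reproduces the six pairs listed for $R_{12}$, and `consistent_with({1: 2, 2: 4, 3: 1}, [4])` is true because x4 = 3 completes the board.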
© Daniel S. Weld 13
CSP as a search problem?
• What are states? (nodes in graph)
• What are the operators? (arcs between nodes)
• Initial state?
• Goal test?
© Daniel S. Weld 14
Chronological Backtracking (BT) (e.g., depth-first search)
[Figure: 6-queens board traced by BT, with labels 1 to 6]
• Consistency check performed in the order in which vars were instantiated
• If c-check fails, try next value of current var
• If no more values, backtrack to most recent var
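The three rules above can be sketched as a recursive depth-first search on N-queens. This is an illustrative implementation, not the lecture's code; it assumes the fixed column order x1, x2, … and counts one node per value tried, which may differ from how the slides count nodes.

```python
def ok(i, ai, j, aj):
    """Queens in columns i and j (rows ai and aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def backtrack(n):
    """Chronological backtracking; returns (solution or None, nodes tried)."""
    rows = []   # rows[k] is the row of the queen in column k + 1
    nodes = 0

    def extend():
        nonlocal nodes
        col = len(rows) + 1
        if col > n:
            return True                      # every column has a queen
        for row in range(1, n + 1):          # try next value of current var
            nodes += 1
            # c-checks run against earlier vars in instantiation order
            if all(ok(j + 1, rows[j], col, row) for j in range(len(rows))):
                rows.append(row)
                if extend():
                    return True
                rows.pop()
        return False                         # no values left: backtrack

    found = extend()
    return (list(rows) if found else None), nodes
```

For example, `backtrack(6)[0]` is `[2, 4, 6, 1, 3, 5]`, and `backtrack(3)[0]` is `None` because 3-queens has no solution.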
© Daniel S. Weld 15
Backjumping (BJ)
• Similar to BT, but more efficient when no consistent instantiation can be found for the current var
• Instead of backtracking to most recent var…
  - BJ reverts to deepest var which was c-checked against the current var
BJ discovers (2, 5, 3, 6) inconsistent with x6
No sense trying other values of x5
[Figure: 6-queens board with five queens placed]
© Daniel S. Weld 16
Conflict-Directed Backjumping (CBJ)
• More sophisticated backjumping behavior
• Each variable has conflict set CS
  - Set of vars that failed c-checks w/ current val
  - Update this set on every failed c-check
• When no more values to try for $x_i$
  - Backtrack to deepest var, $x_d$, in $CS(x_i)$
  - And update $CS(x_d) := CS(x_d) \cup CS(x_i) - \{x_d\}$
CBJ discovers (2, 5, 3) inconsistent with {x5, x6}
[Figure: 6-queens board over x1 … x6 annotated with the var that failed each c-check; CS(x5) = {1, 2, 3}, CS(x6) = {1, 2, 3, 5}]
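The conflict-set bookkeeping can be written recursively: each call returns the conflict set of the variable that ran out of values, and a caller that is not in that set is skipped over. This is a sketch of the common recursive formulation of CBJ on N-queens, under the assumption of a fixed column order; the names `cbj`, `solve` and `first_conflict` are illustrative.

```python
def ok(i, ai, j, aj):
    """Queens in columns i and j (rows ai and aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def cbj(n):
    """Conflict-directed backjumping; returns a list of rows or None."""
    assignment = {}  # column -> row, filled in column order

    def first_conflict(col, row):
        # c-checks run in instantiation order; report the first failing var
        for j in sorted(assignment):
            if not ok(j, assignment[j], col, row):
                return j
        return None

    def solve(col):
        """Return (solved, conflict set of earlier columns)."""
        if col > n:
            return True, set()
        conflict = set()                      # CS(x_col)
        for row in range(1, n + 1):
            culprit = first_conflict(col, row)
            if culprit is not None:
                conflict.add(culprit)         # update CS on a failed c-check
                continue
            assignment[col] = row
            solved, deeper = solve(col + 1)
            if solved:
                return True, set()
            del assignment[col]
            if col not in deeper:
                return False, deeper          # jump over this variable
            conflict |= deeper - {col}        # CS(xd) := CS(xd) U CS(xi) - {xd}
        return False, conflict

    solved, _ = solve(1)
    return [assignment[c] for c in range(1, n + 1)] if solved else None
```

Because CBJ only skips subtrees that contain no solution, it returns the same first solution as chronological backtracking, for instance `[2, 4, 6, 1, 3, 5]` for 6 queens.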
© Daniel S. Weld 17
BT vs. BJ vs. CBJ
© Daniel S. Weld 18
Forward Checking (FC)
• Perform consistency check forward
• Whenever a var is assigned a value
  - Prune inconsistent values from as-yet unvisited variables
  - Backtrack if domain of any var ever collapses
[Figure: 6-queens board with five queens placed]
FC only visits consistent nodes, but not all such nodes
  - Skips (2, 5, 3, 4), which CBJ visits
But FC can’t detect that (2, 5, 3) is inconsistent with {x5, x6}
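A minimal sketch of forward checking on N-queens follows, assuming the fixed column order x1, x2, …; the function name `forward_check` and the dictionary-of-lists domain representation are illustrative choices.

```python
def ok(i, ai, j, aj):
    """Queens in columns i and j (rows ai and aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def forward_check(n):
    """Forward checking with a fixed column order; returns rows or None."""

    def solve(col, domains, rows):
        if col > n:
            return rows
        for row in domains[col]:
            # Prune values of as-yet unvisited columns that clash with (col, row).
            pruned = {}
            wiped_out = False
            for future in range(col + 1, n + 1):
                pruned[future] = [r for r in domains[future] if ok(col, row, future, r)]
                if not pruned[future]:
                    wiped_out = True          # a domain collapsed
                    break
            if wiped_out:
                continue                      # reject this value immediately
            result = solve(col + 1, {**domains, **pruned}, rows + [row])
            if result is not None:
                return result
        return None

    start = {c: list(range(1, n + 1)) for c in range(1, n + 1)}
    return solve(1, start, [])
```

Every value left in a domain is already consistent with all earlier queens, so no backward c-checks are needed when a column is assigned.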
© Daniel S. Weld 19
Number of Nodes Explored
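[Figure: hierarchy of algorithms by number of nodes explored, from more to fewer. Groups shown: BT = BM; BJ = BMJ = BMJ2; CBJ = BM-CBJ = BM-CBJ2; FC; FC-CBJ.]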
© Daniel S. Weld 20
Number of Consistency Checks
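[Figure: hierarchy of algorithms by number of consistency checks, from more to fewer. Algorithms shown: BT, BJ, CBJ, BM, BMJ, BMJ2, BM-CBJ, BM-CBJ2, FC, FC-CBJ.]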
© Daniel S. Weld 21
Dynamic variable ordering
• In the N-queens examples we assumed
  - First x1 then x2 then ...
• But this order not required
  - Any order ok with respect to completeness
  - A good order leads to huge speedup
• A good heuristic (MRV):
  - Choose variable w/ minimum remaining values
• This is easy if one is doing FC
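MRV is easy under forward checking because the pruned domains are already at hand. This sketch shows forward checking with MRV on N-queens; breaking ties by lowest column number is an assumption, and `fc_mrv` and `is_solution` are illustrative names.

```python
def ok(i, ai, j, aj):
    """Queens in columns i and j (rows ai and aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def is_solution(assignment, n):
    """Check that assignment (column -> row) is a complete, consistent board."""
    cols = sorted(assignment)
    return cols == list(range(1, n + 1)) and all(
        ok(i, assignment[i], j, assignment[j])
        for k, i in enumerate(cols)
        for j in cols[k + 1:]
    )

def fc_mrv(n):
    """Forward checking plus the minimum-remaining-values ordering."""

    def solve(assignment, domains):
        if len(assignment) == n:
            return assignment
        # MRV: pick the unassigned column with the fewest values left.
        col = min(
            (c for c in domains if c not in assignment),
            key=lambda c: (len(domains[c]), c),
        )
        for row in domains[col]:
            new_domains = dict(domains)
            new_domains[col] = [row]
            wiped_out = False
            for other in domains:
                if other == col or other in assignment:
                    continue
                new_domains[other] = [r for r in domains[other] if ok(col, row, other, r)]
                if not new_domains[other]:
                    wiped_out = True
                    break
            if wiped_out:
                continue
            result = solve({**assignment, col: row}, new_domains)
            if result is not None:
                return result
        return None

    start = {c: list(range(1, n + 1)) for c in range(1, n + 1)}
    return solve({}, start)
```

With a dynamic order the first solution found need not be the lexicographically first one, so a caller would check the result with `is_solution` rather than compare it to a fixed board.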
© Daniel S. Weld 22
DVO MRV => WOW!!
Algo         17 Queens   21 Queens   27 Queens
FC-CBJmrv         1959        2572        5602
FC-CBJ           67090      114612      737008
FC               67329      115120     7448781
CBJ             428645      949128
BJ              436340      972065
BT              485597     1156015
© Daniel S. Weld 23
Preprocessing Strategies
• Even FC-CBJ is $O(b^d)$ time worst case
• Sometimes useful to preprocess
  - Before doing exponential search, spend polynomial time to achieve local consistency
• Arc consistency
  - Consider all pairs of vars
  - Can values be eliminated from a domain à la FC?
  - Propagate
  - $O(d^2)$ time where d = number of vars
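One standard way to achieve arc consistency is the AC-3 algorithm; the slide does not say which algorithm the course used, so this is a swapped-in illustration. The sketch assumes at most one constraint per pair of variables, given as a predicate on a value pair, and prunes the domains in place.

```python
from collections import deque

def ac3(domains, constraints):
    """Make the network arc consistent.

    domains: dict var -> set of values (pruned in place).
    constraints: dict (x, y) -> predicate(a, b), at most one per pair (assumed).
    Returns False if some domain becomes empty, True otherwise.
    """
    # Each constraint yields two directed arcs.
    arcs = {}
    for (x, y), pred in constraints.items():
        arcs[(x, y)] = pred
        arcs[(y, x)] = lambda a, b, p=pred: p(b, a)

    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        pred = arcs[(x, y)]
        # Values of x with no supporting value in the domain of y.
        unsupported = {a for a in domains[x] if not any(pred(a, b) for b in domains[y])}
        if unsupported:
            domains[x] -= unsupported
            if not domains[x]:
                return False
            # Propagate: arcs pointing at x must be rechecked.
            queue.extend((z, w) for (z, w) in arcs if w == x and z != y)
    return True
```

For example, with domains {1, 2, 3} for x, y and z and constraints x < y and y < z, the call leaves x = {1}, y = {2}, z = {3} without any search.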
© Daniel S. Weld 24
Constraint Satisfaction Recap
• CSP = Factoring a state space
• Chronological Backtracking (BT)
• Backjumping (BJ)
• Conflict-Directed Backjumping (CBJ)
• Forward checking (FC)
• Dynamic variable ordering heuristics
• Preprocessing Strategies
© Daniel S. Weld 25
Immediate Outline
• The planning problem
• Searching world states
• Constraint satisfaction
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty
© Daniel S. Weld 26
Ways to make “plans”
Generative Planning
• Reason from first principles (knowledge of actions)
• Requires formal model of actions
Case-Based Planning
• Retrieve old plan which worked on similar problem
• Revise retrieved plan for this problem
Reinforcement Learning
• Act “randomly”, noticing effects
• Learn reward, action models, policy
© Daniel S. Weld 27
Generative Planning
Input
• Description of (initial state of) world (in some KR)
• Description of goal (in some KR)
• Description of available actions (in some KR)
Output
• Controller
  - E.g. Sequence of actions
  - E.g. Plan with loops and conditionals
  - E.g. Policy = f: states -> actions
© Daniel S. Weld 28
Input Representation
• Description of initial state of world
  - E.g., set of propositions:
    ((block a) (block b) (block c) (on-table a) (on-table b) (clear a) (clear b) (clear c) (arm-empty))
• Description of goal: i.e. set of worlds or ??
  - E.g., logical conjunction
  - Any world satisfying conjunction is a goal
    (and (on a b) (on b c))
• Description of available actions
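One simple way to hold these inputs in a program is to treat a state as a set of ground propositions and a conjunctive goal as a set that must be contained in it. This is a sketch; encoding propositions as Python tuples and the helper name `satisfies` are assumptions made for illustration.

```python
# A world state is the set of propositions that are true in it.
initial_state = frozenset({
    ("block", "a"), ("block", "b"), ("block", "c"),
    ("on-table", "a"), ("on-table", "b"),
    ("clear", "a"), ("clear", "b"), ("clear", "c"),
    ("arm-empty",),
})

# The goal (and (on a b) (on b c)) as a set of conjuncts.
goal = frozenset({("on", "a", "b"), ("on", "b", "c")})

def satisfies(state, goal):
    """Any world containing every goal conjunct is a goal world."""
    return goal <= state
```

The goal describes a set of worlds, not a single one: any state that contains both conjuncts passes the test, whatever else is true in it.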
© Daniel S. Weld 29
Simplifying Assumptions
[Figure: agent loop. The Environment sends Percepts, the agent asks “What action next?” and sends Actions.]
• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Perfect vs. Noisy
© Daniel S. Weld 30
Classical Planning
Environment: Static, Fully Observable, Deterministic, Instantaneous, Full, Perfect
I = initial state, G = goal state, Oi = (prec) (effects)
[ I ] Oi Oj Ok Om [ G ]
© Daniel S. Weld 31
• Static, Deterministic, Observable, Instantaneous, Propositional: “Classical Planning”
• Dynamic: Replanning / Situated Plans
• Durative: Temporal Reasoning
• Continuous: Numeric Constraint reasoning (LP/ILP)
• Stochastic: Contingent/Conformant Plans, Interleaved execution; MDP Policies; POMDP Policies
• Partially Observable: Contingent/Conformant Plans, Interleaved execution; Semi-MDP Policies
© Daniel S. Weld 32
Today’s Hot Research Areas
• Durative Actions
  - Simultaneous actions, events, deadline goals
• Planning Under Uncertainty
  - Modeling sensors; searching belief states
[Figure: [ I ], Oi, Oj, Ok, “?”, Oa, Ob, Oc]
© Daniel S. Weld 33
Representing Actions
• Situation Calculus
• STRIPS
• PDDL
• UWL
• Dynamic Bayesian Networks
© Daniel S. Weld 34
How to Represent Actions?
• Simplifying assumptions
  - Atomic time
  - Agent is omniscient (no sensing necessary)
  - Agent is sole cause of change
  - Actions have deterministic effects
• STRIPS representation
  - World = set of true propositions
  - Actions:
    • Precondition: (conjunction of literals)
    • Effects: (conjunction of literals)
[Figure: worlds W0, W1, W2 linked by actions north11 and north12]
© Daniel S. Weld 35
STRIPS Actions
• Action = function: worldState → worldState
• Precondition says where function defined
• Effects say how to change set of propositions
[Figure: worlds W0 and W1 linked by action north11]
north11
  precond: (and (agent-at 1 1)
                (agent-facing north))
  effect:  (and (agent-at 1 2)
                (not (agent-at 1 1)))
Note: STRIPS doesn’t allow derived effects; you must be complete!
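Read as a partial function on sets of propositions, a STRIPS action can be sketched with an add list and a delete list. The `Action` tuple and the helpers `applicable` and `apply` are illustrative names, and splitting the effect into `add` and `delete` is the usual reading of positive and negated effect literals.

```python
from typing import FrozenSet, NamedTuple, Tuple

Prop = Tuple  # a ground proposition such as ("agent-at", 1, 1)

class Action(NamedTuple):
    name: str
    precond: FrozenSet[Prop]   # must all hold for the action to be defined
    add: FrozenSet[Prop]       # positive effects
    delete: FrozenSet[Prop]    # negated effects

def applicable(state, action):
    """The precondition says where the function is defined."""
    return action.precond <= state

def apply(state, action):
    """Effects say how to change the set of true propositions."""
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    return frozenset((state - action.delete) | action.add)

north11 = Action(
    name="north11",
    precond=frozenset({("agent-at", 1, 1), ("agent-facing", "north")}),
    add=frozenset({("agent-at", 1, 2)}),
    delete=frozenset({("agent-at", 1, 1)}),
)

w0 = frozenset({("agent-at", 1, 1), ("agent-facing", "north")})
w1 = apply(w0, north11)
```

Only the listed effects change: `("agent-facing", "north")` carries over to `w1` untouched, which is why every effect has to be written out explicitly.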
© Daniel S. Weld 36
Action Schemata
• Instead of defining: pickup-A and pickup-B and …
• Define a schema:
(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))
Note: STRIPS doesn’t allow derived effects; you must be complete!
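A schema can be viewed as a function from objects to ground actions. This sketch grounds the pick-up schema over three blocks; the `Action` tuple, the function `pick_up`, the block names and the generated action names are illustrative assumptions.

```python
from typing import FrozenSet, NamedTuple, Tuple

Prop = Tuple  # a ground proposition such as ("clear", "a")

class Action(NamedTuple):
    name: str
    precond: FrozenSet[Prop]
    add: FrozenSet[Prop]
    delete: FrozenSet[Prop]

def pick_up(ob):
    """Instantiate the pick-up schema with ?ob1 bound to ob."""
    used_up = frozenset({("clear", ob), ("on-table", ob), ("arm-empty",)})
    return Action(
        name=f"pick-up-{ob}",
        precond=used_up,                       # all three must hold beforehand
        add=frozenset({("holding", ob)}),
        delete=used_up,                        # and all three become false
    )

# One ground action per block replaces pickup-A, pickup-B, ...
blocks = ["a", "b", "c"]
ground_actions = [pick_up(b) for b in blocks]
```

A planner that works on ground actions would run this instantiation once per schema and per combination of objects that matches the parameter types.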
© Daniel S. Weld 37
Immediate Outline
• Constraint satisfaction
• The planning problem
• Searching world states
  - Regression
  - Heuristics
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty
© Daniel S. Weld 38
Planning as Search
• Nodes: World states
• Arcs: Actions
• Initial State: The state satisfying the complete description of the initial conds
• Goal State: Any state satisfying the goal propositions
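With nodes as world states and arcs as actions, any graph search yields a planner. This sketch uses breadth-first search; the choice of BFS, the dictionary encoding of actions, and the action `north12` (written by analogy with north11) are all assumptions for illustration.

```python
from collections import deque

def forward_search(initial, goal, actions):
    """Breadth-first search over world states; returns a list of action names.

    actions: dict name -> (precond, add, delete), each a set of propositions.
    Returns None if no reachable state satisfies the goal.
    """
    start = frozenset(initial)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                       # goal test
            return plan
        for name, (precond, add, delete) in actions.items():
            if precond <= state:                # arc exists only if applicable
                successor = frozenset((state - delete) | add)
                if successor not in seen:
                    seen.add(successor)
                    frontier.append((successor, plan + [name]))
    return None

facing_north = ("agent-facing", "north")
actions = {
    "north11": ({("agent-at", 1, 1), facing_north}, {("agent-at", 1, 2)}, {("agent-at", 1, 1)}),
    # Hypothetical second step, modeled on north11.
    "north12": ({("agent-at", 1, 2), facing_north}, {("agent-at", 1, 3)}, {("agent-at", 1, 2)}),
}
plan = forward_search({("agent-at", 1, 1), facing_north}, {("agent-at", 1, 3)}, actions)
```

Here `plan` is `["north11", "north12"]`; a goal such as `{("agent-at", 2, 1)}` is unreachable with these two actions, so the search returns `None`.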
© Daniel S. Weld 39
Forward-Chaining World-Space Search
[Figure: blocks-world search from the Initial State to the Goal State, with stacks of blocks A, B, C]
© Daniel S. Weld 40
Backward-Chaining Search Thru Space of Partial World-States
[Figure: several candidate goal states built from blocks A, B, C, D, E (* * *)]
• Problem: Many possible goal states are equally acceptable.
• From which one does one search?
[Figure: blocks A, B, C, D, E. Initial State is completely defined.]
© Daniel S. Weld 41
Regression
• Regressing a goal, G, thru an action, A
• Yields the weakest precondition G’
  - Such that: if G’ is true before A is executed, G is guaranteed to be true afterwards
[Figure: G’, then action A (precond, effect), then G. G’ and G each represent a set of world states.]
© Daniel S. Weld 42
Regression Example
(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))
[Figure: G’, then action A (precond, effect), then G]
G: (and (holding C) (on A B))
G’: (and (clear C) (on-table C) (arm-empty) (on A B))
Disjunctive preconditions
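For STRIPS actions and goals that are conjunctions of positive literals, regression reduces to set operations: drop what the action adds, then require its precondition. The sketch below regresses (and (holding C) (on A B)) through pick-up applied to C. Returning `None` when the action deletes a goal literal or adds none of them is a common convention assumed here, and the function names are illustrative.

```python
def regress(goal, precond, add, delete):
    """Weakest precondition of a conjunctive goal through a STRIPS action.

    All arguments are sets of ground propositions (positive literals only).
    Returns None if the action clobbers the goal or is irrelevant to it.
    """
    if goal & delete:
        return None          # action would make a goal literal false
    if not (goal & add):
        return None          # action achieves nothing the goal needs
    return frozenset((goal - add) | precond)

def pick_up(ob):
    """Ground instance of the pick-up schema: (precond, add, delete)."""
    used_up = {("clear", ob), ("on-table", ob), ("arm-empty",)}
    return used_up, {("holding", ob)}, used_up

goal = {("holding", "C"), ("on", "A", "B")}
regressed = regress(goal, *pick_up("C"))
```

The result contains (clear C), (on-table C), (arm-empty) and (on A B): the literal the action achieves is replaced by its precondition, and the untouched literal (on A B) is carried back unchanged.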
© Daniel S. Weld 43
Conditional Effects
© Daniel S. Weld 44
Regressing Conditional Effects
[Figure: G’, then action A (precond, effect), then G, with locations bank and home]
G: (and (at keys home) (at paycheck bank))
G’: (and (at briefcase bank) (in keys briefcase) (not (in paycheck briefcase)) (at paycheck bank))