TRANSCRIPT
Computer Science CPSC 502
Lecture 2
Search
(textbook Ch: 3)
Representational Dimensions for Intelligent Agents
We will look at
• Deterministic and stochastic domains
• Static and Sequential problems
We will see examples of representations using
• Explicit state or features or relations
We would like the most general agents possible, but in this course we need to restrict ourselves to:
• Flat representations (vs. hierarchical)
• Knowledge given and intro to knowledge learned
• Goals and simple preferences (vs. complex preferences)
• Single-agent scenarios (vs. multi-agent scenarios)
Course Overview
(Diagram: environment dimensions vs. representation and reasoning techniques. Problem types Query and Planning, in deterministic vs. stochastic and static vs. sequential environments, map to: Constraint Satisfaction with Vars + Constraints (Search, Arc Consistency); Logics and STRIPS (Search); Belief Nets and Decision Nets (Variable Elimination, Approximate Inference, Temporal Inference); Markov Processes (Value Iteration).)
First Part of the Course
Today we focus on Search
(Adversarial) Search: Checkers
Source: IBM Research
• Early work in the 1950s by Arthur Samuel at IBM
• Chinook program by Jonathan Schaeffer (UofA)
• Search explores the space of possible moves and their consequences
• 1994: world champion
• 2007: declared unbeatable
(Adversarial) Search: Chess
In 1997, Garry Kasparov, the chess grandmaster and reigning world champion, played against Deep Blue, a program written by researchers at IBM.
Source: IBM Research
• Deep Blue won 3 games, lost 2, tied 1
• 30 CPUs + 480 chess processors
• Searched 126,000,000 nodes per second
• Generated 30 billion positions per move, routinely reaching depth 14
Today’s Lecture
• Simple Search Agent
• Uninformed Search
• Informed Search
Simple Search Agent
A deterministic, goal-driven agent:
• Agent is in a start state
• Agent is given a goal (a subset of possible states)
• Environment changes only when the agent acts
• Agent perfectly knows:
  • the actions that can be applied in any given state
  • the state it is going to end up in when an action is applied in a given state
• The sequence of actions (with appropriate ordering) taking the agent from the start state to a goal state is the solution
Definition of a search problem
• Initial state(s)
• Set of actions (operators) available to the agent: for each state, define the successor state for each operator, i.e., which state the agent would end up in
• Goal state(s)
• Search space: the set of states that will be searched for a path from an initial state to a goal, given the available actions
  • states are nodes and actions are links between them
  • not necessarily given explicitly (the state space might be too large or infinite)
• Path cost (we ignore this for now)
Three examples
1. Vacuum cleaner world
2. Solving an 8-puzzle
3. The delivery robot planning the route it will take in a building to get from one room to another (see textbook)
Example: vacuum world
Possible start state Possible goal state
• States:
  • Two rooms: r1, r2
  • Each room can be either dirty or not
  • The vacuuming agent can be in either r1 or r2
Feature-based representation
Features?
Example: vacuum world
• States:
  • Two rooms: r1, r2
  • Each room can be either dirty or not
  • The vacuuming agent can be in either r1 or r2
Features?
Feature-based representation:
How many states?
Suppose we have the same problem with k rooms.
The number of states is k · 2^k: the agent can be in any of the k rooms, and each of the k rooms is independently dirty or clean.
Cleaning Robot
• States – one of the eight states in the picture
• Operators – left, right, suck
• Possible Goal – no dirt
Search Space
• Operators – left, right, suck. Successor states in the graph describe the effect of each action applied to a given state
• Possible Goal – no dirt
Eight Puzzle
States: each state specifies which number/blank occupies each of the 9 tiles. HOW MANY STATES? 9! = 362,880
Operators: blank moves left, right, up, down
Goal: configuration with the numbers in the right sequence
Search space for the 8-puzzle
How can we find a solution?
• How can we find a sequence of actions and their appropriate ordering that lead to the goal?
• Need smart ways to search the space graph
Search: abstract definition
• Start at the start state
• Evaluate where actions can lead us from states that have been encountered in the search so far
• Stop when a goal state is encountered
To make this more formal, we'll need to review the formal definition of a graph...
• A directed graph consists of a set N of nodes (vertices) and a set A of ordered pairs of nodes, called edges (arcs).
• Node n2 is a neighbor of n1 if there is an arc from n1 to n2, that is, if ⟨n1, n2⟩ ∈ A.
• A path is a sequence of nodes ⟨n0, n1, …, nk⟩ such that ⟨n(i-1), n(i)⟩ ∈ A for each i.
• A cycle is a non-empty path such that the start node is the same as the end node.
Search graph• Nodes are search states • Edges correspond to actions• Given a set of start nodes and goal nodes, a solution is a path
from a start node to a goal node: a plan of actions.
Graphs
• Generic search algorithm: given a graph, start nodes, and goal nodes, incrementally explore paths from the start nodes.
• Maintain a frontier of paths from the start node that have been explored.
• As search proceeds, the frontier expands into the unexplored nodes until a goal node is encountered.
• The way in which the frontier is expanded defines the search strategy.
Graph Searching
Problem Solving by Graph Searching
Input: a graph;
       a set of start nodes;
       Boolean procedure goal(n) testing if n is a goal node

frontier := {⟨s⟩ : s is a start node};
while frontier is not empty:
    select and remove path ⟨n0, …, nk⟩ from frontier;
    if goal(nk):
        return ⟨n0, …, nk⟩;
    for every neighbor n of nk:
        add ⟨n0, …, nk, n⟩ to frontier;
end

Generic Search Algorithm
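The pseudocode above can be sketched in Python. This is a minimal, illustrative version: the graph representation (a dict of neighbor lists) and the node names are assumptions made for the example, not part of the slides. Note that the generic algorithm leaves "select and remove" unspecified; popping from the end of the list, as done here, happens to give depth-first behavior.

```python
# A minimal sketch of the generic search algorithm, assuming the graph
# is given as a dict mapping each node to its list of neighbors.
def generic_search(graph, start_nodes, is_goal):
    frontier = [[s] for s in start_nodes]  # the frontier holds whole paths
    while frontier:
        path = frontier.pop()              # "select and remove" a path
        if is_goal(path[-1]):
            return path
        for n in graph.get(path[-1], []):
            frontier.append(path + [n])    # extend the path by one neighbor
    return None                            # frontier empty: no solution

# Tiny hypothetical graph for illustration:
graph = {"S": ["A", "B"], "A": ["G"], "B": []}
result = generic_search(graph, ["S"], lambda n: n == "G")
```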
Branching Factor
• The forward branching factor of a node is the number of arcs going out of the node
• The backward branching factor of a node is the number of arcs going into the node
• If the forward branching factor of a node is b and the graph is a tree, there are b^n nodes that are n steps away from a node
Comparing Searching Algorithms: Will it find a solution? the best one?
Def. : A search algorithm is complete if
whenever there is at least one solution, the algorithm is guaranteed to find it within a finite amount of time.
Def.: A search algorithm is optimal if
when it finds a solution, it is the best one
Comparing Searching Algorithms: Complexity
Def.: The time complexity of a search algorithm is
the worst-case amount of time it will take to run, expressed in terms of
• maximum path length m • maximum branching factor b.
Def.: The space complexity of a search algorithm is
the worst-case amount of memory that the algorithm will use (i.e., the maximum number of nodes on the frontier),
also expressed in terms of m and b.
Branching factor b of a node is the number of arcs going out
of the node
Today’s Lecture
• Simple Search Agent
• Uninformed Search
• Informed Search
Illustrative Graph: DFS
• DFS explores each path on the frontier until its end (or until a goal is found) before considering any other path.
Shaded nodes represent the end of paths on the
frontier
Input: a graph;
       a set of start nodes;
       Boolean procedure goal(n) testing if n is a goal node

frontier := {⟨s⟩ : s is a start node};
while frontier is not empty:
    select and remove path ⟨n0, …, nk⟩ from frontier;
    if goal(nk):
        return ⟨n0, …, nk⟩;
    for every neighbor n of nk:
        add ⟨n0, …, nk, n⟩ to frontier;
end

DFS as an instantiation of the Generic Search Algorithm
• In DFS, the frontier is a last-in-first-out stack
Let's see how this works in AIspace.
DFS in AI Space
• Go to: http://www.aispace.org/mainTools.shtml
• Click on "Graph Searching" to get to the Search Applet
• Select the "Solve" tab in the applet
• Select one of the available examples via "File -> Load Sample Problem" (it is a good idea to start with the "Simple Tree" problem)
• Make sure that "Search Options -> Search Algorithms" in the toolbar is set to "Depth-First Search"
• Step through the algorithm with the "Fine Step" or "Step" buttons in the toolbar
• The panel above the graph panel verbally describes what is happening during each step
• The panel at the bottom shows how the frontier evolves
See the available help pages and video tutorials for more details on how to use the Search applet (http://www.aispace.org/search/index.shtml)
NOTE: p2 is only selected when all paths extending p1 have been explored.
Depth-first Search: DFS
Example:
• the frontier is [p1, p2, …, pr], where each pk is a path
• the neighbors of the last node of p1 are {n1, …, nk}
• What happens?
  • p1 is selected, and its last node is tested for being a goal. If not:
  • k new paths are created by adding each of {n1, …, nk} to p1
  • these k new paths replace p1 at the beginning of the frontier
  • thus, the frontier is now [(p1, n1), …, (p1, nk), p2, …, pr]
• You can get a much better sense of how DFS works by looking at the Search Applet in AI Space
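The stack-based frontier can be sketched as follows; the small graph and node names are hypothetical, chosen only to make the expansion order visible. Neighbors are pushed in reverse so the first-listed neighbor is explored first, matching the frontier update described above.

```python
# Sketch of DFS: the frontier is a last-in-first-out stack of paths.
def dfs(graph, start, is_goal):
    frontier = [[start]]
    expanded = []                   # record the order in which nodes are expanded
    while frontier:
        path = frontier.pop()       # LIFO: take the most recently added path
        expanded.append(path[-1])
        if is_goal(path[-1]):
            return path, expanded
        # push neighbors so they are explored before older frontier paths
        for n in reversed(graph.get(path[-1], [])):
            frontier.append(path + [n])
    return None, expanded

# Hypothetical graph: N1 -> N2, N3; N2 -> N4
graph = {"N1": ["N2", "N3"], "N2": ["N4"], "N3": [], "N4": []}
path, order = dfs(graph, "N1", lambda n: n == "N4")
```

Note how DFS dives into N2's subtree (reaching N4) before ever considering N3.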
Analysis of DFS
• Is DFS complete?
• Is DFS optimal?
• What is the time complexity, if the maximum path length is m and the maximum branching factor is b ?
• What is the space complexity?
We will look at the answers in AISpace (but see next few slides for a summary of what we do)
Analysis of DFS
Def. : A search algorithm is complete if whenever there is at least one
solution, the algorithm is guaranteed to find it within a finite amount of time.
Is DFS complete? No
• If there are cycles in the graph, DFS may get "stuck" in one of them
• see this in AISpace by loading "Cyclic Graph Examples" or by adding a cycle to "Simple Tree": e.g., click on the "Create" tab, create a new edge from N7 to N1, go back to "Solve" and see what happens
Analysis of DFS
Is DFS optimal? No
Def.: A search algorithm is optimal if
when it finds a solution, it is the best one (e.g., the shortest)
• It can "stumble" on longer solution paths before it gets to shorter ones
• E.g., goal nodes: red boxes
• see this in AISpace by loading "Extended Tree Graph" and setting N6 as a goal: e.g., click on the "Create" tab, right-click on N6 and select "set as a goal node"
Analysis of DFS
• What is DFS's time complexity, in terms of m and b?
• In the worst case, DFS must examine every node in the tree (e.g., a single goal node -> red box)
Def.: The time complexity of a search algorithm is the worst-case amount of time it will take to run, expressed in terms of
- maximum path length m
- maximum forward branching factor b
O(b^m)
Analysis of DFS
Def.: The space complexity of a search algorithm is the worst-case amount of memory that the algorithm will use (i.e., the maximum number of nodes on the frontier), expressed in terms of
- maximum path length m
- maximum forward branching factor b
• What is DFS's space complexity, in terms of m and b?
O(mb)
- for every node in the path currently explored, DFS maintains a path to its unexplored siblings in the search tree (the alternative paths that DFS still needs to explore)
- the longest possible path is m, with a maximum of b-1 alternative paths per node
See how this works in AIspace.
Analysis of DFS: Summary
• Is DFS complete? NO
  • Depth-first search isn't guaranteed to halt on graphs with cycles.
  • However, DFS is complete for finite acyclic graphs.
• Is DFS optimal? NO
  • It can "stumble" on longer solution paths before it gets to shorter ones.
• What is the time complexity, if the maximum path length is m and the maximum branching factor is b?
  • O(b^m): must examine every node in the tree.
  • Search is unconstrained by the goal until it happens to stumble on the goal.
• What is the space complexity?
  • O(mb): the longest possible path is m, and for every node in that path it must maintain a fringe of size b.
DFS is appropriate when:
• space is restricted
• there are many solutions, with long path lengths
It is a poor method when:
• there are cycles in the graph
• there are sparse solutions at shallow depth
• there is heuristic knowledge indicating when one path is better than another
Analysis of DFS (cont.)
Why does DFS need to be studied and understood?
• It is simple enough to allow you to learn the basic aspects of searching
• It is the basis for a number of more sophisticated useful search algorithms
Breadth-first search (BFS)
• BFS explores all paths of length l on the frontier before looking at paths of length l + 1
Input: a graph;
       a set of start nodes;
       Boolean procedure goal(n) testing if n is a goal node

frontier := {⟨s⟩ : s is a start node};
while frontier is not empty:
    select and remove path ⟨n0, …, nk⟩ from frontier;
    if goal(nk):
        return ⟨n0, …, nk⟩;
    else for every neighbor n of nk:
        add ⟨n0, …, nk, n⟩ to frontier;
end

BFS as an instantiation of the Generic Search Algorithm
• In BFS, the frontier is a first-in-first-out queue
Let's see how this works in AIspace: in the Search Applet toolbar, set "Search Options -> Search Algorithms" to "Breadth-First Search".
Breadth-first Search: BFS
Example:
• the frontier is [p1, p2, …, pr]
• the neighbors of the last node of p1 are {n1, …, nk}
• What happens?
  • p1 is selected, and its end node is tested for being a goal. If not:
  • k new paths are created by attaching each of {n1, …, nk} to p1
  • these follow pr at the end of the frontier
  • thus, the frontier is now [p2, …, pr, (p1, n1), …, (p1, nk)]
  • p2 is selected next
As for DFS, you can get a much better sense of how BFS works by looking at the Search Applet in AI Space
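The queue-based frontier can be sketched with `collections.deque`; the graph below is hypothetical, chosen so that a depth-first strategy would find a longer solution first while BFS returns the shortest one.

```python
from collections import deque

# Sketch of BFS: the frontier is a first-in-first-out queue of paths,
# so shorter paths are always expanded before longer ones.
def bfs(graph, start, is_goal):
    frontier = deque([[start]])
    while frontier:
        path = frontier.popleft()        # FIFO: oldest (shortest) path first
        if is_goal(path[-1]):
            return path
        for n in graph.get(path[-1], []):
            frontier.append(path + [n])  # new, longer paths go to the back
    return None

# Hypothetical graph: both S -> G and the longer S -> A -> B -> G exist.
graph = {"S": ["A", "G"], "A": ["B"], "B": ["G"], "G": []}
shortest = bfs(graph, "S", lambda n: n == "G")
```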
Analysis of BFS
• Is BFS complete?
• Is BFS optimal?
• What is the time complexity, if the maximum path length is m and the maximum branching factor is b ?
• What is the space complexity?
Analysis of BFS
Def. : A search algorithm is complete if
whenever there is at least one solution, the algorithm is guaranteed to find it within a finite amount of time.
Is BFS complete? Yes
• If a solution exists at level l, the path to it will be explored before any path of length l + 1
  • it is impossible to fall into an infinite cycle
• see this in AISpace by loading "Cyclic Graph Examples" or by adding a cycle to "Simple Tree"
Analysis of BFS
Is BFS optimal? Yes
Def.: A search algorithm is optimal if, when it finds a solution, it is the best one
• E.g., two goal nodes: red boxes
• Any goal at level l (e.g., red box N7) will be reached before goals at deeper levels
Analysis of BFS
• What is BFS's time complexity, in terms of m and b?
Def.: The time complexity of a search algorithm is the worst-case amount of time it will take to run, expressed in terms of
- maximum path length m
- maximum forward branching factor b
O(b^m)
• Like DFS, in the worst case BFS must examine every node in the tree (e.g., a single goal node -> red box)
Analysis of BFS
Def.: The space complexity of a search algorithm is the worst-case amount of memory that the algorithm will use (i.e., the maximum number of nodes on the frontier), expressed in terms of
- maximum path length m
- maximum forward branching factor b
O(b^m)
• What is BFS's space complexity, in terms of m and b?
- BFS must keep paths to all the nodes at level m
Analysis of Breadth-First Search
• Is BFS complete?
  • Yes (we are assuming a finite branching factor)
• Is BFS optimal?
  • Yes
• What is the time complexity, if the maximum path length is m and the maximum branching factor is b?
  • The time complexity is O(b^m): must examine every node in the tree.
• What is the space complexity?
  • Space complexity is O(b^m): must store the whole frontier in memory.
Using Breadth-first Search
• When is BFS appropriate?
  • space is not a problem
  • it's necessary to find the solution with the fewest arcs
  • there are some shallow solutions
  • there may be infinite paths
• When is BFS inappropriate?
  • space is limited
  • all solutions tend to be located deep in the tree
  • the branching factor is very large
Iterative Deepening DFS (IDS)
How can we achieve an acceptable (linear) space complexity
while maintaining completeness and optimality?
Key Idea: re-compute elements of the frontier rather
than saving them.
(Figure: successive depth-bounded DFS passes with depth = 1, 2, 3, …)
Iterative Deepening DFS (IDS) in a Nutshell
• Use DFS to look for solutions at depth 1, then 2, then 3, etc.
  – for depth bound D, ignore any paths of longer length
  – this is depth-bounded depth-first search
• If no goal is found, re-start from scratch and search to depth 2
• If no goal is found, re-start from scratch and search to depth 3
• If no goal is found, re-start from scratch and search to depth 4, and so on
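The scheme above can be sketched with a recursive depth-bounded DFS called from a deepening loop. The graph format, node names, and the `max_depth` cap (so the sketch terminates when no solution exists) are illustrative assumptions.

```python
# Sketch of iterative deepening: depth-bounded DFS restarted from scratch
# with an increasing bound, re-computing the frontier instead of storing it.
def depth_bounded_dfs(graph, path, is_goal, bound):
    if is_goal(path[-1]):
        return path
    if bound == 0:
        return None                  # ignore paths longer than the bound
    for n in graph.get(path[-1], []):
        result = depth_bounded_dfs(graph, path + [n], is_goal, bound - 1)
        if result is not None:
            return result
    return None

def ids(graph, start, is_goal, max_depth=50):
    for depth in range(max_depth + 1):   # depth 0, 1, 2, ... from scratch
        result = depth_bounded_dfs(graph, [start], is_goal, depth)
        if result is not None:
            return result
    return None

# Hypothetical graph: the shortest solution S -> G is found at bound 1,
# before the longer S -> A -> B -> G is ever completed.
graph = {"S": ["A", "G"], "A": ["B"], "B": ["G"], "G": []}
```

The depth bound also makes the depth-first scheme safe on graphs with cycles, since no path can grow past the current bound.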
(Time) Complexity of IDS
• That sounds wasteful!
• Let's analyze the time complexity
• For a solution at depth m with branching factor b:

Depth | Total # of paths at that level | # times created by BFS (or DFS) | # times created by IDS | Total # paths (re)created by IDS
1     | b        | 1 | m   | m·b
2     | b^2      | 1 | m-1 | (m-1)·b^2
…     | …        | … | …   | …
m-1   | b^(m-1)  | 1 | 2   | 2·b^(m-1)
m     | b^m      | 1 | 1   | b^m
Solution at depth m, branching factor b.
Total # of paths generated:
b^m + 2·b^(m-1) + 3·b^(m-2) + … + m·b
= b^m (1·b^0 + 2·b^(-1) + 3·b^(-2) + … + m·b^(1-m))
= b^m · Σ_{i=1}^{m} i·b^(1-i)
Since Σ_{i=1}^{∞} i·b^(1-i) converges to (b/(b-1))^2, this is the overhead factor, and IDS takes time O(b^m).
For b = 10, m = 5: BFS generates 111,111 paths and IDS 123,456 (only ~11% more)
The larger b is, the smaller the overhead; even with b = 2, IDS will take only ~4 times as much time as BFS.
Further Analysis of Iterative Deepening DFS (IDS)
• Space complexity?
• Complete?
• Optimal?
Further Analysis of Iterative Deepening DFS (IDS)
• Space complexity: O(mb)
  • DFS scheme, only explores one branch at a time
• Complete? Yes
  • considers only paths up to depth m and doesn't explore longer paths, so it cannot get trapped in infinite cycles; it gets to a solution first
• Optimal? Yes
Search with Costs
Sometimes there are costs associated with arcs.
In this setting we often don't just want to find any solution
• we usually want to find the solution that minimizes cost

Def.: The cost of a path is the sum of the costs of its arcs:
cost(⟨n0, …, nk⟩) = Σ_{i=1}^{k} cost(⟨n_(i-1), n_i⟩)
Example: Traveling in Romania
Def.: A search algorithm is optimal if
when it finds a solution, it has the lowest path cost
• At each stage, Lowest-cost-first search selects the path with the lowest cost on the frontier.
• The frontier is implemented as a priority queue ordered by path cost.
Lowest-Cost-First Search (LCFS)
Exercise: see how this works in AIspace: in the Search Applet toolbar
• select the "Vancouver Neighborhood Graph" problem
• set "Search Options -> Search Algorithms" to "Lowest-Cost-First"
• select "Show Edge Costs" under "View"
• create a new node from UBC to SC with cost 20 and run LCFS
• When arc costs are all equal, LCFS is equivalent to BFS.
Analysis of Lowest-Cost Search (1)
• Is LCFS complete?
  • yes, as long as arc costs are strictly positive
  Exercise: think of what happens when costs can be negative or zero, e.g., add an arc with cost -20 to the simple search graph from N4 to S
• Is LCFS optimal?
  • yes, if arc costs are guaranteed to be ≥ 0
  Exercise: think of what happens when there are negative costs
Analysis of LCFS
• What is the time complexity, if the maximum path length is m and
the maximum branching factor is b ?
• What is the space complexity?
Analysis of LCFS
• What is the time complexity, if the maximum path length is m and the maximum branching factor is b?
  • The time complexity is O(b^m)
  • In the worst case, LCFS must examine every node in the tree, because it generates all paths from the start that cost less than the cost of the solution
• What is the space complexity?
  • Space complexity is O(b^m)
  • E.g., with uniform costs: just like BFS, in the worst case the frontier has to store all nodes m-1 steps from the start node
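The priority-queue frontier can be sketched with Python's `heapq`. Here the graph maps each node to `(neighbor, arc_cost)` pairs; that representation and the example costs are assumptions made for illustration.

```python
import heapq

# Sketch of lowest-cost-first search: the frontier is a priority queue
# ordered by path cost, so the cheapest path is always expanded first.
def lcfs(graph, start, is_goal):
    frontier = [(0, [start])]                 # entries are (path_cost, path)
    while frontier:
        cost, path = heapq.heappop(frontier)  # cheapest frontier path
        if is_goal(path[-1]):
            return cost, path
        for n, arc_cost in graph.get(path[-1], []):
            heapq.heappush(frontier, (cost + arc_cost, path + [n]))
    return None

# Hypothetical graph: the direct arc S -> G costs 10, but going via A costs 3.
graph = {"S": [("A", 1), ("G", 10)], "A": [("G", 2)], "G": []}
best = lcfs(graph, "S", lambda n: n == "G")
```

LCFS returns the cost-3 detour through A rather than the direct cost-10 arc.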
Summary of Uninformed Search

Algorithm | Complete       | Optimal                     | Time   | Space
DFS       | N              | N                           | O(b^m) | O(mb)
BFS       | Y              | Y (shortest)                | O(b^m) | O(b^m)
IDS       | Y              | Y (shortest)                | O(b^m) | O(mb)
LCFS      | Y (costs > 0)  | Y, least cost (costs ≥ 0)   | O(b^m) | O(b^m)
Summary of Uninformed Search (cont.)
Why are all these strategies called uninformed?
• Because they do not consider any information about the states and the goals to decide which path to expand first on the frontier
• They are blind to the goal
• In other words, they are general and do not take into account the specific nature of the problem.
Today’s Lecture
• Simple Search Agent
• Uninformed Search
• Informed Search (aka Heuristic Search)
Heuristic Search
• Blind search algorithms do not take the goal into account until they are at a goal node.
• Often there is extra knowledge that can be used to guide the search: an estimate of the distance/cost from node n to a goal node.
• This estimate is called a search heuristic.
More formally
Def.: A search heuristic h(n) is an estimate of the cost of the optimal (cheapest) path from node n to a goal node.
(Figure: three nodes n1, n2, n3 with estimates h(n1), h(n2), h(n3).)
• h can be extended to paths: h(⟨n0, …, nk⟩) = h(nk)
• h(n) uses only readily obtainable information (that is easy to compute) about a node
Example: finding routes
• What could we use as h(n)?

Example: finding routes
• What could we use as h(n)? E.g., the straight-line (Euclidean) distance between the source and goal node
Example 2
Search problem: the robot has to find a route from start to goal location on a grid with obstacles
Actions: move up, down, left, right from tile to tile
Cost: number of moves
Possible h(n)?
(Figure: a 4×6 grid with goal G.)
Example 2
Search problem: the robot has to find a route from start to goal location on a grid with obstacles
Actions: move up, down, left, right from tile to tile
Cost: number of moves
Possible h(n)? Manhattan distance (L1 distance) between two points: the sum of the (absolute) differences of their coordinates
(Figure: a 4×6 grid with goal G.)
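The Manhattan distance heuristic is a one-liner; the particular coordinates below (tiles as (column, row) pairs) are an illustrative assumption about how the grid is encoded.

```python
# Sketch of the Manhattan (L1) distance heuristic on the grid:
# the sum of the absolute differences of the two points' coordinates.
def manhattan(node, goal):
    (x1, y1), (x2, y2) = node, goal
    return abs(x1 - x2) + abs(y1 - y2)

# e.g., from tile (1, 1) to a goal at (6, 4): 5 + 3 = 8 moves at least
h = manhattan((1, 1), (6, 4))
```

Since every action moves exactly one tile up, down, left, or right, no path can take fewer moves than this estimate.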
Best First Search (BestFS)
• Idea: always choose the path on the frontier with the smallest h value.
• BestFS treats the frontier as a priority queue ordered by h.
• Greedy approach: expand the path whose last node seems closest to the goal, i.e., choose the solution that is locally the best.
Let's see how this works in AIspace: in the Search Applet toolbar
• select the "Vancouver Neighborhood Graph" problem
• set "Search Options -> Search Algorithms" to "Best-First"
• select "Show Node Heuristics" under "View"
• compare the number of nodes expanded by BestFS and LCFS

Analysis of Best First Search
• It is not complete (see the "misleading heuristics demo" example in AIspace)
• nor optimal: try the AIspace example "ex-best.txt" from the schedule page (save it and then load it using the "load from file" option)
• It still has worst-case time and space complexity of O(b^m)
Why would one want to use Best First Search then?
• Because if the heuristic is good, it can find the solution very fast.
• See this in AIspace: the Delivery problem graph with C1 linked to o123 (cost 3.0)
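The greedy, h-ordered frontier can be sketched as a priority queue keyed on h alone; the graph and heuristic values below are hypothetical, chosen so that the greedy choice happens to pay off.

```python
import heapq

# Sketch of greedy best-first search: the frontier is a priority queue
# ordered only by the heuristic value h of each path's last node.
def best_first_search(graph, start, is_goal, h):
    frontier = [(h(start), [start])]        # entries are (h value, path)
    while frontier:
        _, path = heapq.heappop(frontier)   # smallest h first (greedy)
        if is_goal(path[-1]):
            return path
        for n in graph.get(path[-1], []):
            heapq.heappush(frontier, (h(n), path + [n]))
    return None

# Hypothetical graph and heuristic values:
graph = {"S": ["A", "B"], "A": ["G"], "B": ["G"], "G": []}
h = {"S": 3, "A": 1, "B": 2, "G": 0}.get
found = best_first_search(graph, "S", lambda n: n == "G", h)
```

Note that path costs never enter the ordering, which is exactly why a misleading h can send the search down long or even infinite paths.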
• Thus, having estimates of the distance to the goal can speed things up a lot, but by itself it can also mislead the search (i.e., Best First Search)
• On the other hand, taking only path costs into account allows LCFS to find the optimal solution, but the search process is still uninformed as far as distance to the goal goes.
• We will see how to leverage these two elements to get a more powerful search algorithm, A*
What’s Next?
• A* search takes into account both
  • the cost of the path to a node, c(p)
  • the heuristic value of that path, h(p)
• Let f(p) = c(p) + h(p)
  • f(p) is an estimate of the cost of a path from the start to a goal via p
A* Search
(Figure: a path from the start to p contributes c(p); the estimate from p onward to the goal is h(p); together they give f(p).)
A* always chooses the path on the frontier with the lowest estimated distance from the start to a goal node constrained to go via that path.
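A* is LCFS with the priority changed from c(p) to f(p) = c(p) + h(p). A minimal sketch, where the graph (with arc costs) and the heuristic values are illustrative assumptions; the h used here is admissible for this graph:

```python
import heapq

# Sketch of A*: the frontier is a priority queue ordered by
# f(p) = cost(p) + h(last node of p).
def a_star(graph, start, is_goal, h):
    frontier = [(h(start), 0, [start])]     # entries are (f, cost, path)
    while frontier:
        f, cost, path = heapq.heappop(frontier)   # lowest f first
        if is_goal(path[-1]):
            return cost, path
        for n, arc_cost in graph.get(path[-1], []):
            new_cost = cost + arc_cost
            heapq.heappush(frontier, (new_cost + h(n), new_cost, path + [n]))
    return None

# Hypothetical graph: via A costs 6 total, via B costs 5 total.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)], "G": []}
h = {"S": 4, "A": 5, "B": 1, "G": 0}.get   # an admissible h for this graph
result = a_star(graph, "S", lambda n: n == "G", h)
```

Keeping the actual cost alongside f in each frontier entry avoids re-deriving it when a path is expanded.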
A* is complete (finds a solution, if one exists) and optimal (finds the optimal path to a goal) if:
• the branching factor is finite
• arc costs are > 0
• h(n) is admissible, i.e., an underestimate of the cost of the shortest path from n to a goal node.
This property of A* is called admissibility of A*.
Admissibility of A*
Admissibility of a heuristic
Def.: Let c(n) denote the cost of the optimal path from node n to any goal node. A search heuristic h(n) is called admissible if h(n) ≤ c(n) for all nodes n, i.e., if for all nodes it is an underestimate of the cost to any goal.
• Example: is the straight-line distance (SLD) admissible?
• Example: is the straight-line distance admissible?
  - Yes! The shortest distance between two points is a straight line.
Example 2: grid world
• Search problem: the robot has to find a route from start to goal location G on a grid with obstacles
• Actions: move up, down, left, right from tile to tile
• Cost: number of moves
• Possible h(n)?
  - Manhattan distance (L1 distance) to the goal G: the sum of the (absolute) differences of their coordinates
  - Admissible? YES
(Figure: a 4×6 grid with goal G.)
Example 3: Eight Puzzle
• One possible h(n): the number of misplaced tiles
• Is this heuristic admissible? YES
Example 3: Eight Puzzle
• Another possible h(n): the sum of the number of moves between each tile's current position and its goal position
• Is this heuristic admissible? YES
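The two 8-puzzle heuristics can be sketched as follows. Representing a board as a tuple of 9 entries read row by row, with 0 for the blank, and the particular goal layout are illustrative assumptions.

```python
# Sketch of the two 8-puzzle heuristics on boards encoded as 9-tuples
# read row by row, with 0 standing for the blank.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced_tiles(board):
    # h1: number of tiles (the blank does not count) not in their goal spot
    return sum(1 for i, v in enumerate(board) if v != 0 and v != GOAL[i])

def manhattan_sum(board):
    # h2: sum of each tile's Manhattan distance to its goal position
    total = 0
    for i, v in enumerate(board):
        if v == 0:
            continue
        g = GOAL.index(v)
        total += abs(i // 3 - g // 3) + abs(i % 3 - g % 3)
    return total

# One move slides a single tile one square, so each heuristic is a
# lower bound on the number of moves needed: both are admissible.
board = (1, 2, 3, 4, 5, 6, 0, 7, 8)   # tiles 7 and 8 shifted left one square
```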
How to Construct an Admissible Heuristic
• Identify a relaxed version of the problem:
  - where one or more constraints have been dropped
  - a problem with fewer restrictions on the actions
• Grid world: …………………
• Driver in Romania: …………………
• 8 puzzle:
  - "number of misplaced tiles": …………………
  - "sum of moves between current and goal position": …………………
• Why does this lead to an admissible heuristic?
  - …………………
How to Construct an Admissible Heuristic
• Identify a relaxed version of the problem:
  - where one or more constraints have been dropped
  - a problem with fewer restrictions on the actions
• Grid world: the agent can move through walls
• Driver in Romania: the agent can move in a straight line
• 8 puzzle:
  - "number of misplaced tiles": tiles can move anywhere and occupy the same spot as others
  - "sum of moves between current and goal position": tiles can occupy the same spot as others
• Why does this lead to an admissible heuristic?
  - The problem only gets easier!
Now back to Admissibility of A*
Why is A* admissible: complete
A* finds a solution if there is one (it does not get caught in cycles).
• Let f_min be the cost of the (an) optimal solution path s (unknown, but finite if a solution exists)
• Let c_min > 0 be the minimal cost of any arc
• Each sub-path p of s has f(p) ≤ f_min, due to admissibility
• A* expands the path on the frontier with minimal f
  - there is always a sub-path of s on the frontier
  - so A* only expands paths p with f(p) ≤ f_min
  - it terminates when it expands s itself
• Because arc costs are positive, the cost of any other path p would eventually exceed f_min, at depth no greater than f_min / c_min
See how this works on the "misleading heuristic" problem in AI Space.
Why is A* admissible: optimal
• Let p* be the optimal solution path, with cost c*.
• Let p' be a suboptimal solution path. That is, c(p') > c*.
• Let p'' be a sub-path of p* on the frontier.
We are going to show that any sub-path p'' of p* on the frontier will be expanded before p'. Therefore, A* will find p* before p'.
• We know that f(p*) < f(p'), because at a goal node f(goal) = c(goal), so f(p') = c(p') > c* = f(p*).
• And f(p'') ≤ f(p*), because h is admissible.
• Thus f(p'') < f(p'): any sub-path of the optimal solution path will be expanded before p'.
Analysis of A*
In fact, we can prove something even stronger about A* (when it is admissible):
A* is optimally efficient among the algorithms that extend the search path from the initial state.
It finds the goal with the minimum number of expansions.
Why A* is Optimally Efficient
No other optimal algorithm is guaranteed to expand fewer nodes than A*.
This is because any algorithm that does not expand every node with f(n) < f* risks missing the optimal solution.
Time and Space Complexity of A*
• Time complexity is O(b^m)
  • the heuristic could be completely uninformative and the edge costs could all be the same, meaning that A* does the same thing as BFS
• Space complexity is O(b^m): like BFS, A* maintains a frontier which grows with the size of the tree
Effect of Search Heuristic
• A search heuristic that is a better approximation of the actual cost reduces the number of nodes expanded by A*
Example, 8-puzzle:
(1) tiles can move anywhere (h1: number of tiles that are out of place)
(2) tiles can move to any adjacent square (h2: sum of the number of squares that separate each tile from its correct position)
Average number of paths expanded (d = depth of the solution):
d=12: IDS = 3,644,035 paths; A*(h1) = 227 paths; A*(h2) = 73 paths
d=24: IDS = too many paths; A*(h1) = 39,135 paths; A*(h2) = 1,641 paths
h2 dominates h1, because h2(n) ≥ h1(n) for every n
Learning Goals
Apply basic properties of search algorithms:
- completeness, optimality, time and space complexity

Algorithm                                | Complete           | Optimal         | Time   | Space
DFS                                      | N (Y if no cycles) | N               | O(b^m) | O(mb)
BFS                                      | Y                  | Y               | O(b^m) | O(b^m)
IDS                                      | Y                  | Y               | O(b^m) | O(mb)
LCFS (when arc costs available)          | Y (costs > 0)      | Y (costs ≥ 0)   | O(b^m) | O(b^m)
Best First (when h available)            | N                  | N               | O(b^m) | O(b^m)
A* (when arc costs > 0 and h admissible) | Y                  | Y               | O(b^m) | O(b^m)
Branch-and-Bound Search
• What allows A* to do better than the other search algorithms we have seen?
• What is the biggest problem with A*?
• Possible solution:
Branch-and-Bound Search
One way to combine DFS with heuristic guidance
• Follows exactly the same search path as depth-first search
• But to ensure optimality, it does not stop at the first solution found
• It continues after recording an upper bound on solution cost
  • upper bound: UB = cost of the best solution found so far
  • initialized to ∞ or any overestimate of the optimal solution cost
• When a path p is selected for expansion:
  • compute lower bound LB(p) = f(p) = cost(p) + h(p)
  • if LB(p) ≥ UB, remove p from the frontier without expanding it (pruning)
  • else expand p, adding all of its neighbors to the frontier
• Requires admissible h
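The bullets above can be sketched in Python roughly as follows; the graph, heuristic values, and function name are toy assumptions for illustration, not part of the lecture material.

```python
import math

def branch_and_bound(graph, h, start, goal):
    best_cost, best_path = math.inf, None        # UB initialized to infinity
    frontier = [(0, [start])]                    # stack of (cost so far, path)
    while frontier:
        cost, path = frontier.pop()              # LIFO => depth-first order
        node = path[-1]
        if cost + h[node] >= best_cost:
            continue                             # prune: LB(p) >= UB
        if node == goal:                         # better solution found:
            best_cost, best_path = cost, path    # tighten the upper bound
            continue
        for nbr, c in graph.get(node, []):
            if nbr not in path:                  # cycle check along the path
                frontier.append((cost + c, path + [nbr]))
    return best_path, best_cost

graph = {'s': [('a', 1), ('b', 4)], 'a': [('b', 1), ('g', 6)], 'b': [('g', 2)]}
h = {'s': 0, 'a': 0, 'b': 0, 'g': 0}             # h = 0: pruning on cost alone
print(branch_and_bound(graph, h, 's', 'g'))      # -> (['s', 'a', 'b', 'g'], 4)
```

Note how the first solution found (cost 6 via 's'-'b'-'g' in depth-first order) only sets the bound; the search keeps going and later replaces it with the cheaper path of cost 4.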
Example
• Arc cost = 1, h(n) = 0 for every n
• Upper Bound (UB) = ∞ initially
• Before expanding a path p, check its f-value f(p): expand only if f(p) < UB
• When the first solution is found, at cost 5: set UB = 5; every frontier path whose f-value reaches 5 is pruned
• When a better solution is found, at cost 3: set UB = 3; every remaining path whose f-value reaches 3 is pruned
Branch-and-Bound Analysis
• Complete? Not in general
  • Yes if there are no cycles (same as DFS), or if one can define a suitable finite initial UB
• Optimal: YES
• Time complexity: O(b^m)
• Space complexity: O(mb), linear like DFS (only the current path and its unexpanded siblings are stored)
Other A* Enhancements
• The main problem with A* is that it uses exponential space. Branch and bound is one way around this problem, but it can still get caught in infinite cycles or very long paths.
• Others?
  • Iterative deepening A*
  • Memory-bounded A*
• There are also ways to speed up the search via cycle checking and multiple path pruning
• Study them in the textbook and slides
Iterative Deepening A* (IDA*)
• B&B can still get stuck in infinite (or extremely long) paths
• Search depth-first, but to a fixed bound, as we did for Iterative Deepening
  • if you don't find a solution, increase the bound and try again
  • here "depth" is measured in f(n), not in number of arcs
• The bound of the depth-bounded depth-first searches is in terms of f(n)
  • starts at f(s) = h(s), where s is the start node
  • whenever the depth-bounded search fails unnaturally (i.e., the bound was reached), start a new search with the bound set to the smallest f-cost of any node that exceeded the cutoff in the previous iteration
• Expands the same nodes as A*, but re-computes them using DFS instead of storing them
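A minimal Python sketch of the IDA* loop just described; the toy graph, heuristic values, and names are illustrative assumptions, not from the slides.

```python
import math

def ida_star(graph, h, start, goal):
    def dfs(path, cost, bound):
        node = path[-1]
        f = cost + h[node]
        if f > bound:
            return f, None             # report the f-value that overflowed
        if node == goal:
            return f, path
        next_bound = math.inf          # smallest f-value beyond the cutoff
        for nbr, c in graph.get(node, []):
            if nbr not in path:        # cycle check along the current path
                t, found = dfs(path + [nbr], cost + c, bound)
                if found:
                    return t, found
                next_bound = min(next_bound, t)
        return next_bound, None

    bound = h[start]                   # first bound: f(start) = h(start)
    while True:
        bound, found = dfs([start], 0, bound)
        if found:
            return found
        if bound == math.inf:
            return None                # frontier exhausted: no solution

graph = {'s': [('a', 1), ('b', 4)], 'a': [('b', 1), ('g', 6)], 'b': [('g', 2)]}
h = {'s': 3, 'a': 2, 'b': 2, 'g': 0}  # admissible for this toy graph
print(ida_star(graph, h, 's', 'g'))   # -> ['s', 'a', 'b', 'g']
```

Only the current path is ever stored; everything outside it is re-computed on the next iteration, which is exactly the time-for-space trade the slide describes.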
Analysis of Iterative Deepening A* (IDA*)
• Complete and optimal? Yes, under the same conditions as A*:
  • h is admissible
  • all arc costs > 0
  • finite branching factor
• Time complexity: O(b^m)
  • same argument as for Iterative Deepening DFS
• Space complexity: O(mb)
  • same argument as for Iterative Deepening DFS
  • but the cost of re-computing levels can be a problem in practice, if arc costs are real-valued (each iteration may raise the bound only slightly)
Memory-bounded A*
• Iterative deepening A* and B&B use little memory
• What if we have some more memory (but not enough for regular A*)?
• Do A* and keep as much of the frontier in memory as possible
• When running out of memory:
  • delete the worst path (highest f-value) from the frontier
  • back its value up to a common ancestor
  • that subtree gets regenerated only when all other paths have been shown to be worse than the "forgotten" path
Memory-bounded A*
• Details of the algorithm are beyond the scope of this course, but:
• It is complete if the solution is at a depth manageable by the available memory
• Optimal under the same conditions
  • otherwise it returns the best solution reachable within the available memory
• Often used in practice; it is considered one of the best algorithms for finding optimal solutions
• It can be bogged down by having to switch back and forth among a set of candidate solution paths, of which only a few fit in memory
Recap (Must Know How to Fill This In)
For each of DFS, BFS, IDS, LCFS, Best First, A*, B&B, IDA*, and MBA*: selection strategy, completeness, optimality, time and space complexity.
Algorithms Often Used in Practice

Algorithm    Selection        Complete  Optimal  Time    Space
DFS          LIFO             N         N        O(b^m)  O(mb)
BFS          FIFO             Y         Y        O(b^m)  O(b^m)
IDS          LIFO             Y         Y        O(b^m)  O(mb)
LCFS         min cost         Y**       Y**      O(b^m)  O(b^m)
Best First   min h            N         N        O(b^m)  O(b^m)
A*           min f            Y**       Y**      O(b^m)  O(b^m)
B&B          LIFO + pruning   N         Y        O(b^m)  O(mb)
IDA*         LIFO             Y         Y        O(b^m)  O(mb)
MBA*         min f            Y**       Y**      O(b^m)  O(b^m)

** Needs conditions
Cycle Checking and Multiple Path Pruning
• Cycle checking: good when we want to avoid infinite loops, but also want to find more than one solution, if they exist
• Multiple path pruning: good when we only care about finding one solution
  • subsumes cycle checking
State space graph vs. search tree
[figure: a state space graph over nodes a, b, c, d, f, h, k, z with cycles, and the search tree that unfolds it]
• State space graph: represents the states in a given search problem, and how they are connected by the available operators
• Search tree: shows how the search space is traversed by a given search algorithm; it explicitly "unfolds" the paths that are expanded
• If there are cycles or multiple paths, the two look very different
Size of state space vs. search tree
• If there are cycles or multiple paths, the two look very different
[figure: a 4-state space over A, B, C, D with multiple parents, and the much larger search tree it unfolds into]
• With cycles or multiple parents, the search tree can be exponential in the state space
  - e.g., a state space with 2 actions from each state to the next
  - with d + 1 states, the search tree has depth d
• 2^d possible paths through the search space => exponentially larger search tree!
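A tiny Python sketch of this blow-up, assuming the example state space above: a chain of d + 1 states where each state has 2 distinct actions leading to the next one.

```python
# Count the paths a naive tree search would enumerate from state 0 to state d,
# when two different actions both lead from each state to the next one.
def count_paths(state, d):
    if state == d:
        return 1
    # action 1 and action 2 both reach state + 1, but as different paths
    return count_paths(state + 1, d) + count_paths(state + 1, d)

d = 10
print(d + 1, count_paths(0, d))  # -> 11 1024  (d+1 states, 2^d paths)
```

Eleven states, but 2^10 = 1024 distinct paths: without pruning, the search tree, not the state space, determines the work done.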
Cycle Checking
• What is the computational cost of cycle checking?
• You can prune a node n that is already on the path from the start node to n
• This pruning cannot remove an optimal solution => cycle check
• Using depth-first methods, with the graph explicitly stored, this can be done in constant time
  - only one path is being explored at a time
• Other methods: cost is linear in the path length
  - (check each node in the path)
Cycle Checking (in AIspace)
• See how DFS and BFS behave when Search Options -> Pruning -> Loop detection is selected
• Set N1 to be a normal node so that there is only one start node
• Check, for each algorithm, what happens during the first expansion from node 3 to node 2
Multiple Path Pruning
• If we only want one path to the solution
• Can prune the path to a node n that has already been reached via a previous path
  - store S := {all nodes n that have been expanded}
  - for a newly expanded path p = (n1, …, nk, n), check whether n ∈ S
  - subsumes cycle check
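A minimal sketch of this pruning applied to BFS, on an assumed toy graph (all names are illustrative, not from the slides).

```python
from collections import deque

def bfs_with_pruning(graph, start, goal):
    expanded = set()                     # S = {all nodes already expanded}
    frontier = deque([[start]])          # FIFO queue of paths => BFS
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node in expanded:
            continue                     # a previous path reached node first
        expanded.add(node)
        if node == goal:
            return path
        for nbr in graph.get(node, []):
            frontier.append(path + [nbr])
    return None

# two paths lead from 's' to 'c'; only the first one reaching 'c' is expanded
graph = {'s': ['a', 'b'], 'a': ['c'], 'b': ['c'], 'c': ['g']}
print(bfs_with_pruning(graph, 's', 'g'))  # -> ['s', 'a', 'c', 'g']
```

Since a node already in S is also on any cycle back to itself, this check subsumes cycle checking, as the bullets above note.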
Multiple Path Pruning (in AIspace)
• See how it works by running BFS on the Cyclic Graph Example in AIspace
• See how it handles the multiple paths from N0 to N2
• You can erase start node N1 to simplify things
Multiple-Path Pruning & Optimal Solutions
• Problem: what if a subsequent path to n is shorter than the first path to n, and we want an optimal solution?
• Can remove all paths from the frontier that use the longer path: these can't be optimal
• Can change the initial segment of the paths on the frontier to use the shorter path
• Or…
[figure: a small graph with arc costs 2, 2 and 1, 1, 1, giving two paths of different cost to the same node]
Search in Practice

Algorithm    Complete  Optimal  Time        Space
DFS          N         N        O(b^m)      O(mb)
BFS          Y         Y        O(b^m)      O(b^m)
IDS          Y         Y        O(b^m)      O(mb)
LCFS         Y         Y        O(b^m)      O(b^m)
Best First   N         N        O(b^m)      O(b^m)
A*           Y         Y        O(b^m)      O(b^m)
B&B          N         Y        O(b^m)      O(mb)
IDA*         Y         Y        O(b^m)      O(mb)
MBA*         N         N        O(b^m)      O(b^m)
BDS          Y         Y        O(b^(m/2))  O(b^(m/2))
Search in Practice (cont')
[decision flowchart: pick an algorithm by asking, in turn, "Many paths to solution, no ∞ paths?", "Informed?", and "Large branching factor?"; the candidate algorithms are IDS, B&B, IDA*, and MBA*]
Sample A* applications
• An Efficient A* Search Algorithm for Statistical Machine Translation. 2001
• The Generalized A* Architecture. Journal of Artificial Intelligence Research (2007)
  • machine vision: "Here we consider a new compositional model for finding salient curves."
• Factored A* Search for Models over Sequences and Trees. International Conference on AI, 2003
  • it starts by saying: "The primary challenge when using A* search is to find heuristic functions that simultaneously are admissible, close to actual completion costs, and efficient to calculate…"
  • applied to NLP and bioinformatics
Lecture Summary
• Search is a key computational mechanism in many AI agents
• We studied the basic principles of search on the simple deterministic planning agent model
• Generic search approach:
  • define a search space graph
  • start from the current state
  • incrementally explore paths from the current state until a goal state is reached
• The way in which the frontier is expanded defines the search strategy
Learning Goals for Search
• Identify real-world examples that make use of deterministic, goal-driven search agents
• Assess the size of the search space of a given search problem
• Implement the generic solution to a search problem
• Apply basic properties of search algorithms: completeness, optimality, time and space complexity
• Select the most appropriate search algorithm for specific problems
• Define/read/write/trace/debug the different search algorithms we covered
• Construct heuristic functions for specific search problems
• Formally prove A* optimality
• Define "optimally efficient"
TODO for next Tuesday
• Do all the "Graph Searching exercises" available at http://www.aispace.org/exercises.shtml
  Please look at the solutions only after you have tried hard to solve them!
• Make sure you can access the class discussion forum in Connect
• Read Ch. 4 of the textbook