159.302 Lecture 12. Last time: CSPs, backtracking, forward checking. Today: Game Playing


Upload: vernon-lamb

Post on 15-Jan-2016



Lecture 12

Last time: CSPs, backtracking, forward checking

Today: Game Playing


Types of Games

                        Deterministic           Chance

Perfect information     Chess, Checkers, Go     Backgammon, Monopoly

Imperfect information   Battleships             Bridge, Poker, Scrabble


Two player game

Two players: MAX and MIN

MAX moves first and they take turns until the game is over.

Properties:
Initial state: e.g. the board configuration in chess.
Successor function: a list of (move, state) pairs specifying the legal moves.
Terminal test: is the game finished?
Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0).

MAX uses a search tree to determine next move.
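The four properties above can be sketched for noughts and crosses as follows. This is only an illustration under stated assumptions: the board is a tuple of nine cells, MAX plays "X" and moves first, and every name here (`INITIAL_STATE`, `successors`, `terminal_test`, `utility`) is hypothetical rather than taken from the lecture.

```python
# Assumed representation: a board is a tuple of 9 cells, each "X", "O" or " ".
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

INITIAL_STATE = (" ",) * 9                  # initial state: empty board


def player_to_move(board):
    """MAX ("X") moves first, so X moves whenever the counts are equal."""
    return "X" if board.count("X") == board.count("O") else "O"


def winner(board):
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def successors(board):
    """Successor function: list of (move, state) pairs for the legal moves."""
    mark = player_to_move(board)
    return [(i, board[:i] + (mark,) + board[i + 1:])
            for i, cell in enumerate(board) if cell == " "]


def terminal_test(board):
    """Terminal test: someone has won, or the board is full."""
    return winner(board) is not None or " " not in board


def utility(board):
    """Utility from MAX's point of view: win +1, lose -1, draw 0."""
    w = winner(board)
    return 1 if w == "X" else -1 if w == "O" else 0
```

From the empty board, `successors(INITIAL_STATE)` yields nine (move, state) pairs, one per empty cell.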


Game tree for noughts and crosses


What is the Optimal Strategy?

Assume MIN plays perfectly

For each node, assign a value given by:
the utility function, if it is a terminal node
the minimum of the successor nodes, if it is a min node
the maximum of the successor nodes, if it is a max node


Minimax Algorithm

For each move by the computer:
1. Perform depth-first search as far as the terminal state
2. Assign utilities at each terminal state
3. Propagate upwards the minimax choices
   If the parent is a minimizer (opponent): propagate up the minimum value of the children
   If the parent is a maximizer (computer): propagate up the maximum value of the children
4. Choose the move (the child of the current node) corresponding to the maximum of the minimax values of the children


Minimax Algorithm

int MINIMAX(N) {
  if N is a leaf then
    return the score of this leaf
  else
    Let N1, N2, .., Nm be the successors of N;
    if N is a Min node then
      return min{MINIMAX(N1), .., MINIMAX(Nm)}
    else
      return max{MINIMAX(N1), .., MINIMAX(Nm)}
}


Example


A Real Game

Start with a stack of coins.
Each player divides one of the current stacks into two unequal stacks (one having more coins than the other).
The game ends when every stack contains one or two coins.
The first player who cannot play loses.


Game Tree

7                                              Min's turn

6, 1     5, 2     4, 3                         Max's turn

5, 1, 1     4, 2, 1     3, 2, 2     3, 3, 1    Min's turn

4, 1, 1, 1     3, 2, 1, 1     2, 2, 2, 1       Max's turn
                              (MAX loses)

3, 1, 1, 1, 1     2, 2, 1, 1, 1                Min's turn
                  (MIN loses)

2, 1, 1, 1, 1, 1                               Max's turn
(MAX loses)
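The coin game is small enough to solve completely with minimax. The sketch below is one possible encoding, not the lecture's own code: a position is assumed to be a tuple of stack sizes in descending order, MIN moves first as in the tree for 7 coins, and the names `successors` and `minimax` are illustrative.

```python
from functools import lru_cache


def successors(stacks):
    """Positions reachable by splitting one stack into two unequal stacks."""
    result = set()
    for i, n in enumerate(stacks):
        rest = stacks[:i] + stacks[i + 1:]
        # small < n - small guarantees the two new stacks are unequal
        for small in range(1, (n - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (small, n - small), reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def minimax(stacks, max_to_move):
    """+1 if MAX wins with perfect play from this position, -1 if MIN wins."""
    children = successors(stacks)
    if not children:                    # the player to move cannot play and loses
        return -1 if max_to_move else +1
    values = [minimax(child, not max_to_move) for child in children]
    return max(values) if max_to_move else min(values)


print(successors((7,)))                 # the three first moves from a stack of 7
print(minimax((7,), max_to_move=False))
```

With MIN moving first from a single stack of 7, the computed value is +1: whichever first move MIN makes, MAX can reply with 4, 2, 1, which forces the line through 3, 2, 1, 1 to 2, 2, 1, 1, 1, where MIN cannot play.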


Complexity

Time complexity: O(b^m)

Space complexity: O(bm)

where b is the branching factor and m is the maximum depth.

For chess, b = 35 and m = 100 approximately, so a complete search is not feasible.
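To see why the chess figures rule out a complete search, the size of b^m can be worked out directly. This is a small arithmetic check using the approximate values b = 35 and m = 100 quoted above; the variable names are only for illustration.

```python
import math

b, m = 35, 100                      # approximate figures for chess from the slide

nodes = b ** m                      # order of the number of nodes minimax would visit
exponent = m * math.log10(b)        # nodes is roughly 10 ** exponent

print(f"b^m has {len(str(nodes))} digits, i.e. about 10^{exponent:.0f} nodes")
print(f"depth-first search holds only about b*m = {b * m} nodes at a time")
```

The time cost is therefore on the order of 10^154 node visits, while the space needed by depth-first search stays in the thousands of nodes.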


Evaluation functions

It is not often practical to search all the way to the terminal states.

Use a heuristic evaluation function to estimate which moves lead to better terminal states.

A cutoff test must be used to limit the search.

Choosing a good evaluation function is very important to the efficiency of this method.

The evaluation function must agree with the utility function for terminal states, and it must be a good predictor of the terminal values. If the evaluation function is infallible then no search is necessary: just choose the move that gives the best position. The better the evaluation function, the less search needs to be done.
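As an illustration of these requirements, here is a sketch of a material-count evaluation function for chess. Everything in it is an assumption made for the example: the conventional piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9), the piece-count dictionaries, and the `result` argument that marks a finished game. It returns exactly the utility (+1, -1, 0) for terminal states and a value strictly between -1 and +1 otherwise.

```python
# Conventional piece values, assumed here purely for illustration.
PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}
START_MATERIAL = 39   # 8 pawns + 2 knights + 2 bishops + 2 rooks + 1 queen


def material(pieces):
    """Total value of a side's pieces, given a {piece name: count} mapping."""
    return sum(PIECE_VALUES[name] * count for name, count in pieces.items())


def evaluate(max_pieces, min_pieces, result=None):
    """Estimate a position from MAX's point of view.

    `result` is "win", "loss" or "draw" for a finished game, else None.
    """
    if result is not None:
        # Terminal state: agree with the utility function exactly.
        return {"win": 1.0, "loss": -1.0, "draw": 0.0}[result]
    balance = (material(max_pieces) - material(min_pieces)) / START_MATERIAL
    # Keep estimates strictly inside (-1, +1) so a real win always ranks higher.
    return max(-0.99, min(0.99, balance))


print(evaluate({"queen": 1, "pawn": 2}, {"rook": 1, "pawn": 2}))
```

A position where MAX has a queen against a rook scores slightly above zero, which a cutoff search can use in place of the unknown terminal value.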


Cutting off Search

Fixed depth: OK if we know how long it will take to evaluate the tree.

Iterative deepening: good if there is a time limit; just keep going until time is up and use the best move found so far.
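Both cutoff strategies can be sketched together: a depth-limited minimax, and an iterative-deepening loop that reruns it with increasing depth until a time limit expires. The function names, the callback parameters `successors` and `evaluate`, and the toy tree are all assumptions made for this example.

```python
import time


def depth_limited_value(state, depth, maximizing, successors, evaluate):
    """Minimax value of `state`, cutting the search off after `depth` plies."""
    children = successors(state)
    if depth == 0 or not children:      # cutoff test: depth limit or terminal state
        return evaluate(state)
    values = [depth_limited_value(c, depth - 1, not maximizing, successors, evaluate)
              for c in children]
    return max(values) if maximizing else min(values)


def iterative_deepening_move(state, successors, evaluate, time_limit, max_depth=20):
    """Return MAX's best successor from the deepest search that completed."""
    deadline = time.monotonic() + time_limit
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break                       # time is up: keep the best move so far
        children = successors(state)
        if not children:
            break
        best = max(children, key=lambda c: depth_limited_value(
            c, depth - 1, False, successors, evaluate))
    return best


# Toy tree with made-up scores; inner nodes get a neutral estimate of 0.
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
SCORES = {"D": 3, "E": 5, "F": 6, "G": 9}
chosen = iterative_deepening_move("A", lambda s: TREE.get(s, []),
                                  lambda s: SCORES.get(s, 0), time_limit=1.0)
print(chosen)
```

At depth 1 both moves look equal, and from depth 2 onwards the search prefers C. Note that this sketch only checks the clock between iterations, so a single deep iteration can overrun the limit; a stricter version would also check the deadline inside the recursion.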


Can we do better?

Yes: minimax examines some branches that are already known to be bad.

Alpha-Beta pruning:
Keep track of the best value found so far for MAX (alpha) and the best value found so far for MIN (beta).
If the value of a min node drops to alpha or below, go no further: MAX already has an alternative at least as good.
If the value of a max node rises to beta or above, go no further: MIN already has an alternative at least as good.


Alpha-Beta Algorithm

int MAX-VALUE (state, alpha, beta) {
  if CUTOFF-TEST (state) then return EVAL (state)
  v = -MAXVAL
  for each s in SUCCESSORS (state) {
    v = MAX (v, MIN-VALUE (s, alpha, beta))
    if (v >= beta) return v
    if (v > alpha) alpha = v
  }
  return v
}

int MIN-VALUE (state, alpha, beta) {
  if CUTOFF-TEST (state) then return EVAL (state)
  v = MAXVAL
  for each s in SUCCESSORS (state) {
    v = MIN (v, MAX-VALUE (s, alpha, beta))
    if (v <= alpha) return v
    if (v < beta) beta = v
  }
  return v
}
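A runnable sketch of the two routines, merged into one function, is given below. It assumes the tree is given as nested lists with integer leaves, so the cutoff test is simply "is this a leaf" and the evaluation is the leaf value. The example tree uses the 27 leaf values from the ordered tree in the effectiveness slide later in this lecture, and the `visited` list is an addition for counting static evaluations.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta value of `node`; every evaluated leaf is appended to `visited`."""
    if isinstance(node, int):            # cutoff test: a leaf is evaluated directly
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:                # MIN would never allow this line
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                   # MAX already has something at least as good
            return v
        beta = min(beta, v)
    return v


# Depth 3, branching factor 3: MAX root, three MIN nodes, nine MAX nodes.
tree = [
    [[14, 15, 16], [17, 18, 19], [20, 21, 22]],
    [[13, 14, 15], [26, 27, 28], [29, 30, 31]],
    [[12, 13, 14], [35, 36, 37], [38, 39, 40]],
]
visited = []
value = alphabeta(tree, -math.inf, math.inf, True, visited)
print(value, len(visited))
```

On this tree the search returns 16 after evaluating 11 of the 27 leaves, whereas plain minimax would evaluate all 27.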


Alpha-Beta Pruning


Effectiveness of alpha-beta pruning.

What are the maximum savings possible?

Suppose the tree is ordered as follows (^ = max level, v = min level; * marks a leaf that is evaluated):

^ root
  v node 1
    ^ 14* 15* 16*
    ^ 17* 18  19
    ^ 20* 21  22
  v node 2
    ^ 13* 14* 15*
    ^ 26  27  28
    ^ 29  30  31
  v node 3
    ^ 12* 13* 14*
    ^ 35  36  37
    ^ 38  39  40

Only those nodes marked (*) need be evaluated.

How many static evaluations are needed? If b is the branching factor (3 above) and d is the depth (3 above):

s = 2b^{d/2} - 1                      if d is even
s = b^{(d+1)/2} + b^{(d-1)/2} - 1     if d is odd
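The two cases of the formula can be checked numerically. The helper below is a direct transcription of the formula; the function name is made up for this example.

```python
def best_case_evaluations(b, d):
    """Static evaluations needed by alpha-beta on a perfectly ordered tree."""
    if d % 2 == 0:
        return 2 * b ** (d // 2) - 1
    return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1


for b, d in [(3, 2), (3, 3), (3, 4)]:
    print(f"b={b} d={d}: best case {best_case_evaluations(b, d)}, minimax {b ** d}")
```

For b = 3 and d = 3 this gives 3^2 + 3^1 - 1 = 11 evaluations, against 3^3 = 27 leaves for plain minimax.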


For our tree d = 3, b = 3 and so s = 11.

This is only for the perfectly arranged tree. It gives a lower bound on the number of evaluations of approximately 2b^{d/2}.

The worst case is b^d (minimax).

In practice, for reasonable games, the complexity is O(b^{3d/4}). Using minimax with alpha-beta pruning allows us to look ahead about half as far again as without.