game playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of...
TRANSCRIPT
![Page 1: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/1.jpg)
slide 1
Game Playing
Xiaojin Zhu
Computer Sciences Department
University of Wisconsin, Madison
[based on slides from A. Moore http://www.cs.cmu.edu/~awm/tutorials , C. Dyer, and J. Skrentny]
![Page 2: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/2.jpg)
slide 2
Sadly, not these games…
![Page 3: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/3.jpg)
slide 3
Overview
• two-player zero-sum discrete finite deterministic game of perfect information
• Minimax search
• Alpha-beta pruning
• Large games
• two-player zero-sum discrete finite NON-deterministic
game of perfect information
![Page 4: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/4.jpg)
slide 4
Two-player zero-sum discrete finite deterministic games of perfect information
Definitions:
• Zero-sum: one player’s gain is the other player’s loss.
Does not mean fair.
• Discrete: states and decisions have discrete values
• Finite: finite number of states and decisions
• Deterministic: no coin flips, die rolls – no chance
• Perfect information: each player can see the
complete game state. No simultaneous decisions.
![Page 5: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/5.jpg)
slide 5
Which of these are: Two-player zero-sum discrete finite deterministic games of perfect information?
[Shamelessly copied from Andrew Moore]
Zero-sum: one player’s gain is the other player’s loss. Does not mean fair.
Discrete: states and decisions
have discrete values
Finite: finite number of states
and decisions
Deterministic: no coin flips, die
rolls – no chance
Perfect information: each
player can see the
complete game state. No
simultaneous decisions.
![Page 6: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/6.jpg)
slide 6
Which of these are: Two-player zero-sum discrete finite deterministic games of perfect information?
[Shamelessly copied from Andrew Moore]
Zero-sum: one player’s gain is the other player’s loss. Does not mean fair.
Discrete: states and decisions
have discrete values
Finite: finite number of states
and decisions
Deterministic: no coin flips, die
rolls – no chance
Perfect information: each
player can see the
complete game state. No
simultaneous decisions.
![Page 7: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/7.jpg)
slide 7
II-Nim: Max simple game
• There are 2 piles of sticks. Each pile has 2 sticks.
• Each player takes one or more sticks from one pile.
• The player who takes the last stick loses.
(ii, ii)
![Page 8: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/8.jpg)
slide 8
The game tree for II-Nim
(ii ii) Max
(i ii) Min (- ii) Min
(i i) Max (- ii) Max (- i) Max (- i) Max (- -) Max
+1
(- i) Min (- -) Min
-1 (- i) Min (- -) Min
-1 (- -) Min
-1
(- -) Max
+1 (- -) Max
+1
Symmetry
(i ii) = (ii i)
Convention: score is w.r.t. the first playerMax.Min’sscore=– Max
who is to move at this state
Two players:
Max and Min
Max wants the largest score
Min wants the smallest score
![Page 9: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/9.jpg)
slide 9
The game tree for II-Nim
(ii ii) Max
(i ii) Min (- ii) Min
(i i) Max (- ii) Max (- i) Max (- i) Max (- -) Max
+1
(- i) Min
+1 (- -) Min
-1 (- i) Min (- -) Min
-1 (- -) Min
-1
(- -) Max
+1 (- -) Max
+1
Symmetry
(i ii) = (ii i)
Convention: score is w.r.t. the first playerMax.Min’sscore=– Max
who is to move at this state
Two players:
Max and Min
Max wants the largest score
Min wants the smallest score
![Page 10: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/10.jpg)
slide 10
The game tree for II-Nim
(ii ii) Max
(i ii) Min (- ii) Min
(i i) Max
+1 (- ii) Max
+1 (- i) Max
-1 (- i) Max
-1 (- -) Max
+1
(- i) Min
+1 (- -) Min
-1
(- i) Min +1
(- -) Min -1
(- -) Min
-1
(- -) Max
+1 (- -) Max
+1
Symmetry
(i ii) = (ii i)
Convention: score is w.r.t. the first playerMax.Min’sscore=– Max
who is to move at this state
Two players:
Max and Min
Max wants the largest score
Min wants the smallest score
![Page 11: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/11.jpg)
slide 11
The game tree for II-Nim
(ii ii) Max
-1
(i ii) Min
-1
(- ii) Min
-1
(i i) Max
+1 (- ii) Max
+1 (- i) Max
-1 (- i) Max
-1 (- -) Max
+1
(- i) Min
+1 (- -) Min
-1
(- i) Min +1
(- -) Min -1
(- -) Min
-1
(- -) Max
+1 (- -) Max
+1
Symmetry
(i ii) = (ii i)
Convention: score is w.r.t. the first playerMax.Min’sscore=– Max
who is to move at this state
Two players:
Max and Min
Max wants the largest score
Min wants the smallest score
![Page 12: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/12.jpg)
slide 12
The game tree for II-Nim
(ii ii) Max
-1
(i ii) Min
-1
(- ii) Min
-1
(i i) Max
+1 (- ii) Max
+1 (- i) Max
-1 (- i) Max
-1 (- -) Max
+1
(- i) Min
+1 (- -) Min -1
(- i) Min +1
(- -) Min -1
(- -) Min -1
(- -) Max
+1 (- -) Max
+1
Symmetry
(i ii) = (ii i)
Convention: score is w.r.t. the first playerMax.Min’sscore=– Max
who is to move at this state
Two players:
Max and Min
Max wants the largest score
Min wants the smallest score
The first player always loses, if the
second player plays optimally
![Page 13: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/13.jpg)
slide 13
Game theoretic value
• Game theoretic value (a.k.a. minimax value) of a node = the score of the terminal node that will be reached if both players play optimally.
• = The numbers we filled in.
• Computed bottom up
In Max’s turn, take the max of the children (Max
will pick that maximizing action)
In Min’s turn, take the min of the children (Min will
pick that minimizing action)
• Implemented as a modified version of DFS: minimax
algorithm
![Page 14: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/14.jpg)
slide 14
Minimax algorithm
function Max-Value(s) inputs: s: current state in game, Max about to play output: best-score (for Max) available from s
if ( s is a terminal state ) then return ( terminal value of s ) else α := – for each s’ in Succ(s) α := max( α , Min-value(s’)) return α
function Min-Value(s) output: best-score (for Min) available from s
if ( s is a terminal state ) then return ( terminal value of s) else β := for each s’ in Succs(s) β := min( β , Max-value(s’)) return β
• Time complexity?
• Space complexity?
![Page 15: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/15.jpg)
slide 15
Minimax example
A
O
W -3
B
N 4
F G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
max
min
max
min
max
![Page 16: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/16.jpg)
slide 16
Minimax algorithm
function Max-Value(s) inputs: s: current state in game, Max about to play output: best-score (for Max) available from s
if ( s is a terminal state ) then return ( terminal value of s ) else α := – for each s’ in Succ(s) α := max( α , Min-value(s’)) return α
function Min-Value(s) output: best-score (for Min) available from s
if ( s is a terminal state ) then return ( terminal value of s) else β := for each s’ in Succs(s) β := min( β , Max-value(s’)) return β
• Time complexity? O(bm) bad
• Space complexity?
O(bm)
![Page 17: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/17.jpg)
slide 17
Against a dumber opponent?
• Max surely loses!
• If Min not optimal,
• Which way?
• Why?
(ii ii) Max
-1
(i ii) Min
-1
(- ii) Min
-1
(i i) Max
+1 (- ii) Max
+1 (- i) Max
-1 (- i) Max
-1 (- -) Max
+1
(- i) Min
+1 (- -) Min
-1
(- i) Min +1
(- -) Min -1
(- -) Min
-1
(- -) Max
+1 (- -) Max
+1
![Page 18: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/18.jpg)
slide 18
Next: alpha-beta pruning
Gives the same game theoretic values as minimax, but prunes part of the game tree.
"If you have an idea that is surely bad, don't take the time
to see how truly awful it is." -- Pat Winston
![Page 19: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/19.jpg)
slide 19
Alpha-Beta Motivation
S
A 100
C 200
D 100
B
E 120
F 20
max
min
• Depth-first order
• After returning from A, Max can get at least 100 at S
• After returning from F, Max can get at most 20 at B
• At this point, Max losts interest in B
• There is no need to explore G. The subtree at G is
pruned. Saves time.
G
![Page 20: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/20.jpg)
slide 20
Alpha-beta pruning
function Max-Value (s,α,β) inputs: s: current state in game, Max about to play α: best score (highest) for Max along path to s β: best score (lowest) for Min along path to s output: min(β , best-score (for Max) available from s)
if ( s is a terminal state ) then return ( terminal value of s ) else for each s’ in Succ(s) α := max( α , Min-value(s’,α,β)) if ( α ≥ β ) then return β /* alpha pruning */ return α
function Min-Value(s,α,β) output: max(α , best-score (for Min) available from s )
if ( s is a terminal state ) then return ( terminal value of s) else for each s’ in Succs(s) β := min( β , Max-value(s’,α,β))
if (α ≥ β ) then return α /* beta pruning */ return β
Starting from the root:
Max-Value(root, -, +)
![Page 21: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/21.jpg)
slide 21
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=- β=+
• Keep two bounds along the path
: the best Max can do
: the best (smallest) Min can do
• If at anytime exceeds , the remaining children are
pruned.
![Page 22: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/22.jpg)
slide 22
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=- β=+
α=- β=+
![Page 23: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/23.jpg)
slide 23
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=- β=+
α=- β=200
![Page 24: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/24.jpg)
slide 24
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=- β=+
α=- β=100
![Page 25: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/25.jpg)
slide 25
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=100 β=+
α=- β=100
α=100 β=+
![Page 26: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/26.jpg)
slide 26
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=100 β=+
α=- β=100
α=100 β=120
![Page 27: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/27.jpg)
slide 27
Alpha pruning example
S
A 100
C 200
D 100
B
E 120
F 20
max
min
G
α=100 β=+
α=- β=100
α=100 β=20
X
![Page 28: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/28.jpg)
slide 28
Beta pruning example
A
B 20
D 20
E -10
C 25
F -20
G 25
max
min
max
H
• Keep two bounds along the path
: the best Max can do
: the best (smallest) Min can do
• If at anytime exceeds , the remaining children are
pruned.
S
X
![Page 29: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/29.jpg)
slide 29
Yet another alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If at anytime exceeds , the remaining children are
pruned.
A A A α=-
O
W -3
B
N 4
F G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
max
– = –
+ = +
[Example from James Skrentny]
![Page 30: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/30.jpg)
slide 30
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
B B
B β=+
O
W -3
N 4
F G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
[Example from James Skrentny]
![Page 31: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/31.jpg)
slide 31
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
F F F α=-
O
W -3
B β=+
N 4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
[Example from James Skrentny]
![Page 32: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/32.jpg)
slide 32
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O
W -3
B β=+
N 4
F α=-
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
N 4
min
max
![Page 33: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/33.jpg)
slide 33
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
F α=-
F α=4
O
W -3
B β=+
N 4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
updated
![Page 34: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/34.jpg)
slide 34
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O O
O β=+
W -3
B β=+
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min
![Page 35: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/35.jpg)
slide 35
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=+
W -3
B β=+
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
K M H 3
I 8
J L 2
A α=-
max
min
W -3
min
V -9
![Page 36: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/36.jpg)
slide 36
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=+
O β=-3
W -3
B β=+
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min
![Page 37: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/37.jpg)
slide 37
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=+
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min X -5
Pruned!
![Page 38: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/38.jpg)
slide 38
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
B β=+
B β=4
O β=-3
W -3
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min X -5
![Page 39: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/39.jpg)
slide 39
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=4
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min X -5
G -5
![Page 40: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/40.jpg)
slide 40
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-
max
min
max
min X -5
G -5
![Page 41: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/41.jpg)
slide 41
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=-5
max
min
max
min X -5
G -5
![Page 42: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/42.jpg)
slide 42
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
C C
C β=+
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=
max
min
max
min X -5
A α=-5
![Page 43: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/43.jpg)
slide 43
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=+
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=
max
min
max
min X -5
A α=-5
H 3
![Page 44: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/44.jpg)
slide 44
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
C β=+
C β=3
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=
max
min
max
min X -5
A α=-5
![Page 45: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/45.jpg)
slide 45
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J L 2
A α=
max
min
max
min X -5
A α=-5
I 8
![Page 46: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/46.jpg)
slide 46
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
J J
J α=-5
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
L 2
A α=
max
min
min X -5
A α=-5
![Page 47: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/47.jpg)
slide 47
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J α=-5
L 2
A α=
max
min
max
min X -5
A α=-5
P 9
![Page 48: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/48.jpg)
slide 48
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
J α=-
J α=9
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
L 2
A α=
max
min
max
min X -5
A α=-5
Q -6
R 0
![Page 49: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/49.jpg)
slide 49
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
J α=-
J α=9
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
L 2
A α=
max
min
max
min X -5
A α=3
Q -6
R 0
![Page 50: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/50.jpg)
slide 50
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K M H 3
I 8
J α=9
L 2
A α=
max
min
max
min X -5
A α=3
![Page 51: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/51.jpg)
slide 51
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E β=2
D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K α=5
M H 3
I 8
J α=9
L 2
A α=
max
min
max
min X -5
A α=3
![Page 52: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/52.jpg)
slide 52
Alpha-beta pruning example • Keep two bounds along the path
: the best Max can do on the path
: the best (smallest) Min can do on the path
• If a max node exceeds , it is pruned.
• If a min node goes below , it is pruned.
[Example from James Skrentny]
O β=-3
W -3
B β=-5
N 4
F α=4
G -5
X -5
E β=2
D 0
C β=3
R 0
P 9
Q -6
S 3
T 5
U -7
V -9
K α=5
M H 3
I 8
J α=9
L 2
A α=
max
min
max
min X -5
A α=3
![Page 53: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/53.jpg)
slide 53
How effective is alpha-beta pruning?
• Depends on the order of successors!
• In the best case, the number of nodes to search is O(bm/2), the
square root of minimax’s cost.
• This occurs when each player's best move is the leftmost child.
• In DeepBlue (IBM Chess), the average branching factor was
about 6 with alpha-beta instead of 35-40 without.
• The worst case is no pruning at all.
E β=2
S 3
T 5
U 4
V 3
K α=5
M L 2
E β=2
S 3
T 5
U 4
V 3
K α=5
M α=4
L 2
E β=2
S 3
T 5
U 4
V 3
K M L
2
A α=3
A α=3
A α=3
![Page 54: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/54.jpg)
slide 54
Game-playing for large games
• We’ve seen how to find game theoretic values. But it is too expensive for large games.
• What do real chess-playing programs do?
They can’t possibly search the full game tree
They must respond in limited time
They can’t pre-compute a solution
![Page 55: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/55.jpg)
slide 55
Game-playing for large games
• The most popular solution: heuristic evaluation functions for games
‘Leaves’ are intermediate nodes at a depth cutoff,
not terminals
Heuristically estimate their values
Huge amount of knowledge engineering (R&N 6.4)
Example: Tic-Tac-Toe:
(number of 3-lengths open for me)-(number of 3-lengths open for you)
• Each move is a new depth-cutoff game-tree search
(as opposed to search the complete game-tree
once).
• Depth-cutoff can increase using iterative deepening,
as long as there is time left.
![Page 56: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/56.jpg)
slide 56
More on large games
• Battle the limited search depth
Horizon effect: things can suddenly get much
worse just outside your search depth (‘horizon’),
but you can’t see that
Quiescence / secondary search: select the most
‘interesting’ nodes at the search boundary, expand
them further beyond the search depth
• Incorporate book moves
Pre-compute / record opening moves, end games
![Page 57: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/57.jpg)
slide 57
• There is an element of chance (coin flip, dice roll, etc.)
• “Chance node” in game tree, besides Max and Min nodes.
Neither player makes a choice. Instead a random choice
is made according to the outcome probabilities.
Two-player zero-sum discrete finite NONdeterministic games of perfect information
Max
chance
Min Min
-20 +4
Min
chance
+3
Max
+10
Max
-5
Max
p=0.8 p=0.2
p=0.5 p=0.5
![Page 58: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/58.jpg)
slide 58
Solving non-deterministic games
• Easy to extend minimax to non-deterministic games
• At chance node, instead of using max() or min(),
compute the average (weighted by the probabilities).
• What’s the value for the chance node at right?
• What action should Max take at root?
• The play will be optimal. In what sense?
Max
chance
Min Min
-20 +4
Min
chance
+3
Max
+10
Max
-5
Max
p=0.8 p=0.2
p=0.5 p=0.5
4*0.5+(-20)*0.5 = -8
![Page 59: Game Playingpages.cs.wisc.edu › ~jerryzhu › cs540 › handouts › games.pdf · games of perfect information Definitions: • Zero-sum: one player’s gain is the other player’s](https://reader034.vdocuments.mx/reader034/viewer/2022042316/5f04948d7e708231d40eacaf/html5/thumbnails/59.jpg)
slide 59
What you should know
• What is a two-player zero-sum discrete finite deterministic game of perfect information
• What is a game tree
• What is the minimax value of a game
• Minimax search
• Alpha-beta pruning
• Basic understanding of very large games
• How to extend minimax to non-deterministic games