game playing - github pages · slide 1 game playing. part 2 alpha-beta pruning. yingyu liang....
TRANSCRIPT
slide 1
Game PlayingPart 2 Alpha-Beta Pruning
Yingyu [email protected]
Computer Sciences DepartmentUniversity of Wisconsin, Madison
[based on slides from A. Moore, C. Dyer, J. Skrentny, Jerry Zhu]
slide 2
S
A
C200
D100
B
E120
F20
max
min
max
min
G
H150
I100
Review: Game Theoretic Value
150
20100
100
The bottom-up approach: needs the whole game tree. Too much space! The minimax algorithm: a variant of DFS to save space.
slide 3
Review: Minimax algorithmfunction Max-Value(s)inputs:
s: current state in game, Max about to playoutput: best-score (for Max) available from s
if ( s is a terminal state )then return ( terminal value of s )else
α := – ∞for each s’ in Succ(s)
α := max( α , Min-value(s’))return α
function Min-Value(s)output: best-score (for Min) available from s
if ( s is a terminal state )then return ( terminal value of s)else
β := ∞for each s’ in Succs(s)
β := min( β , Max-value(s’))return β
• Time complexity?O(bm) bad
• Space complexity? O(bm)
slide 4
Minimax algorithm in execution
S
A
C200
D100
B
E120
F20
max
min
max
min
G
H150
I100
α=-∞
slide 5
Minimax algorithm in execution
S
A
C200
D100
B
E120
F20
max
min
max
min
G
α=-∞
β=+∞
H150
I100
slide 6
Minimax algorithm in execution
S
A
C200
D100
B
E120
F20
max
min
max
min
G
α=-∞
β=200
H150
I100
The execution on the terminal nodes is omitted.
slide 7
Minimax algorithm in execution
S
A100
C200
D100
B
E120
F20
max
min
max
min
G
α=-∞
β=100
H150
I100
slide 8
Minimax algorithm in execution
S
A100
C200
D100
B
E120
F20
max
min
max
min
G
α=100
β=100
H150
I100
slide 9
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
α=100
β=+∞A100
C200
D100
H150
I100
slide 10
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=120A100
C200
D100
α=100
H150
I100
slide 11
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=20A100
C200
D100
α=100
H150
I100
slide 12
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=20A100
C200
D100
α=100
H150
I100
α=-∞
slide 13
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=20A100
C200
D100
α=100
H150
I100
α=150
slide 14
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=20A100
C200
D100
α=100
H150
I100
α=150
slide 15
Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G150
β=20A100
C200
D100
α=100
H150
I100
slide 16
Minimax algorithm in execution
S
B20
E120
F20
max
min
max
min
G150
A100
C200
D100
α=100
H150
I100
slide 17
alpha-beta pruning
Gives the same game theoretic values as minimax, but prunes part of the game tree.
"If you have an idea that is surely bad, don't take the time to see how truly awful it is." -- Pat Winston
slide 18
Recall: Minimax algorithm in execution
S
B
E120
F20
max
min
max
min
G
β=20A100
C200
D100
α=100
The subtree below G is omitted
No matter what the subtree of G looks like, B’s value <=20
slide 19
Recall: Minimax algorithm in execution
S
E120
F20
max
min
max
min
G
A100
C200
D100
α=100
The subtree below G is omitted
B<=20
slide 20
Recall: Minimax algorithm in execution
S
E120
F20
max
min
max
min
G
A100
C200
D100
α=100
The subtree below G is omitted
Max on S must choose A rather than B.B
<=20
slide 21
Recall: Minimax algorithm in execution
S
E120
F20
max
min
max
min
G
A100
C200
D100
α=100
The subtree below G is omitted
No need to check G!
B<=20
X
slide 22
Alpha-Beta Motivation Summary
S
A100
C200
D100
B
E120
F20
max
min
• Depth-first order• After returning from A, Max can get at least 100 at S• After returning from F, Max can get at most 20 at B• At this point, Max losts interest in B• There is no need to explore G. The subtree at G is
pruned. Saves time.
GX
slide 23
Alpha-beta pruning example 1
S
A
C200
D100
B
E120
F20
max
min
G
α=-∞β=+∞
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned. Case 1: happens on a Min node Case 2: happens on a Max node
The subtree below G is omitted
slide 24
Alpha-beta pruning example 1
S
A
C200
D100
B
E120
F20
max
min
G
α=-∞β=+∞
α=-∞β=+∞
slide 25
Alpha-beta pruning example 1
S
A
C200
D100
B
E120
F20
max
min
G
α=-∞β=+∞
α=-∞β=200
slide 26
Alpha-beta pruning example 1
S
A100
C200
D100
B
E120
F20
max
min
G
α=-∞β=+∞
α=-∞β=100
slide 27
Alpha-beta pruning example 1
S
A100
C200
D100
B
E120
F20
max
min
G
α=100β=+∞
α=-∞β=100
slide 28
Alpha-beta pruning example 1
S
B
E120
F20
max
min
G
α=100β=+∞
α=100β=+∞
A100
C200
D100
slide 29
Alpha-beta pruning example 1
S
B
E120
F20
max
min
G
α=100β=120
A100
C200
D100
α=100β=+∞
slide 30
Alpha-beta pruning example 1
S
B
E120
F20
max
min
G
α=100β=20
X
A100
C200
D100
α=100β=+∞
slide 31
Alpha-beta pruning example 2
A
B
D20
E-10
F-20
G25
max
min
max
H30
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned. Case 1: on a Min node; Case 2: on a Max node
Sα=-∞β=+∞
C
slide 32
Alpha-beta pruning example 2
A
B
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=-∞β=+∞
α=-∞β=+∞
C
H30
slide 33
Alpha-beta pruning example 2
A
B
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=-∞β=+∞
α=-∞β=+∞
α=-∞β=+∞
C
H30
slide 34
Alpha-beta pruning example 2
A
B
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=20β=+∞
α=-∞β=+∞
α=-∞β=+∞
C
H30
slide 35
Alpha-beta pruning example 2
A
B20
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=20β=+∞
α=-∞β=+∞
α=-∞β=+∞
C
H30
slide 36
Alpha-beta pruning example 2
A
B20
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=-∞β=20
α=-∞β=+∞
C
H30
slide 37
Alpha-beta pruning example 2
A
B20
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=-∞β=20
α=-∞β=20
α=-∞β=+∞
C
H30
slide 38
Alpha-beta pruning example 2
A
B20
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
α=-20β=20
α=-∞β=20
α=-∞β=+∞
C
H30
slide 39
Alpha-beta pruning example 2
A
B20
D20
E-10
F-20
G25
max
min
max
• Keep two bounds along the path α: the best Max can do β: the best (smallest) Min can do
• If at anytime α exceeds β, the remaining children are pruned.
S
X
α=25β=20C
H30
slide 40
Alpha-beta pruningfunction Max-Value (s,α,β)inputs:
s: current state in game, Max about to playα: best score (highest) for Max along path to sβ: best score (lowest) for Min along path to s
output: min(β , best-score (for Max) available from s)if ( s is a terminal state )then return ( terminal value of s )else for each s’ in Succ(s)α := max( α , Min-value(s’,α,β))if ( α ≥ β ) then return β /* alpha pruning */
return αfunction Min-Value(s,α,β)output: max(α , best-score (for Min) available from s )
if ( s is a terminal state )then return ( terminal value of s)else for each s’ in Succs(s)β := min( β , Max-value(s’,α,β))if (α ≥ β ) then return α /* beta pruning */
return β
Starting from the root:Max-Value(root, -∞, +∞)
slide 41
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If at anytime α exceeds β, the remaining children are pruned.
AAAα=-
O
W-3
B
N4
F G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
max– = – ∞+ = + ∞
[Example from James Skrentny]
slide 42
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
BBBβ=+
O
W-3
N4
F G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
[Example from James Skrentny]
slide 43
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
FFFα=-
O
W-3
Bβ=+
N4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
[Example from James Skrentny]
slide 44
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
O
W-3
Bβ=+
N4
Fα=-
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
N4
min
max
slide 45
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Fα=-
Fα=4
O
W-3
Bβ=+
N4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
α updated
slide 46
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
OOOβ=+
W-3
Bβ=+
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
min
slide 47
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=+
W-3
Bβ=+
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
K MH3
I8 J L
2
Aα=-
max
min
W-3
min
V-9
slide 48
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=+
Oβ=-3
W-3
Bβ=+
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
min
slide 49
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=+
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
minX-5
Pruned!
slide 50
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Bβ=+
Bβ=4
Oβ=-3
W-3
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
minX-5
slide 51
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=4
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
minX-5
G-5
slide 52
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-
max
min
max
minX-5
G-5
slide 53
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0C
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=-5
max
min
max
minX-5
G-5
slide 54
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
CCCβ=+
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=
max
min
max
minX-5
Aα=-5
slide 55
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=+
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=
max
min
max
minX-5
Aα=-5
H3
slide 56
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Cβ=+
Cβ=3
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=
max
min
max
minX-5
Aα=-5
slide 57
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8 J L
2
Aα=
max
min
max
minX-5
Aα=-5
I8
slide 58
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
JJJα=-5
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8
L2
Aα=
max
min
minX-5
Aα=-5
slide 59
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8
Jα=-5
L2
Aα=
max
min
max
minX-5
Aα=-5
P9
slide 60
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Jα=9
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8
L2
Aα=
max
min
max
minX-5
Aα=-5
Q-6
R0
slide 61
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Jα=-
Jα=9
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8
L2
Aα=
max
min
max
minX-5
Aα=3
Q-6
R0
slide 62
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
ED0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
K MH3
I8
Jα=9
L2
Aα=
max
min
max
minX-5
Aα=3
slide 63
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
Eβ=2
D0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
Kα=5
MH3
I8
Jα=9
L2
Aα=
max
min
max
minX-5
Aα=3
slide 64
Alpha-beta pruning example• Keep two bounds along the path α: the best Max can do on the path β: the best (smallest) Min can do on the path
• If a max node exceeds β, it is pruned.• If a min node goes below α, it is pruned.
[Example from James Skrentny]
Oβ=-3
W-3
Bβ=-5
N4
Fα=4
G-5
X-5
Eβ=2
D0
Cβ=3
R0
P9
Q-6
S3
T5
U-7
V-9
Kα=5
MH3
I8
Jα=9
L2
Aα=
max
min
max
minX-5
Aα=3
slide 65
How effective is alpha-beta pruning?• Depends on the order of successors!
• In the best case, the number of nodes to search is O(bm/2), the square root of minimax’s cost.
• This occurs when each player's best move is the leftmost child. • In DeepBlue (IBM Chess), the average branching factor was
about 6 with alpha-beta instead of 35-40 without.• The worst case is no pruning at all.
Eβ=2
S3
T5
U4
V3
Kα=5
ML2
Eβ=2
S3
T5
U4
V3
Kα=5
M α=4
L2
Eβ=2
S3
T5
U4
V3
K ML2
Aα=3
Aα=3
Aα=3
slide 66
Game-playing for large games• We’ve seen how to find game theoretic values. But it
is too expensive for large games.• What do real chess-playing programs do? They can’t possibly search the full game tree They must respond in limited time They can’t pre-compute a solution
slide 67
Game-playing for large games• The most popular solution: heuristic evaluation
functions for games ‘Leaves’ are intermediate nodes at a depth cutoff,
not terminals Heuristically estimate their values Huge amount of knowledge engineering (R&N 6.4) Example: Tic-Tac-Toe: (number of 3-lengths open for me)-(number of 3-lengths open for you)
• Each move is a new depth-cutoff game-tree search (as opposed to search the complete game-tree once).
slide 68
Prelude• Heuristic estimation of the values of intermediate
nodes can be done via Machine Learning
Game state Machine Learning model Estimated value