look ahead cec2011

THE IMPORTANCE OF A LOOK-AHEAD DEPTH TO EVOLUTIONARY CHECKERS. Belal Al-Khateeb and Graham Kendall, School of Computer Science (ASAP Group), University of Nottingham

Upload: belal-al-khateeb

Post on 06-Jul-2015


Page 1: Look ahead cec2011

THE IMPORTANCE OF A LOOK-AHEAD DEPTH TO EVOLUTIONARY CHECKERS

Belal Al-Khateeb, Graham Kendall

School of Computer Science (ASAP Group)

University of Nottingham

Page 2: Look ahead cec2011

Outline

- Introduction
- Checkers
- Samuel’s Checkers Program
- Previous Work
- Experimental Setup
- Results and Discussion
- Conclusions

Page 3: Look ahead cec2011

Checkers

[Figure: Opening board of checkers]

Page 4: Look ahead cec2011

Checkers

[Figure: Black is forced to make a jump move]

Page 5: Look ahead cec2011

Checkers

[Figure: Black gets a king]

Page 6: Look ahead cec2011

Samuel’s Checkers Program

- In 1959, Arthur Samuel started to look at checkers.
- The weights were determined through self-play.
- 39 features.
- Included look-ahead via minimax (alpha-beta).
- Defeated Robert Nealey.
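The look-ahead mentioned above can be illustrated with a minimal sketch of depth-limited minimax with alpha-beta pruning. This is not Samuel's code: the `children` and `evaluate` callbacks and the toy tree are hypothetical stand-ins for a move generator and a weighted-feature evaluation.

```python
import math


def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax value of `node` with alpha-beta pruning.

    `children(node)` returns successor positions and `evaluate(node)` scores
    a position; both are assumptions made for this sketch.
    """
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:  # the opponent would never allow this line
                break
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value


# Toy two-ply tree: inner lists are positions, integers are leaf scores.
toy_tree = [[3, 5], [2, 9], [0, 1]]
toy_children = lambda n: n if isinstance(n, list) else []
toy_evaluate = lambda n: n
best = alphabeta(toy_tree, 2, -math.inf, math.inf, True, toy_children, toy_evaluate)
```

In the toy tree, the maximizing side picks the branch whose worst-case leaf is largest. The second and third branches are cut off as soon as a leaf no better than the first branch's minimum is seen.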

Page 7: Look ahead cec2011

How good was Samuel’s player?

o Samuel’s program defeated Robert Nealey, although the victory is surrounded in controversy.
o Did he lose the game or did Samuel win?

Page 8: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, just about to make move 16.]

Page 9: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program. Annotation: Forced jump.]

Page 10: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program.]

Page 11: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program. Annotations: "Strong (try to keep)"; "Trapped"; "Only advance to square 28".]

Page 12: Look ahead cec2011

How good was Samuel’s player?

What move would you make?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program. Highlighted squares: 20, 21, 22, 26, 32.]

Page 13: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program.]

Page 14: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program.]

o This was a very poor move.
o It allowed Samuel to retain his "Triangle of Oreo".
o And by moving his checker from 19 to 24, it guaranteed Samuel a king.

Page 15: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23.]

Page 17: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23.]

16-12 then 5-1: Chinook said this would be a draw.

Page 23: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23. Annotation: This checker is lost.]

Page 25: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23 and a second king on the board. Annotation: This checker could run (to 10) but..]

Page 26: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23 and a second king on the board.]

Page 28: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23 and a second king on the board. Annotation: Forced jump.]

Page 31: Look ahead cec2011

How good was Samuel’s player?

[Board diagram, squares numbered 1 to 32. White: Nealey. Red: Samuel’s program, after move 25, with a king on square 23. Annotation: Victory.]

Page 32: Look ahead cec2011

How good was Samuel’s player?

Two mistakes by Nealey:

o Allowing Samuel to get a king.
o Playing a move that led to defeat when there was a draw available.

Page 33: Look ahead cec2011

How good was Samuel’s player?

o The next year, a six-game rematch was won by Nealey 5-1.
o Three years later (1966), the two world championship challengers (Walter Hellman and Derek Oldbury) each played four games against Samuel’s program. They won every game.

Page 34: Look ahead cec2011

Blondie24

- Produced by Fogel and Chellapilla in 1999-2000.
- Neural network as an evaluation function.
- Values for input nodes:
    Red (Black): positive
    White: negative
    Empty: zero
- Piece differential.
- Subsections (sub-boards).
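The input encoding listed above can be sketched as follows. The piece symbols, the king value of 2.0, reading the piece differential as the sum of the inputs, and the choice of 3x3 up to 8x8 overlapping windows as sub-boards are assumptions made for illustration; the slide gives only the sign convention.

```python
def encode_board(squares, king_value=2.0):
    """Map the 32 playable squares to signed inputs.

    Red pieces are positive, white pieces negative, empty squares zero.
    The symbols and `king_value` are hypothetical choices for this sketch.
    """
    values = {"r": 1.0, "R": king_value, "w": -1.0, "W": -king_value, ".": 0.0}
    return [values[s] for s in squares]


def piece_differential(inputs):
    """Material balance from red's point of view (assumed: sum of inputs)."""
    return sum(inputs)


def subboard_windows(board_size=8, smallest=3):
    """Enumerate every square window (row, col, size) on the board."""
    windows = []
    for size in range(smallest, board_size + 1):
        for row in range(board_size - size + 1):
            for col in range(board_size - size + 1):
                windows.append((row, col, size))
    return windows


# Hypothetical opening layout: 12 red men, 8 empty squares, 12 white men.
opening = ["r"] * 12 + ["."] * 8 + ["w"] * 12
inputs = encode_board(opening)
```

Under these assumptions the opening position has a piece differential of zero, and the window enumeration yields 36 + 25 + 16 + 9 + 4 + 1 = 91 sub-boards.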

Page 35: Look ahead cec2011

Blondie24

[Figure: Blondie24’s EANN architecture]

Page 36: Look ahead cec2011

Blondie24

- Initial population of 30 neural networks (players).
- Each neural network plays 5 games (as red) against 5 randomly chosen players:
    +1 for a win
    0 for a draw
    -2 for a loss
- The best 15 players are retained; the other 15 players are eliminated.
- Copy the best 15 players (replacing the worst 15) and mutate the weights.
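One generation of the procedure above might look like the sketch below. The `play_game` and `mutate` callbacks are hypothetical, opponents are sampled without replacement, and only the red player is scored; the slide does not specify these details.

```python
import random

WIN, DRAW, LOSS = 1, 0, -2  # payoffs listed on the slide


def evolve_generation(population, play_game, mutate, rng, games_per_player=5):
    """Score every player, keep the best half, refill with mutated copies.

    `play_game(red, white)` is assumed to return +1 if red wins, 0 for a
    draw and -1 if red loses; `mutate(player, rng)` returns a changed copy.
    """
    payoff = {1: WIN, 0: DRAW, -1: LOSS}
    scores = [0] * len(population)
    for i, player in enumerate(population):
        opponents = [j for j in range(len(population)) if j != i]
        for j in rng.sample(opponents, games_per_player):
            scores[i] += payoff[play_game(player, population[j])]
    ranked = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    survivors = [population[i] for i in ranked[: len(population) // 2]]
    offspring = [mutate(parent, rng) for parent in survivors]
    return survivors + offspring


# Toy demonstration: a "player" is a number and the larger number wins.
demo_rng = random.Random(0)
demo_population = [float(i) for i in range(30)]
demo_play = lambda red, white: (red > white) - (red < white)
demo_mutate = lambda player, rng: player + 1000.0
next_population = evolve_generation(demo_population, demo_play, demo_mutate, demo_rng)
```

Repeating this step for the stated number of generations and keeping the best final player gives the overall loop. In a real run the players would be neural network weight vectors and `play_game` a full checkers game.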

Page 37: Look ahead cec2011

Blondie24: End Product

- Repeat the process for 840 generations; the best player after these generations is retained.
- Played 165 games at zone.com.
- Rating: 2045.85 at that time.
- In the top 500 of over 120,000 players on zone.com at that time.
- Better than 99.61% of registered players on zone.com.

Page 38: Look ahead cec2011

Blondie24

- There has been a lot of discussion about the importance of the look-ahead depth.
- It is believed to be important, and many people state it, but we wanted to investigate.
- Fogel, in his work on evolving Blondie24, said that "At four ply, there really isn’t any 'deep' search beyond what a novice could do with a paper and pencil if he or she wanted to".

Page 39: Look ahead cec2011

Blondie24

- Generating four ply depth using a paper and pencil:
    - Not an easy task for novices.
    - Time consuming.
    - It might be done at some subconscious level, where pruning is taking place.
    - Has not been reported in the scientific literature.
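A rough count suggests why a four-ply search on paper is laborious. The sketch below assumes a uniform branching factor of 7, which is the number of legal moves in the opening position; real positions vary and forced jumps reduce the choice. The best-case alpha-beta figure uses the standard count for a uniform tree with perfect move ordering.

```python
def minimax_leaves(branching, depth):
    """Leaf positions examined by plain minimax on a uniform tree."""
    return branching ** depth


def alphabeta_best_case_leaves(branching, depth):
    """Leaf positions examined by alpha-beta with perfect move ordering."""
    return branching ** ((depth + 1) // 2) + branching ** (depth // 2) - 1


ASSUMED_BRANCHING = 7  # assumption: the opening move count, held constant
full_four_ply = minimax_leaves(ASSUMED_BRANCHING, 4)
pruned_four_ply = alphabeta_best_case_leaves(ASSUMED_BRANCHING, 4)
```

Under this assumption a full four-ply expansion has 2401 leaf positions, and even ideal pruning leaves 97 to evaluate.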

Page 40: Look ahead cec2011

Previous Work

- Many researchers have shown the importance of the look-ahead depth for computer games.
- None of this work was related to checkers.
- Most of the findings are related to chess:
    - Increasing the depth level will produce superior chess players.

Page 41: Look ahead cec2011

Previous Work

- Runarsson and Jonsson, for Othello:
    - Better playing strategies are found with TD learning and ε-greedy exploration when a lower look-ahead search depth is used during learning and a deeper look-ahead search is used during game play.
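For readers unfamiliar with ε-greedy exploration, a minimal sketch follows. It is a generic illustration, not Runarsson and Jonsson's implementation; the `value` callback is a hypothetical stand-in for the score returned by a shallow look-ahead search.

```python
import random


def epsilon_greedy(moves, value, epsilon, rng):
    """Pick a random move with probability epsilon, else the best-valued one.

    `value(move)` is an assumed callback, for example the result of a
    shallow look-ahead search from the position reached by `move`.
    """
    if rng.random() < epsilon:
        return rng.choice(moves)  # explore
    return max(moves, key=value)  # exploit


demo_rng = random.Random(1)
greedy_choice = epsilon_greedy([1, 5, 3], lambda m: m, 0.0, demo_rng)
exploring_choice = epsilon_greedy([1, 5, 3], lambda m: m, 1.0, demo_rng)
```

With ε set to zero the choice is always the best-valued move; with ε set to one every move is chosen at random.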

Page 42: Look ahead cec2011

Experimental Setup

- To investigate our hypothesis, an evolutionary checkers player was implemented.
- Our implementation has the same structure and architecture that Fogel utilised in Blondie24.
- Four players were evolved:
    - C1 is evolved using one ply depth.
    - C2 is evolved using two ply depth.
    - C3 is evolved using three ply depth.
    - C4 is evolved using four ply depth.

Page 43: Look ahead cec2011

Experimental Setup

- Our previous efforts to enhance Blondie24 introduced a round robin tournament (Al-Khateeb, B. and Kendall, G., Introducing a round robin tournament into Blondie24, CIG 2009: 112-116, 2009).
- We also use this player, Blondie24-RR (evolved using four ply), to investigate the importance of the look-ahead depth.
- Three players were evolved (in addition to Blondie24-RR):
    - Blondie24-RR1Ply is evolved using one ply.
    - Blondie24-RR2Ply is evolved using two ply.
    - Blondie24-RR3Ply is evolved using three ply.

Page 44: Look ahead cec2011

Experimental Setup

- C1, C2, C3 and C4 played against each other using the idea of a two-move ballot, with each player allowed to search to a depth of six ply.
- The games were played until either one side won or a draw was declared after 100 moves for each player.
- The same procedure was also used to play Blondie24-RR1Ply, Blondie24-RR2Ply, Blondie24-RR3Ply and Blondie24-RR against each other.
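The game counts quoted in the results (86 games per pairing, 258 per player) are consistent with the usual two-move ballot arithmetic, sketched below. The figures of 7 first moves, 7 replies and 6 excluded openings are the commonly cited ballot numbers and are assumptions here, since the slides do not state them.

```python
def two_move_ballot_games(first_moves=7, replies=7, excluded=6, colours=2):
    """Games per pairing when every retained opening is played from both sides.

    All default values are assumptions: 7 x 7 possible two-move openings,
    6 of them excluded as too one-sided, each played as red and as white.
    """
    openings = first_moves * replies - excluded
    return openings * colours


games_per_pairing = two_move_ballot_games()
opponents_per_player = 3  # four players in each experiment
games_per_player = games_per_pairing * opponents_per_player
```

This gives 43 openings, 86 games per pairing and 258 games for each of the four players.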

Page 45: Look ahead cec2011

Results and Discussion

Number of wins (for the row player) out of 258 games.

      C1   C2   C3   C4   Σ wins
C1     -   28   17   13       58
C2    33    -   24   19       76
C3    45   31    -   27      103
C4    59   40   35    -      134

Number of draws (for the row player) out of 258 games.

      C1   C2   C3   C4   Σ draws
C1     -   25   24   14        63
C2    25    -   31   27        83
C3    24   31    -   26        91
C4    14   27   26    -        67

Page 46: Look ahead cec2011

Results and Discussion

Standard rating formula for all players after 5000 different orderings of the 86 games played.

Match       Player   Mean      SD      Class
C1 vs C2    C1       1188.94   28.94   E
            C2       1206.24   27.62   D
C1 vs C3    C1       1146.58   27.40   E
            C3       1266.18   26.14   D
C1 vs C4    C1       1264.11   27.21   D
            C4       1474.99   26.14   C
C2 vs C3    C2       1179.47   26.85   E
            C3       1205.10   25.60   D
C2 vs C4    C2       1114.61   27.17   E
            C4       1200.21   25.88   D
C3 vs C4    C3       1176.02   28.26   E
            C4       1205.26   26.98   D

Wins/losses for C1, C2, C3 and C4 (row player) when not using the two-move ballot.

             C2      C3      C4
C1  Red      Lost    Lost    Lost
    White    Drawn   Lost    Lost
C2  Red      -       Lost    Lost
    White    -       Drawn   Lost
C3  Red      -       -       Lost
    White    -       -       Lost
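A rating procedure of the kind named above can be sketched as an Elo-style update averaged over random orderings of the games. The K factor of 32, the starting rating of 1200 and the class bands are assumptions made for illustration, so this sketch is not claimed to reproduce the figures in the table.

```python
import random
import statistics


def expected_score(rating, opponent_rating):
    """Expected score of a player under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update(rating, opponent_rating, outcome, k=32.0):
    """New rating after one game; outcome is 1 (win), 0.5 (draw) or 0 (loss)."""
    return rating + k * (outcome - expected_score(rating, opponent_rating))


def rating_class(rating):
    """Assumed class bands: 200-point steps, Class E starting at 1000."""
    bands = [(2000, "Expert or above"), (1800, "A"), (1600, "B"),
             (1400, "C"), (1200, "D"), (1000, "E")]
    for floor, label in bands:
        if rating >= floor:
            return label
    return "below E"


def rate_over_orderings(outcomes, n_orderings=5000, start=1200.0, rng=None):
    """Mean and standard deviation of both final ratings over shuffled orders.

    `outcomes` holds the first player's score in each game of the match.
    """
    rng = rng or random.Random()
    games = list(outcomes)
    finals_a, finals_b = [], []
    for _ in range(n_orderings):
        rng.shuffle(games)
        a = b = start
        for score in games:
            # Both updates use the ratings from before this game.
            a, b = update(a, b, score), update(b, a, 1.0 - score)
        finals_a.append(a)
        finals_b.append(b)
    return ((statistics.mean(finals_a), statistics.stdev(finals_a)),
            (statistics.mean(finals_b), statistics.stdev(finals_b)))


# Example input shaped like the C1 vs C2 pairing: 28 wins, 25 draws, 33 losses.
example_outcomes = [1.0] * 28 + [0.5] * 25 + [0.0] * 33
```

Averaging over many orderings removes the dependence of the final rating on the order in which the 86 games happen to be processed.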

Page 47: Look ahead cec2011

Results and Discussion

Number of wins (for the row player) out of 258 games for the round robin players.

        1ply   2ply   3ply   4ply   Σ wins
1ply       -     28     20     14       62
2ply      32      -     29     21       82
3ply      42     34      -     27      103
4ply      57     46     39      -      142

Number of draws (for the row player) out of 258 games for the round robin players.

        1ply   2ply   3ply   4ply   Σ draws
1ply       -     26     24     15        65
2ply      26      -     23     19        68
3ply      24     23      -     20        67
4ply      15     19     20      -        54

Page 48: Look ahead cec2011

Results and Discussion

Standard rating formula for all players after 5000 different orderings of the 86 games played.

Match           Player   Mean      SD      Class
1Ply vs 2Ply    1Ply     1187.79   28.86   E
                2Ply     1200.74   27.55   D
1Ply vs 3Ply    1Ply     1160.17   28.15   E
                3Ply     1252.67   26.84   D
1Ply vs 4Ply    1Ply     1256.00   27.71   D
                4Ply     1450.51   26.58   C
2Ply vs 3Ply    2Ply     1194.62   29.30   E
                3Ply     1212.04   27.98   D
                2Ply     1335.38   28.72   D

Wins/losses for 1Ply, 2Ply, 3Ply and 4Ply (row player) when not using the two-move ballot.

               2Ply    3Ply    4Ply
1Ply  Red      Lost    Lost    Lost
      White    Lost    Lost    Lost
2Ply  Red      -       Lost    Lost
      White    -       Lost    Lost
3Ply  Red      -       -       Lost
      White    -       -       Lost

Page 49: Look ahead cec2011

Conclusions

- Several evolutionary checkers players were produced, using different ply depths during learning.
- The results indicate that better value functions are learned when training with a deeper look-ahead search.

Page 50: Look ahead cec2011

Conclusions

- Increasing the ply depth increases the computational cost of evolving checkers players. In our experiments, all runs were given the same amount of time (19 days).
- The results suggest that a depth of four ply is the best choice for the learning phase in checkers. That is, train at four ply and then play at the highest ply possible.

Page 51: Look ahead cec2011

References

1- Samuel, A. L., Some studies in machine learning using the game of checkers, 1959, 1967.

2- Fogel, D. B., Blondie24: Playing at the Edge of AI, United States of America, Academic Press, 2002.

3- Chellapilla, K. and Fogel, D. B., Anaconda defeats Hoyle 6-0: A case study competing an evolved checkers program against commercially available software, 2000.

4- Fogel, D. B. and Chellapilla, K., Verifying Anaconda's expert rating by competing against Chinook: experiments in co-evolving a neural checkers player.

5- Chellapilla, K. and Fogel, D. B., Evolution, Neural Networks, Games, and Intelligence, 1999.

6- Chellapilla, K. and Fogel, D. B., Evolving an expert checkers playing program without using human expertise, 2001.

7- Chellapilla, K. and Fogel, D. B., Evolving neural networks to play checkers without relying on expert knowledge, 1999.

8- Runarsson, T. P. and Jonsson, E. O., Effect of look-ahead search depth in learning position evaluation functions for Othello using ε-greedy exploration, 2007.

Page 52: Look ahead cec2011

Questions/Discussions

Thank You
