Neurogammon CJ Bell Matthew Maas Brian Suchland Joe Cartano

Neurogammon - courses.cs.washington.edu/courses/cse473/07au/notes/neuroslides.pdf

Neurogammon

CJ Bell

Matthew Maas

Brian Suchland

Joe Cartano

Backgammon

• A zero-sum board game between two players

• Players roll dice and choose which checkers to move

• Players can also choose to use the doubling cube

Backgammon (continued)

• An excellent candidate for an AI program

• BUT, the game involves a large element of chance

• Traditional search methods are inefficient

• Expert human players rely on judgment, not search

Neurogammon

• Developed by Gerald Tesauro of IBM

• Relies on neural networks instead of search

Implementation

• Six neural networks for six different stages of the game (289-?-?)

• One additional neural network to determine whether to use the doubling cube (best setup: 243-24-9)
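The "243-24-9" figure can be read as layer sizes. The sketch below assumes it means 243 input units, 24 hidden units, and 9 output units in a fully connected feed-forward network. The sigmoid activation, the random initial weights, and the names `make_layer`, `forward`, and `doubling_net` are illustrative assumptions, not details taken from the slides, and the slides do not describe how a board is encoded into the 243 inputs.

```python
import math
import random

# Assumed reading of "243-24-9": input, hidden, and output unit counts.
LAYER_SIZES = (243, 24, 9)


def make_layer(n_in, n_out, rng):
    """Build one fully connected layer with small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(layer, inputs):
    """Apply one layer: weighted sum plus bias, then a sigmoid, per output unit."""
    weights, biases = layer
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden_layer = make_layer(LAYER_SIZES[0], LAYER_SIZES[1], rng)
output_layer = make_layer(LAYER_SIZES[1], LAYER_SIZES[2], rng)


def doubling_net(features):
    """Map a 243-value feature vector to 9 output activations (untrained here)."""
    if len(features) != LAYER_SIZES[0]:
        raise ValueError("expected 243 input features")
    return forward(output_layer, forward(hidden_layer, features))


# A placeholder all-zero input, used only to show the shapes involved.
outputs = doubling_net([0.0] * 243)
print(len(outputs))
```

The weights here are untrained, so the output values carry no meaning; the point is only the shape of the computation.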

Training

• Input: initial board position and the transition to the next position

• The first six networks were trained on a set of expert games, where each move was rated from -100 (worst) to 100 (best)
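One way to picture how a network trained this way would be used is to score every candidate transition from the current position and play the highest-rated one. The sketch below makes that assumption. The position encoding (a tuple of checker counts per point) and the `toy_score` function are hypothetical stand-ins for the trained network, chosen only so the example runs.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical encoding: number of the player's checkers on each point.
Position = Tuple[int, ...]


def choose_move(
    before: Position,
    candidates: Sequence[Position],
    score_move: Callable[[Position, Position], float],
) -> Position:
    """Return the candidate next position with the highest move rating."""
    return max(candidates, key=lambda after: score_move(before, after))


def toy_score(before: Position, after: Position) -> float:
    """Stand-in for the trained network, on the slide's -100..100 scale.

    It simply penalises points holding a single checker; the real
    networks learned their ratings from expert-rated moves instead.
    """
    blots = sum(1 for count in after if count == 1)
    return max(-100.0, 100.0 - 40.0 * blots)


before = (2, 1, 1, 0)
candidates = [(1, 1, 2, 0), (2, 0, 2, 0)]
best = choose_move(before, candidates, toy_score)
print(best)
```

Rating a transition, as opposed to a position on its own, matches the slide's description of the input as a starting position plus the move to the next position.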

Training (continued)

• The seventh network was trained on a separate set of expert games

• 3000 positions covering 64 games (225 set aside for testing)

• Each position was categorized from 1 to 9 by an expert, indicating whether it was a good time to use the doubling cube

• The 9 outputs were summed
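The numbers on this slide can be worked through in a few lines. The sketch assumes the 225 test positions are part of the 3000 total (the slide does not say whether they are counted separately). The example output vector is made up, and the slides do not state how the summed value was turned into a doubling decision, so no threshold is shown.

```python
TOTAL_POSITIONS = 3000
TEST_POSITIONS = 225  # assumed to be a subset of the 3000

train_positions = TOTAL_POSITIONS - TEST_POSITIONS
test_fraction = TEST_POSITIONS / TOTAL_POSITIONS


def summed_output(outputs):
    """Collapse the nine output activations into a single scalar by summing."""
    if len(outputs) != 9:
        raise ValueError("expected 9 output activations")
    return sum(outputs)


# Made-up activations, for illustration only.
example_outputs = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1, 0.0, 0.0, 0.0]

print(train_positions, test_fraction)
print(summed_output(example_outputs))
```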

First Computer Olympiad

• Held in 1989

• Pitted the six premier computer backgammon programs of the time against each other in a round-robin tournament

• The first serious test of Neurogammon’s abilities

• All five other programs relied on traditional, human-defined board evaluations

Results of the First Computer Olympiad

COMPUTER OPPONENT      RESULTS (FIRST TO 11 POINTS)
Saitek Backgammon      12-9, won by Neurogammon
Mephisto Backgammon    12-5, won by Neurogammon
Backbrain              11-4, won by Neurogammon
AI Backgammon          16-1, won by Neurogammon
Video Gammon           12-7, won by Neurogammon
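As a quick check on the table, the match scores can be totalled. The snippet only restates the figures from the slide; the variable names are illustrative.

```python
# (opponent, Neurogammon's points, opponent's points), copied from the results table.
results = [
    ("Saitek Backgammon", 12, 9),
    ("Mephisto Backgammon", 12, 5),
    ("Backbrain", 11, 4),
    ("AI Backgammon", 16, 1),
    ("Video Gammon", 12, 7),
]

neurogammon_points = sum(ours for _, ours, _ in results)
opponent_points = sum(theirs for _, _, theirs in results)
matches_won = sum(1 for _, ours, theirs in results if ours > theirs)

print(matches_won, neurogammon_points, opponent_points)
```

Neurogammon won all five matches, 63 points to 26 in aggregate.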

TD-Gammon

Version          Training Games   Opponents                Results
TD-Gammon 0.0    300,000          Computer Programs        Tied for Best
TD-Gammon 1.0    300,000          Various Human Experts    -13 Points / 51 Games
TD-Gammon 2.0    800,000          Various Human Experts    -7 Points / 38 Games
TD-Gammon 2.1    1,500,000        Robertie (Grandmaster)   -1 Point / 40 Games
TD-Gammon 3.0    1,500,000        Kazaros (Grandmaster)    +6 Points / 20 Games
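Because each version played a different number of games, the results are easier to compare as points per game. The sketch below derives that rate from the table's figures for the four versions tested against human players; the variable names are illustrative.

```python
# (version, net points, games played), copied from the table above.
records = [
    ("TD-Gammon 1.0", -13, 51),
    ("TD-Gammon 2.0", -7, 38),
    ("TD-Gammon 2.1", -1, 40),
    ("TD-Gammon 3.0", 6, 20),
]

points_per_game = {name: points / games for name, points, games in records}

for name, rate in points_per_game.items():
    print(f"{name}: {rate:+.3f} points per game")
```

On these figures the rate improves with every version, from about -0.25 points per game for version 1.0 to +0.30 for version 3.0, though the sample sizes are small.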

TD-Gammon is Used to Reevaluate Board Positions

White has just rolled two 4’s, giving it 4 moves of 4 spaces each

The Traditional Move

The traditionally accepted move in this situation is 8-4, 8-4, 11-7, 11-7

TD-Gammon’s Move

TD-Gammon’s move in this situation is 8-4, 8-4, 21-17, 21-17
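The two plays can be compared mechanically. The sketch below parses the move notation from the slides and checks that each play uses the double 4s as four moves of four pips. It assumes each entry has the form start-end, with White moving from higher-numbered points to lower ones, as the notation implies. It checks pip counts only, not whether the moves are legal on the actual board, and the helper names are hypothetical.

```python
from collections import Counter


def parse_moves(text):
    """Turn '8-4, 8-4, 11-7, 11-7' into a list of (start, end) point pairs."""
    moves = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        moves.append((int(start), int(end)))
    return moves


def uses_doubles_fully(moves, die):
    """True if there are exactly four moves and each covers `die` pips."""
    return len(moves) == 4 and all(start - end == die for start, end in moves)


traditional = parse_moves("8-4, 8-4, 11-7, 11-7")
td_gammon = parse_moves("8-4, 8-4, 21-17, 21-17")

# Moves present in one play but not the other (counting repeats).
only_traditional = Counter(traditional) - Counter(td_gammon)
only_td_gammon = Counter(td_gammon) - Counter(traditional)

print(uses_doubles_fully(traditional, 4), uses_doubles_fully(td_gammon, 4))
print(dict(only_traditional), dict(only_td_gammon))
```

Both plays make the 4 point with two checkers from the 8 point. They differ only in the other two checkers: the traditional play moves them 11-7, while TD-Gammon moves them 21-17.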