othello artificial intelligence with machine learning computer systems tjhsst nick sidawy

21
Othello Artificial Intelligence With Machine Learning Computer Systems TJHSST Nick Sidawy

Upload: eunice-price

Post on 30-Dec-2015

228 views

Category:

Documents


0 download

TRANSCRIPT

Othello Artificial IntelligenceWith

Machine Learning

Computer Systems TJHSSTNick Sidawy

Introduction

Machine learning is an extensive filed of study. Most of what is done with machine learning is tied to artificial intelligence which is why Othello seemed to be a good vessel for my research. It is a simple enough game for me to work on, yet difficult enough to keep me working throughout the quarters.

Purpose

The purpose of this research project is to implement machine learning with artificial intelligence. The reason for this is two-fold:

First, to create a very effective Othello AI. Second, and more academically oriented, is to gain a

deeper understanding of machine learning.

Goal Breakdown

The first goal I would like to achieve with this project is to program an effective forward-checker for the AI.

The second goal is to use a genetic algorithm to formulate the best evaluation function for the AI to use.

My last goal is to have the AI learn from each move it makes so that the more it plays, the faster and better it will perform.

Development

The project will be coded in java.I have separated it into 7 different programs:The driver The panel (which controls the game)The AIThe game functions (such as capturing pieces)A class called MyButtonA program to run the Genetic Algorithm A program for the game functions that will be used

in the Genetic Algorithm.A program to the MachineLearning functions.

Construction (First Iteration)

There are two algorithms that the first quarter was focused on:

The forward-checking algorithm (Minimax)

The evaluation function

A MyButton class was also created

MyButton Class

The MyButton class extends Jbutton and has the following variations:

It has a paint component

It can store values which indicate if the spot on the board is occupied by a piece and will draw images accordingly

Forward-Checker

The goal of a forward Checker is to traverse a tree of possible moves and picking the move that will lead to the best scenario down the line. The ply determines how many levels of the tree it goes through.

The trick is that at each level of the three it picks the move that is best for the player it is simulating for. Therefore, the computer assumes the opponent will play perfectly.It is nicknamed the Minimax algorithm for this reason. (See Diagram A)

Evaluation Function

This function returns a number rating how good a particular board is for a player. It does this based on the positions of the pieces and amount of available moves for each player. For example, pieces in the corners are very valuable and will add many more points to the rating than a piece near the center.

Construction (Second Iteration)

Recreating the GUI that the AI would run in so different AI’s and simulations could be used when running the program just once, instead of having to restart it.

Creating a program to run the Genetic Algorithm.

Creating a program that would contain the moves necessary for the Genetic Algorithm to run.

The New GUI

Genetic Algorithm

The values used to evaluate a given Othello board are very important for picking the best move and I hope to use the Genetic Algorithm to find the best values for evaluation.

The main components: The population (A set of different evaluation values) The fitness evaluation (The way of testing how well a

set of evaluation values performs) Splicing and Offspring (The production of a new,

improved set of evaluation values)

Genetic Algorithm (Cont’d)

First, 8 sets of evaluation values are created that will make up the population.

Second, each set is given a score based on how it does with the fitness function, which in this case is a game played against a different, constant set of values.

Third, 8 new sets of values are produced by splicing the 8 sets of values in the original generation.

Splicing The reproduction takes a Darwinian approach. Depending on how well a particular set does against the

fitness function determines the likelihood it will be chosen for reproduction.

Two sets (Set A and B) are chosen based on the probabilities and a crossover point is chosen at random for the sets.

Next, the two sets will splice by taking all the values before the crossover point of Set A and combining them with the values after the crossover point of Set B. Then, the opposite is performed.

Splicing (Cont’d)

This process is done 4 times so the offspring (next generation) is created and can be tested against the fitness function.

There is also a predetermined chance that a mutation may occur. A mutation is when a value on a specific set is put to a random number and helps prevent the evaluation sets from all reaching the same values (plateau).

In the event a plateau is reached and all the sets in a generation are the same, then the fitness function will become the one of the sets of the generation and 8 new sets will be created by random.

In theory this will create better and better evaluation sets over time.

Construction (Third Iteration)

Creating a program (MachineLearning) which does the following:

Saving information from each move the AI makes.

Storing the data in a file so it can be used game after game.

Loading data from the file and putting it into a HashMap for quick access during a game.

Data Collecting Procedure

Games will be run between the AI and a random opponent over and over.

Data from each move made will be stored in several different variations. (For example, the board that is given, a reflection of that board, etc.)

Data Storage

After each move the AI makes, four pieces of data will be stored:

Player's piece color Boardstate (before the move is made) Move chosen EvaluationDuring a game, this data will be stored in a

HashMap with the player's piece color and boardstate as the key and the chosen move and evaluation as the values stored.

Implementation of Data

By storing all this data about moves the AI has chosen, the FowardChecker should be more efficient and, possibly, more effective.

This will prevent the need to traverse through a tree of possible moves if a board situation found anywhere in the tree has been encountered.

It should make it possible to search deeper into a tree of moves by using the stored information only after the ForwardChecker has moved three or four levels deep.

Conclusions

I have learned the importance of the evaluation function. A strong evaluation function has much more weight than a forward-checker with a high-ply.

Collecting loads of data and putting it is useful when you have enough of it, probably, but as far as I can tell I will need thousands of times more than I have for any real improvement to take place.