Swarm Intelligence in Strategy Games
Auguste Antoine Cunha
Thesis to obtain the Master of Science Degree in
Information Systems and Computer Engineering
Supervisors: Prof. Pedro Alexandre Simões dos Santos, Prof. Carlos António Roque Martinho
Examination Committee
Chairperson: Prof. Mário Jorge Costa Gaspar da Silva
Supervisor: Prof. Pedro Alexandre Simões dos Santos
Member of the Committee: Prof. César Figueiredo Pimentel
May 2015
Acknowledgments
I want to thank my family for supporting me and helping me see reason in finishing this chapter of my
life.
I would also like to thank my coordinators Pedro Santos and Carlos Martinho for the insight they
provided throughout the development of this thesis.
I also want to thank Ivo Capelo and Pedro Engana for helping with the formatting of this thesis,
facilitating my work with LaTeX.
A word of appreciation towards my colleagues at Miniclip for the moral support they have provided.
I’d like to thank the Almansur community for helping me test the work I’ve done with their game.
And finally, a great big thank you to all other people that have been a part of my academic life, as
they were also part of why I reached this far.
Resumo
Desenvolver um jogador inteligente para um jogo não é tarefa fácil. Cada Jogador Artificial Inteligente
é criado especificamente para o seu contexto e, por essa razão, não é facilmente reutilizável. No
entanto, alguns dos desenvolvimentos a mais baixo nível possuem maior significância tanto para jogos
como para outras áreas — como é o caso de algoritmos de procura, optimização de caminhos (pathing),
ou optimização geral.
Neste trabalho, desenhámos e implementámos um algoritmo que combina conceitos de Inteligência
de Enxame (Swarm Intelligence) com os mecanismos de decisão tradicionais utilizados em jogadores
artificiais inteligentes — especificamente aqueles usados em Jogos de Estratégia. O nosso principal
objectivo era, portanto, averiguar a adequação do conhecimento actual em Inteligência de Enxame aos
requisitos do Jogador Artificial Inteligente, seguida do desenvolvimento do algoritmo de teste em si.
O conceito básico passou pelo afastamento da comum solução centralizada e pela aproximação a uma
solução descentralizada, complementada pela aplicação de algumas noções de Inteligência de Enxame
actualmente documentadas. O algoritmo resultante era responsável pelo método de comunicação entre
as unidades de um Jogador Artificial Inteligente. Um Jogador Artificial Inteligente de implementação
centralizada e scriptada foi usado como referência para a nossa solução baseada em Inteligência de
Enxame.
Este trabalho é assim uma tentativa de resposta aos problemas resultantes de Jogadores Artificiais
previsíveis — um problema comum de implementações scriptadas — e de melhoria da sua capacidade
de adaptação — tirando partido do comportamento emergente resultante dos conceitos de Inteligência
de Enxame.
Palavras-chave: Jogador Artificial Inteligente, Jogos de Estratégia, Inteligência de Enxame, Inteligência Descentralizada, Algoritmo, Comunicação, Previsibilidade, Adaptabilidade, Comportamento Emergente
Abstract
Developing an intelligent player for a game is no easy task. Each Artificial Intelligent Player is created
specifically for its context, with very little reusability. However, some lower-level developments have
great significance both in games and in other areas — such as search, pathing, or optimization algorithms.
In this work, we designed and implemented an algorithm that combines Swarm Intelligence concepts
with the traditional decision mechanisms of modern Artificial Intelligent Players — specifically those used
in Strategy Games. Our main objective was to assess the adequacy of current Swarm Intelligence knowledge
to the requirements of an Artificial Intelligent Player, followed by the development of the test algorithm
itself. The basic concept was to move our implementation away from the common centralized solution
toward a decentralized one, complemented by some of the currently documented Swarm Intelligence
notions. The resulting algorithm was responsible in particular for the means of communication between
the units of an Artificial Intelligent Player. A centralized, scripted Artificial Intelligence was used as the
benchmark for our Swarm Intelligence based solution.
This work is thus an attempt to address the problems caused by predictable Artificial Players — a
common issue with scripted implementations — and to improve their adaptability by taking advantage
of the emergent behavior resulting from Swarm Intelligence concepts.
Keywords: Artificial Intelligent Player, Strategy Games, Swarm Intelligence, Decentralized Intelligence, Algorithm, Communication, Predictability, Adaptability, Emergent Behavior
Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Resumo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction 2
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Document outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Related Work 6
2.1 Artificial Intelligence in Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1 Game AI in Commercial Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Industry vs Academy AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.3 AI in Games — in Short . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Artificial Intelligence in Strategy Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Strategy Games — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 Strategy Games and Game AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.3 AI-Player in Strategy Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.4 AI in Strategy Games — in Short . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Implementing an AI-Player . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.1 Centralized vs Decentralized Approach . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.2 Human-Like Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Swarm Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.1 Emergence — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.2 Swarm Intelligence — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.3 Algorithms in Swarm Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.4 Issues of a Swarm Intelligence Approach . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 Artificial Immune System — A Defense Mechanism . . . . . . . . . . . . . . . . . . . . . . 19
2.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Solution 22
3.1 Algorithm Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 Intent — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 Selfish Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.1.3 Negotiation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1.4 Algorithm Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 TBS Game Environment — Almansur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.1 AI in Almansur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Solution Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.1 Algorithm Additional Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.2 Algorithm Implementation and Heuristic Development . . . . . . . . . . . . . . . . 31
3.3.3 Integration Additional Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Testing methodology and data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4 Experimental Results 35
4.1 Static Scenario Test - Duels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.1 Static Scenario Test Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.2 Static Scenario Test Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Dynamic Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2.1 Dynamic Test Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2.2 Dynamic Test Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.3 Summary - Result Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.3.1 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5 Conclusions 47
5.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
A Complete Algorithm 50
B Additional Graphics 52
B.1 Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
B.2 Territory Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . 54
B.3 Territory Owned in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
B.4 Battle Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
B.5 Army Size in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
B.6 Army Power in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Bibliography 66
List of Figures
3.1 Conceptual design of the algorithm per swarm unit . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Almansur — Map example of an historical game . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Almansur — Current AI implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1 Graphic with the evolution of Victory Points for both players in the duel . . . . . . . . . . . 36
4.2 Graphic with the evolution of Territory Victory Points for both players in the duel . . . . . . 37
4.3 Graphic with the evolution of Territories Conquered for both players in the duel . . . . . . 37
4.4 Graphic with the evolution of Battle Victory Points for both players in the duel . . . . . . . 38
4.5 Graphic with the number of reconsiderations and intentions per turn. . . . . . . . . . . . . 38
4.6 Graphic with the average evolution of Victory Points for the three types of players . . . . . 40
4.7 Graphic with the average evolution of Territory Victory Points for the three types of players 41
4.8 Graphic with the average evolution of Battle Victory Points for the three types of players
— lines are affected by battle events, especially visible on symmetric changes. . . . . . . 41
4.9 Graphic with the average evolution of Army Power for the three types of players. . . . . . 42
4.10 Graphic with the average evolution of Army Size for the three types of players. . . . . . . 43
4.11 Graphic with the average number of intentions and reconsiderations per turn — this is an
average of the three AIs in the game. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.12 Correlation between battle victory points and reconsideration count on the first iteration
reconsideration cycle of the algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
B.1 Graphic of the Victory Points evolution throughout all the game for all the players . . . . . 52
B.2 Graphic of the Victory Points evolution throughout all the game for all the AIs . . . . . . . 53
B.3 Graphic of the Territory Victory Points evolution throughout all the game for all the players 54
B.4 Graphic of the Territory Victory Points evolution throughout all the game for all the AIs . . 55
B.5 Graphic of the Territory Owned evolution throughout all the game for all the players . . . . 56
B.6 Graphic of the Territory Owned evolution throughout all the game for all the AIs . . . . . . 57
B.7 Graphic of the Battle Victory Points evolution throughout all the game for all the players . 58
B.8 Graphic of the Battle Victory Points evolution throughout all the game for all the AIs . . . . 59
B.9 Graphic of the Army Size evolution throughout all the game for all the players . . . . . . . 60
B.10 Graphic of the Army Size evolution throughout all the game for all the AIs . . . . . . . . . 61
B.11 Graphic of the Army Power evolution throughout all the game for all the players . . . . . . 62
B.12 Graphic of the Army Power evolution throughout all the game for all the AIs . . . . . . . . 63
Chapter 1
Introduction
1.1 Motivation
Alongside all the technological advancements, we have seen video game environments become
beautiful — with the most recent demonstration of technological prowess by Square Enix 1 — and immersive
— with the appearance of Virtual Reality (VR) and Augmented Reality (AR) headsets and glasses. With
this technology accessible to all, video games are expected to have the best graphics possible.
However, the more beautiful a game is, the more prone players are to notice flaws in other aspects of
the game. The damage these issues cause in game reviews can be much worse than the lack of a
very high graphical level. Linear level design, weak storytelling, and unrealistic Artificial Intelligence (AI)
are examples of commonly noticed flaws — for example, The Order: 1886, a beautiful game of a linear
nature, was not very well received by the critics2.
Despite the worldwide economic crisis and the above-mentioned potential flaws, the video game
industry is still alive and well[1]. In 2012, over 20 billion dollars were spent in the U.S. alone[2]. The
Strategy genre is also one of the most successful, with around 24.9% of all computer games sold in the
U.S. in 2012 being Strategy Games (SG)[2], making any and all improvements worthwhile. Being generally
AI-heavy, SG are an interesting case study for AI enthusiasts and researchers. And seeing how
far the industry has come in the computer-generated graphics department, it is worthwhile to study ways
to improve the quality of games in other areas — potentially improving the revenue of this industry on its
own merit, instead of by looks alone.
Swarm Intelligence (SI) is the sub-field of AI that studies how a group of AI agents may work in a
self-organized and coordinated way, without the use of centralized control. The algorithms developed
in this area are derived from studies of Nature, often of insect colonies (such as ants, bees, wasps, or
termites). Perfected by nature, these algorithms have become an interesting part of AI study, giving new
insights into the way people approach each problem, and even changing the way we perceive 'intelligence'.
1Square Enix's DirectX 12 demo — http://www.eurogamer.net/articles/2015-04-30-square-enix-challenges-the-uncanny-valley-in-directx-12-demo — Gamer Network, Last Accessed on 01 May 2015
2Forbes take on The Order:1886 — http://www.forbes.com/sites/insertcoin/2015/02/19/the-order-1886-is-a-beautiful-failed-experiment-in-cinematic-gaming/ — Forbes, Last Accessed on 01 May 2015
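As an aside, the core mechanism behind the most famous of these insect-derived algorithms — ant colony optimization — can be illustrated in a few lines. The sketch below is purely our own toy example (it is not part of the thesis's algorithm, and the parameter values are arbitrary): ants walk a graph from a start to a goal node, shorter routes deposit proportionally more pheromone, pheromone evaporates each round, and later ants probabilistically favor stronger trails.

```python
import random

def ant_colony_shortest_path(graph, start, goal, n_ants=50, n_iters=30,
                             evaporation=0.5, deposit=1.0):
    """Toy ant-colony search: graph maps node -> list of neighbor nodes."""
    # every directed edge starts with the same pheromone level
    pheromone = {(a, b): 1.0 for a in graph for b in graph[a]}

    def walk():
        path, node = [start], start
        while node != goal:
            options = [n for n in graph[node] if n not in path]
            if not options:
                return None  # dead end; this ant's walk is discarded
            weights = [pheromone[(node, n)] for n in options]
            node = random.choices(options, weights=weights)[0]
            path.append(node)
        return path

    best = None
    for _ in range(n_iters):
        paths = [p for p in (walk() for _ in range(n_ants)) if p]
        for edge in pheromone:
            pheromone[edge] *= (1 - evaporation)  # old trails fade
        for p in paths:
            for a, b in zip(p, p[1:]):
                pheromone[(a, b)] += deposit / len(p)  # shorter paths reinforce more
            if best is None or len(p) < len(best):
                best = p
    return best
```

No ant knows the whole map; the shortest route emerges from local pheromone updates alone — the decentralized coordination described above.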
The use of SI-based algorithms has already begun to help with real-world problems[3]. General
Motors Corp. used an algorithm based on wasps to reduce the idle time of the painting machines at
their Fort Wayne Assembly Plant. Another company, Cemex (Cementos Mexicanos), gave their truck
drivers the power to act on the full information available — through real-time GPS location signals
from every truck and massive telecommunications throughout the company. This allowed them to
improve their on-time delivery rate from 35% to an impressive 98%. Railways (like the Japanese bullet
express trains), similarly to what currently happens on the Internet, use swarm algorithms to direct traffic
and ensure punctuality. In the end, by applying SI appropriately to a business, it was possible to solve
previously known issues, resulting in an increase in revenue.
However, the use of SI in computer games is limited, and usually only applied to under-the-hood
features — uses not directly visible to the player. For this reason, a question is left floating — how can
we take advantage of these field developments in gaming AI?
1.2 Problem Description
Even though we understand that AIs may come in many forms, possibly even as part of the scenery —
a use of AI to improve the immersiveness of the game — we will focus on SG and on AIs that are
responsible for the decisions of an opposing player (as if they were another human player in the same
game). As players become more and more aware of the lack of good AI in some games —
either by seeing constant bad AI decisions or AIs that simply let them win — they are becoming more
demanding in that respect[4]. Our objective will always be to develop AIs capable of entertaining the
player — which can have multiple interpretations or definitions, as different people are entertained by
different things (e.g. seeking a challenge vs. always having to win). In this work we will try to do just that, as
we will develop a new AI to control the units of an army in the multi-player online Turn-Based Strategy
Game Almansur3. For this reason, and as stated before, our study will focus on AIs used in SG,
namely Turn-Based Strategy (TBS) Games.
Going one step further, our goal will be to test the viability of using the current knowledge of Swarm
Intelligence (SI), adapting it to improve the decision process of an AI in Almansur. With this approach,
we will attempt to solve some of the issues that traditional AI faces in a TBS environment. More
concretely, we will develop and apply an algorithm based on our SI research to the military decision process
of the current AI of Almansur[5] — a scripted and purely reactive AI. We expect that the use of an SI-based
AI will improve the quality of the AI itself, improving its adaptability to unpredictable situations and constant
changes in the environment.
3Almansur — http://www.almansur.net/ — Almansur LDA, Last Accessed on 27 November 2013
With this work, we aim to answer the question left in the previous section, bringing some of the values
of SI to a concrete case of video game AI, putting it in charge of the military decisions in a TBS
environment.
As a final note, we'd like to reinforce that our goal is to improve the quality of an existing AI by
applying a purpose-developed SI algorithm to the decision process of said AI. To validate our goal, we will
run multiple scenarios between the two AIs, as well as with real players, to validate our application. We
expect to see a more responsive, more adaptive, more competitive and — generally speaking —
more intelligent AI, followed by an improvement in the Game Experience for the player.
In order to improve its quality, we implemented a new algorithm that combines the knowledge usually
used by AI players with SI concepts to improve the communication between the units of a decentralized
multi-agent system. By applying the concepts of SI to this process, we are able to improve
its general performance and produce optimized solutions for specific problems. This is not without risk,
however, as there are still quite a few uncertainties related to SI in general, which we hope to
overcome.
1.3 Document outline
The remainder of this document is divided into four parts.
The section Related Work will cover the current state of the art when it comes to AI in the gaming
industry. We will introduce the concepts relevant to our work, as well as the most frequently used
techniques when designing an AI to play a game. At the end of the section, a brief discussion will be held,
with the intent of creating bridges between the related work and our intended purposes.
Building up to our proposed solution, we will offer a brief review of the game we will work on —
Almansur — focusing on the aspect most relevant to our work: military planning.
The following section, Solution, will detail the process we followed to develop our SI algorithm, as
well as its application in Almansur. We will then perform an algorithmic complexity analysis of our solution.
In the end, we'll describe the methodology behind the evaluation process / experimental procedure.
Data Analysis will contain the results of our experiments with real players and the old non-swarm
AI. We will analyze the detailed data and discuss the findings' relevance towards our objectives.
In the end, Conclusions will summarize the reasoning behind our decisions, followed by a
connection between findings and objectives. Finally, we will discuss potential future work that could
follow from the results and findings we have gathered.
Chapter 2
Related Work
Nowadays, Artificial Intelligence (AI) can be found virtually anywhere. In order to understand why there
is a need to improve the AI in games, it is relevant to understand how it manifests inside a game. Also,
we will want to know: why use games to test AI theories or algorithms?
Even though some AI uses may be the same across genres — e.g. using path-finding algorithms —
some others are genre-specific or, at the very least, genre-intensive (that is, more used in
one genre than in another).
In this section, we will talk about the many forms AI takes, how it improves game quality, and why we
should be wary of bad executions. As we go further in, we will increase the focus on the Strategy genre
and on the AI uses within it. We will emphasize the AI meant to control a player or opponent, and
lastly, we will introduce the concept of Swarm Intelligence.
2.1 Artificial Intelligence in Games
Computer-generated behaviors can be divided into two components[24] — Game AI and Game Physics.
Respectively, these are responsible for the living and the dead parts of the game. Game AI refers to
entities (human or otherwise) that react to the presence of the player, displaying intelligent or intentional
behavior. On the other hand, Game Physics refers to the multiple aspects that do not have a behavior
derived from intentions, but rather behave the same way in every iteration (given the same starting
conditions) — e.g. falling rocks, gravity, and the flow of a river. In other words, Game Physics is responsible
for the purely causal behaviors in a game.
When we think about it, the AI found in the video game industry can be quite overwhelming — it can
be responsible for multiple behaviors and mechanics, and even some aesthetic aspects (like the flock-like
movement of background birds in a scenery). This means AI can be present in a somewhat
"silent" form, almost unnoticed or taken for granted by the player. Such is the case of the algorithms used
in path-finding, which help the player better navigate his avatar in the virtual game world, and which produce
optimal pathing for a Non-Player Character (NPC). The use of AI can have even more subtle results,
like increasing the engagement factor of a game — e.g. when the AI is responsible for the behavior of
an NPC animal that follows the player around the game map, often making the player more attached to
both the virtual pet and the game. Studies from multiple sources (big game companies, indie studios, and
universities) have also resulted in some progress within the narrative-driven game genre, using
AI to convey their narrative[7] — e.g. the narrator in Bastion1 is one of the most remarkable narrators
in gaming and, more recently, the narrator in The Stanley Parable2 was also very well received by
the gaming community.
2.1.1 Game AI in Commercial Games
In this work, we will focus on Game AI.
Game AI produces the part of a game’s behavior that players can best understand by "read-
ing" the behavior as if it results from the pursuit of goals given some knowledge. Creating a
sense of aliveness (...) the sense that there is an entity living within the computer that has its
own life independently of the player and cares about how the player’s actions impact this life.
(Michael Mateas[24])
Our ambition is always to improve the experience of play. And as we consider Game AI to be
our presence within the game after it is released, the quality of the interaction between the player and
the Game AI must be as polished as possible — thus improving the interaction between Player and
Game. The most usual (and noticeable) Game AI manifestations in games are the supporting role and
the opponent role.
• Supporting Role — when the AI is responsible for aiding the main character (controlled by the
player) in his quest. This may also translate to allies in multi-player games — meaning AI and
player stand on equal footing.
• Opponent Role — when the AI is responsible for (pretending to) block the player's progress. This
may also translate to enemies in multi-player games — as in the supporting role, on equal
footing and with the same abilities as the player.
Each type of AI has its own inherent potential issues — a supporting AI that is more of a hindrance
than an aid is just as bad as an opponent AI that is too predictable (often scripted) or simply impossible to beat
(clearly cheating or otherwise).
1Second Person: Behind Bastion’s Unique Narrative — http://www.1up.com/features/second-person-bastion-narrative —Posted by Agnello, A. on September 2011, Accessed on 27 November 2013
2The Stanley Parable Calls Shenanigans on Narrative-Driven Design — http://www.vg247.com/2013/10/25/the-stanley-parable-calls-shenanigans-on-narrative-driven-design — Posted by Brenna, H. on October 2013, Last Accessed on 27 November 2013
If you do want to talk about poor enemy AI in shooters you have to think about what makes a
fun game. Brilliant AI is no fun to play against, if they keep you suppressed, never miss, and
flank you without warning players feel the game is unfair.3
(Metro.co.uk reader DarKerR)
In the end, no matter how hard it may seem to please the player and implement acceptable AI
behavior, it is not impossible. When this feat is achieved, the results are very impressive, both with respect to
player experience and to game sales — e.g. the development of Elizabeth, the helpful NPC of BioShock
Infinite, is a recent success case when it comes to an AI in a support role4.
2.1.2 Industry vs Academy AI
The AI techniques in commercial games are often simplistic in comparison to the ones developed and
used in academic research or in other industrial applications[32]. This fact should not be understood as
AI in games being poorly done, as we could tell from the previously given examples. Even back in the 2000s,
when there was a big funneling of effort and resources into graphical fidelity, there was a set of well-established
techniques widely used by game developers — e.g. Fuzzy State Machines, the A* path-finding
algorithm, and Craig Reynolds' BOIDS flocking.
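To give a concrete flavor of one of these established techniques, Reynolds' flocking boils down to three local steering rules — cohesion, separation, and alignment. The following is our own minimal 2-D sketch, not code from any of the games or SDKs discussed here; the neighborhood radius and rule weights are arbitrary illustrative values.

```python
def boids_step(positions, velocities, r=2.0, w_coh=0.01, w_sep=0.05, w_ali=0.05):
    """One update of Reynolds' three flocking rules on 2-D points.
    positions/velocities are lists of [x, y]; returns new lists."""
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        # neighbors within radius r (excluding the boid itself)
        nbrs = [j for j, q in enumerate(positions)
                if j != i and (q[0]-p[0])**2 + (q[1]-p[1])**2 < r*r]
        vx, vy = v
        if nbrs:
            # cohesion: steer toward the neighbors' center of mass
            cx = sum(positions[j][0] for j in nbrs) / len(nbrs)
            cy = sum(positions[j][1] for j in nbrs) / len(nbrs)
            vx += w_coh * (cx - p[0]); vy += w_coh * (cy - p[1])
            # separation: steer away from nearby neighbors
            vx += w_sep * sum(p[0] - positions[j][0] for j in nbrs)
            vy += w_sep * sum(p[1] - positions[j][1] for j in nbrs)
            # alignment: match the neighbors' average velocity
            ax = sum(velocities[j][0] for j in nbrs) / len(nbrs)
            ay = sum(velocities[j][1] for j in nbrs) / len(nbrs)
            vx += w_ali * (ax - vx); vy += w_ali * (ay - vy)
        new_vel.append([vx, vy])
        new_pos.append([p[0] + vx, p[1] + vy])
    return new_pos, new_vel
```

Each boid reacts only to its local neighbors, yet a coherent flock emerges — the same decentralized principle that underlies Swarm Intelligence.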
On a side note, we find it relevant to point out that these techniques were (and still are) so useful in the
industry that some thought of creating Software Development Kits (SDKs) with generic
implementations of AI components, with the intention of lowering games' development times[32].
Not many people ended up using these SDKs because of their lack of flexibility — they were not usable
without a great deal of effort from the developer, often requiring specific solutions for each problem to
solve. Ultimately, these SDKs did not solve any problem.
Despite these early issues, nowadays the average household computer has much higher specifications.
This, together with the development and improvement of the Graphics Processing Unit (GPU),
made it possible to allocate a processing unit dedicated to graphics, freeing (some of) the other cores
for general processing. All these advancements make it possible to further develop and use more
demanding AI techniques inside our games, without hurting performance or the experience. Also,
the current graphical level is so high that a game is now expected to have some awesome or innovative
gameplay mechanic, and/or noticeably good AI.
3Gamers have never had it so good — http://metro.co.uk/2015/05/02/gamers-have-never-had-it-so-good-readers-feature-5177285/ — Posted on 02 May 2015 on Metro.co.uk by reader DarKerR, Last Accessed on 02 May 2015
4BioShock Infinite: The Revolutionary AI Behind Elizabeth — http://uk.ign.com/videos/2013/03/01/bioshock-infinite-the-revolutionary-ai-behind-elizabeth — IGN UK, Posted on 1 March 2013, Last Accessed on 27 November 2013
2.1.3 AI in Games — in Short
Even context dependent animation and audio use AI.
(Charles Weddle[15])
To sum up: in gaming, AI is used for anything from improving the aesthetic feel to competing
against the player in some way. Some (most) games even use multiple types of AI, blending them seamlessly.
However, our primary focus is on Strategy Games, especially Turn-Based Strategy Games, where the list of
relevant uses for AI shortens a bit.
2.2 Artificial Intelligence in Strategy Games
Since the dawn of gaming, a clear distinction has been made across genres. At the risk of over-simplifying
things a bit, we can say we have Action, Adventure, Strategy, Simulation, Puzzle, Platform, etc., and all
of these have multiple sub-genres. Our focus of study is the Strategy genre. Within this genre, it is possible
to identify two big sub-genres (again sub-divided into multiple others) — Turn-Based Strategy (TBS) and
Real-Time Strategy (RTS) games. According to Fairclough et al.[32], the main distinguishing point between
the two sub-genres is the time available for planning. RTS, as the name suggests, forces decision making
to happen in real-time, while TBS has a softer (sometimes nonexistent) time constraint. This means that it
is possible to allow a longer period of time for planning in a TBS than in an RTS game, which, in principle,
should mean that a decision in a TBS is the result of a more thorough and careful plan.
2.2.1 Strategy Games — Definition
The main characteristic of Strategy games, both TBS and RTS, is the ability to command. A player may
control one or multiple units through indirect control, only expressing his (or her) desire. The selected
unit will make its way, through the shortest known path, to the designated location and will perform the
action selected by the player at that location. Some games may have a tiled map, which simplifies
the pathing and, in some way, limits the unpredictability of the path-finding algorithms — this is more
common in TBS — while others have a more open field and require stronger algorithms — in opposition,
more common in RTS. There is a set of know-how skills that a player must have to be successful, which
is mostly common to the two sub-genres of Strategy games[22]. Those skills have a direct or
indirect connection to the actions a player can perform within the game, and they all fall back on the
player's ability to command his (her) hero or army. They are:
• Resource Management — refers to the knowledge required to decide which resources to search/produce
and how to spend them in buildings or units.
• Decision Making Under Uncertainty — refers to the knowledge required to perform actions
without absolute certainty of the outcome, e.g. when exploring in fog of war5.
• Spatial and Temporal Reasoning — refers to the knowledge required to understand the nature
of the environment and to perform actions where and when it is most favorable.
• Collaboration — refers to the knowledge required to play while supporting or being supported by
some other player (that may also be an AI).
• Opponent Modeling / Learning — refers to the ability to learn from one game to another, increasing the performance against the same opponent or tactic.
• Adversarial Planning — refers to the knowledge required to predict the future intentions and
actions of an opponent and then planning appropriate responses.
This set of required skills results in a strong need for parallel thinking, taking an enormous amount of detail into account. Each skill, even when considered separately, can easily be seen as a complex case-study. All of them together make it hard to develop an intelligent AI capable of exploring all these factors at the same time.
2.2.2 Strategy Games and Game AI
It is very easy to view RTS and TBS games as a simplified take on military simulations — multiple players (commanders) order their troops to gain access to resources scattered around the map, which in itself sets up the economy required to invest in more units and defeat the opponent. A natural step would be considering the use of these games as a simulated scenario to develop AI concepts and algorithms[22], since a mis-implementation within a game is less harmful than one in a real-world scenario. Taking this notion one step further, the very nature of the RTS sub-genre, and the constraints by which games of this sub-genre are bound, make them an ideal testing ground for real-time decision-making AI agents, systems and algorithms. Analogously, the TBS sub-genre and its relatively time-constraint-free environment make TBS games ideal for testing planning-centered AI agents, systems and algorithms.
The complex environment, associated with the multiple unit and terrain types, and complemented by
the dynamic involvement of all entities within a Strategy game, result in a large variety of AI research
opportunities — often more complex than in other genres.
As a side note, it's important to remember that some AI present in Strategy games is similar to that of other genres. Even in Strategy games, AI can be responsible for the story-telling (adapting to the player's actions), the tutorial (introducing the game at a pace the player can follow), or the occurrence of random events (to keep the player engaged), just to name a few. All of these can be more or less obvious, that is, more or less noticed by the player, and more or less easily identifiable as AI. Other things are even taken for granted, like the pathing (path-finding) algorithms for units in franchises like Age of Empires6 or Starcraft7 — these quality-of-life features are expected to be present in any game with indirect control over units or avatars, and that expectancy makes people forget their real value.
5Definition of "fog of war" — http://en.wikipedia.org/wiki/Fog_of_war — Last Accessed on 7 January 2014
However, the most studied AI field related to Strategy games is that of agents (or even systems) responsible for playing the role of an artificial player — opponent or ally. A central unit, a commander, takes it upon itself to plan the strategy required to defeat its adversaries, giving out orders to its underlings — all the units under its command. This means that an AI responsible for this kind of behavior must take into account all the know-how skills referred to at the end of the previous section, as they are the central point of playing games within this genre — and they are the same for a human player.
2.2.3 AI-Player in Strategy Games
Since very early in gaming, there has been the need for an artificially intelligent player — e.g. single-player strategy games required opponents, or an AI might fill in for a disconnected player (in an online multiplayer game).
In the Strategy genre, an AI-Player is responsible for making the same decisions a human player has to make — commanding the various units at its disposal to achieve its own end goals. Knowledge on how to design an AI-Player went through many stages, and many techniques were tested. In older games, an artificial player that cheated was a very common practice. This could mean, for example, instantly generating the resources or units required to counter another player's attack. While this technique is quite appealing efficiency-wise and presents decent results — that is, the challenge presented to the player is adequate — its flawed execution may cause a feeling of injustice in the player, for ruining the illusion of being challenged by an equally skilled opponent. This means the technique could lead to a negative experience and, therefore, should be avoided.
Developers took a step forward when they started to consider the strategic knowledge required from human players and began to introduce such notions in the AI-Player. However, the development of an AI with strategic knowledge, capable of coherent planning against a human player, is a complex process. This complexity is partially due to the fact that a good AI will not resort to the same strategy every game[32]. A human player would notice the repetition after a few games and would work to counter it, instead of learning from his own mistakes. In short, this would lead to an exploitation of that AI strategy by the player, and it would, ultimately, leave the player bored or uninterested in the game — exactly the opposite of the intended purpose. Such solutions would also fail to adapt to the different player decisions and would eventually fail to perform at the desired level.
6Microsoft Studios ©Age of Empires — http://www.ageofempires.com/ — Last Accessed on 2 January 2014
7Blizzard Entertainment ©Starcraft — http://us.blizzard.com/en-us/%20/games/sc/ — Last Accessed on 2 January 2014
In the AI research field, and for a long time, the ultimate goal has been to produce an AI capable of challenging a human player in the same way another human player would[10]. One of the most common research topics of AI for TBS is the development of AI-Players capable of defeating a Human Player (or another AI-Player) — notice that defeating and challenging are different concepts.
• Defeat — playing to win, leaving no room for mistakes.
• Challenge — playing on par with another, offering an even challenge (or the illusion of it).
The second, challenging, offers the player the opportunity to learn from mistakes and still come out ahead. It is both a learning tool and a companion through each game. The first is a pure adversary, a harsh, hard wall to climb.
The ability to defeat was the direction chosen when developing AIs capable of playing Chess — which, after all, is a TBS. Some AIs even ended up being able to beat human world champions. However, in the end, Potisartra et al.[10] concluded that when developing an AI-Player, we do not always want an AI capable of defeating the human player. For a game to succeed, both commercially and as entertainment, the AI cannot be too hard to beat and, in an optimal scenario, the AI should adapt its skill according to the skill of the person playing. The reason for this is quite simple as well — if an AI is too aggressive or simply too good at the game, the player will be frustrated for not being able to beat it, just as much as he will end up bored if the AI is too soft or too bad at it. In simpler terms, the perfect AI is one that provides an adequate learning environment to the player, as well as a challenging environment once the player has learned "enough".
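This adaptive-skill idea can be sketched as a simple feedback rule that nudges the AI's strength after each game toward an even match. The skill scale, step size, and target win rate below are hypothetical illustration values, not from any cited work:

```python
def adjust_ai_skill(ai_skill, player_won, step=0.05):
    """Nudge a hypothetical AI skill value in [0, 1] after each game.

    If the player won, the AI gets slightly stronger; if the player lost,
    slightly weaker, so the match tends toward an even challenge.
    """
    delta = step if player_won else -step
    return min(1.0, max(0.0, ai_skill + delta))

skill = 0.5
for player_won in [True, True, True, False]:  # player wins three, loses one
    skill = adjust_ai_skill(skill, player_won)
print(round(skill, 2))  # → 0.6
```

In a real game, the update signal could be richer than win/loss (e.g. margin of victory), but the feedback loop stays the same.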
2.2.4 AI in Strategy Games — in Short
Our reasoning up to this point raises different issues and different areas of interest. A perfect AI must adapt itself both to game changes (that is, changes in game state) and to the player's learning curve and current ability. We are also saying that the AI is partially responsible for the player's engagement (or entertainment) and learning, and that it directly changes the environment in which each game takes place — at least, it should feel like a different opponent each game. These areas of interest have been studied separately, but there isn't much common ground between them.
As it is often sufficient from the planning point of view, most of the studies in that area focus on single-agent planning, without considering adversaries who actively try to prevent the agent from achieving its goal or who have goals that may conflict with its own[16] — this is usually done on a play-by-play perspective, limiting the computation required and optimizing for the immediate result. Considering adversarial planning greatly increases the size of the set of possible states, which means greater complexity in the planning and decision algorithms. In the field of adversarial planning, one of the biggest developments was the minimax game tree8 applied to chess and checkers — producing AI systems able to challenge and beat human experts. However, these games have a relatively small branching factor, allowing some algorithms to look far ahead without great detriment to the algorithm's speed, producing winning or beneficial strategies much sooner than would be foreseeable for a human player. In Strategy games the branching of the game tree is far greater, as there are many more factors to take into account: the existence of fog-of-war; more complex movements and actions; multiple adversaries; the possibility of allies; a higher number of units (which can reach thousands); multiple unit types; and others. These are all factors that players have to take into account while playing, which means that an AI that did not take them into account would not serve its purpose well enough.
In their work, Sailer et al.[16] note that, to tackle these problems and improve results, it is common to divide the AI-Player into a set of goal-driven agents. This means that the complex AI would have several components, each responsible for the achievement of a sub-goal, in order to perform more efficiently — Divide-and-Conquer tactics, something that has been known to help throughout the whole Artificial Intelligence field. Resource gathering, scouting, and effective targeting are examples of sub-goals. In the end, this set of agents would need to combine its results to ultimately achieve the original AI-Player's goal. For example, the knowledge gained from scouting the map needs to be passed to the resource manager and to the army manager, in order to further plan their next actions.
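The decomposition described above can be sketched as a toy set of sub-goal managers whose results are combined each turn. The manager names, world representation, and decision rules are illustrative assumptions, not Sailer et al.'s system:

```python
class ScoutManager:
    def report(self, world):
        # Scouting sub-goal: return the enemy units sighted this turn.
        return [u for u in world["units"] if u["owner"] == "enemy"]

class ResourceManager:
    def plan(self, intel):
        # Economy sub-goal: avoid gathering at threatened sites (toy rule).
        threatened = {u["pos"] for u in intel}
        return [p for p in ("mine_a", "mine_b") if p not in threatened]

class ArmyManager:
    def plan(self, intel):
        # Military sub-goal: move toward the first sighted enemy, if any.
        return intel[0]["pos"] if intel else "patrol"

def ai_player_turn(world):
    # The combined result: scouting intel feeds both other managers.
    intel = ScoutManager().report(world)
    return {"gather": ResourceManager().plan(intel),
            "attack": ArmyManager().plan(intel)}

world = {"units": [{"owner": "enemy", "pos": "mine_b"}]}
print(ai_player_turn(world))  # → {'gather': ['mine_a'], 'attack': 'mine_b'}
```

The point is the information flow: each agent solves a smaller problem, and the AI-Player's decision is the composition of their outputs.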
2.3 Implementing an AI-Player
The development of an AI for Strategy Games isn't an easy task. One of the reasons why is their dependence on the underlying game world implementations — which present hard variations from game to game within the genres. This also means that the development of the AI is very dependent on having the game world up and running beforehand.
(Forbus et al.[28])
2.3.1 Centralized vs Decentralized Approach
There are generally two ways of implementing an AI-Player for a Strategy Game — using a centralized
or a decentralized approach.
We have mentioned the concept of a manager, and we have said units hold some intelligence (like the know-how of path-finding). In some way, this means we are dealing with a multi-agent system. Moreover, we are dealing with a centralized multi-agent system, as we have a controlling unit and multiple controlled units. In a centralized approach[30] — the most common in Strategy games — a central unit, usually god-like (unseen and all-seeing), holds most of the knowledge and conveys its intentions to the other units. The units receive said intentions (e.g. attack this, defend that, or move there) and execute the appropriate action to fulfill that intention. In this case, the lower units need to be less 'intelligent' than the controlling unit. One of the main characteristics of the centralized approach is the discrepancy between the levels of intelligence required from each AI. This is the most common approach for AI-Players in Strategy games, since it requires less effort from the developers — they may use the same unit intelligence for all AI-Players and all human players — and it is less complicated. This approach can be seen as a direct interpretation of player interaction, as the controlling unit can be seen as the 'player'.
8Minimax Decision Theory — http://en.wikipedia.org/wiki/Minimax — Last Accessed on 7 January 2014
In opposition, the decentralized approach[30] endows each unit with a more complex thought process, allowing units to perform local planning and to communicate, exchanging requests and intentions at unit level. This more complex thought process is always built on top of the know-how the units already had — such as the previously mentioned path-finding skills and the like.
A correct implementation of a decentralized approach allows quicker reactions to unpredictable localized events. The downside, though, is the ill-suited nature of the decentralized approach for actions that require high coordination — like strategy execution ("I go this way, you go that way"). This means units are able to request aid, or even request the execution of an action from another unit, but due to the nature of this kind of control (or this lack of full control), a unit is free to decide whether or not to help another unit. In order to solve this issue, it is necessary to pair each request with a certain priority value, to help each receiving unit plan its own action. Also, each unit may have a level of altruism/selfishness, making it more/less prone to help its companions.
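The request/priority/altruism mechanism could be reduced to a single decision rule per unit. The weighting below is a hypothetical model invented for illustration, not an algorithm from the cited work:

```python
def accepts_request(priority, own_task_value, altruism):
    """Decide whether a unit honors a companion's aid request.

    altruism in [0, 1] is the weight the unit gives to others' needs
    versus its own current task (hypothetical model).
    """
    return priority * altruism > own_task_value * (1.0 - altruism)

# A selfish unit (altruism 0.2) ignores a medium-priority call for help...
print(accepts_request(priority=5.0, own_task_value=3.0, altruism=0.2))  # → False
# ...while an altruistic unit (0.8) abandons its own task to assist.
print(accepts_request(priority=5.0, own_task_value=3.0, altruism=0.8))  # → True
```

Tuning the altruism parameter per unit is one way to get the heterogeneous, personality-like behavior discussed later in this chapter.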
There can be different levels of centralization/decentralization. There may be multiple agents acting towards the same ultimate goal (that is, winning) but still having different sub-goals to achieve — e.g. dividing the responsibilities of different types of planning among different entities, such as economic, military, or diplomatic. Each objective-driven AI can be centralized or decentralized within the same system.
The advantages/disadvantages of each approach need to be weighed before actually developing an AI-Player, and there is no right answer, as it should be completely intention-dependent — referring back to the intention of challenging vs. defeating the player. Despite the differences between these approaches, they are both still related to the implementation of an AI-Player. It makes sense that, in a game with multiple players (a mix of human and AI players), we would want them all to play with an approximately human-like level of intelligence, since this would create some balance and improve the player experience.
2.3.2 Human-Like Intelligence
There have been many attempts to bring AI closer to a more human-like intelligence. Qualitative Reasoning (QR) may come to aid in that regard. QR is an area that studies ways to transform quantitative information into a qualitative description. This means QR offers the possibility for an AI to understand quantitative information through qualitative notions, representing a stronger link between machine and human knowledge — for example, being able to logically attribute tags to numbers in the right context; e.g. 100 may be high in one context, but low in another, or even mean a small army in a third context. In their paper, Forbus et al.[28] suggest that QR may offer a valuable contribution to the video game industry, limiting the dependency on the internal world/environment implementation. They believe that by using QR systems it is possible to achieve better opponents, advisors and other NPCs. According to the authors, the use of QR may bring potential advantages such as: more human-like behavior; better communication of intent; better path-finding; and more reusable strategy libraries. We dare even say that the use of a QR approach may improve the extensibility of a developed AI.
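The context-dependent tagging that QR enables might look like the following sketch, where the same number maps to different qualitative labels in different contexts. The threshold tables are invented for illustration and are not Forbus et al.'s representation:

```python
def qualify(value, context_thresholds):
    """Map a raw number to a qualitative tag using context-specific
    thresholds, given as a sorted list of (upper_bound, tag) pairs."""
    for upper, tag in context_thresholds:
        if value <= upper:
            return tag
    # Above every bound: fall back to the highest tag.
    return context_thresholds[-1][1]

# The same number reads differently depending on context:
gold_ctx = [(50, "low"), (200, "high")]
army_ctx = [(500, "small army"), (5000, "large army")]
print(qualify(100, gold_ctx))  # → high
print(qualify(100, army_ctx))  # → small army
```

An AI reasoning over tags like "small army" rather than raw counts is one concrete sense in which QR decouples strategy logic from the game world's internal numbers.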
QR may help decision making, but it doesn't fully allow an AI to decide when and how to attack an enemy. The process behind these decisions often lies in either using influence maps or using predefined strategies (usually based on scripted behaviors). Using predefined strategies will not allow much adaptivity, however. The AI-Player may feel and act differently in the first few games, but, with repetition, its strategies will start to seem predictable, ruining the experience. One way to improve on this is to give some personality[28] to the AI-Player, modifying its behavior slightly — e.g. tweaking its levels of aggressiveness. Also, the more strategies there are to choose from, the more unpredictable the AI.
Influence maps[35], however, are a different matter. Influence maps are an abstract representation of the environment in a way an AI can understand. This technique allows the attribution of values to the environment that represent abstract concepts — for example, it is possible to attribute a quantification of how valuable it is to attack, defend, or move to a certain position, improving the subsequent planning process. In contrast to predefined strategies, the use of this technique allows the development of an adaptive AI, as it will react differently according to the opposition it faces. The use of influence maps also allows a better comparison between different possible actions, which is likewise desired.
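A minimal influence map over a tiled grid could look like this: each unit projects influence that decays with distance, and the per-cell sum of friendly minus enemy influence quantifies how favorable each position is. The linear decay with Manhattan distance is an illustrative choice, not the formulation of [35]:

```python
def influence_map(width, height, units):
    """Sum each unit's influence over a small tiled grid.

    units: list of (x, y, strength) tuples; positive strength for friendly
    units, negative for enemies. Influence decays with Manhattan distance.
    """
    grid = [[0.0] * width for _ in range(height)]
    for (ux, uy, strength) in units:
        for y in range(height):
            for x in range(width):
                dist = abs(x - ux) + abs(y - uy)
                grid[y][x] += strength / (1 + dist)
    return grid

# One friendly unit (strength +4) at x=0 and one enemy (-4) at x=2:
grid = influence_map(3, 1, [(0, 0, 4.0), (2, 0, -4.0)])
print([round(v, 2) for v in grid[0]])  # → [2.67, 0.0, -2.67]
```

Comparing cell values then directly supports decisions such as "attack here" (strongly negative cells mark enemy-dominated ground) or "retreat there" (positive cells are safe).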
Finally, the last topic in human-like behavior we would like to discuss is related to Believability. At the risk of over-simplifying the question at hand, it is possible to separate AI believability into two groups[39] — AI that correctly simulates a human player, and AI that correctly acts like the character it is playing. However, in Strategy games, the AI-Player controls not only a player, but also the individual units that a player normally controls. The concept of believability must be extended to support this case. Should an AI be considered believable if the AI-Player uses human-like strategies? Or should it be considered believable if its units perform logical (intelligent) actions after being given orders? Or maybe it should be a mix of both, to different extents? This brings us back to the discussion between centralized and decentralized multi-agent systems, and it is an answer we cannot provide due to its highly subjective nature.
It seems that AI has progressed to the point where it cannot be considered to be a binary
concept. Rather, in practice, the term refers to a spectrum of ideas ranging from a simple
system that can perform only basic tasks to a fully adaptive system that is able to solve highly
complex problems by using techniques that reflect the nature of human intelligence.
(Johnson et al.[33])
More than reflecting the nature of human intelligence, AI began to reflect a larger scope of Natural Intelligence. Although there were impressive achievements when trying to imitate human intelligence, studies have started to diverge towards other manifestations of natural intelligence. Some of those studies gave rise to new AI fields with 'nature-based' theories as their background, such as Swarm Intelligence.
2.4 Swarm Intelligence
It is a well-known fact that Man has learned a lot from studying natural systems. This is also true for Computer Science[38], as studies of natural systems inspired the development of such algorithmic models as artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, and fuzzy state machines. These breakthroughs in computer science respectively model biological neural networks, evolution, the swarm behavior of social organisms, natural immune systems, and human thinking processes.
2.4.1 Emergence — Definition
From the studies on social organisms, it became evident that their ability to perform complex tasks had the interactions between the individuals of the swarm at its core. This means that the complexity was not innate within any of the individuals, but rather present when analyzing their behavior as a whole. The interaction in these biological swarm systems may be direct — through the natural senses of touch, smell, hearing or sight — or indirect — through changes in the environment. The ability to perform complex tasks as a result of individual independent labor is called emergence, and it is not easy to predict or deduce the complex resulting behavior from observing the simple behavior of the individuals. Engelbrecht et al.[38] define emergence as the process of deriving new and coherent structures, patterns and properties (or behaviors) in a complex system — structures, patterns and properties (or behaviors) that come to be without the presence of a central commanding unit delegating tasks to the individuals.
2.4.2 Swarm Intelligence — Definition
Swarm Intelligence (SI) is the terminology used to describe the problem-solving behavior that emerges from the interaction between agents within a swarm (or colony), in the same way that Computational Swarm Intelligence is the terminology used for the algorithmic representations that model that same emergent behavior.
SI studies ways to implement collective behavior resulting from decentralized and self-organized systems. Its main inspiration comes from social insects: even though they are small and have limited sensory and cognitive skills, these insects are able to form colonies and work together in order to persevere. This perseverance often implies the need to perform complex tasks as a group, such as food foraging, brood clustering, and the construction and maintenance of the nest. The reason why this is so impressive is that these insects are able to perform all these tasks without a central unit controlling or defining the objectives, or assigning each member of the colony to a specific task.
We can exemplify this concept with ants. Like other insects, ants have several built-in mechanisms that allow them to be so productive[25]. By defining a clear objective, like foraging for food or building a nest, the colony's units understand their purpose. They are committed to the greater good; even if they may seem to wander aimlessly on their own, they are continuously searching for ways to serve the colony. Ants live in an empowering culture, which means that each ant (or each colony unit) is allowed to try and experiment with as many possibilities as possible to reach a goal, without adverse consequences for failure — this empowerment may result in finding a better supply of food, for example. And, finally, ants possess an automatic communication system that they simply cannot turn off — any ant that follows is always benefited by the information gathered by previous ants. This allows them to efficiently search for an optimal or near-optimal solution for their goal.
2.4.3 Algorithms in Swarm Intelligence
Many are the SI-based algorithms, and many are their applications in various areas. These algorithms have proven very efficient in solving AI problems that range from optimization to clustering. We will focus on algorithms that have had some application in gaming.
The knowledge gained from studying ants has proven a great aid in the development of algorithms such as Ant Colony Optimization[11], and its usefulness falls under various categories — e.g. routing problems (path-finding for distribution), assignment problems (distributing tasks to workers, given some constraints), scheduling problems (allocation of resources over time), or subset problems (selecting items from a set that, together, form a solution).
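The ant-inspired routing behavior can be illustrated with Ant Colony Optimization's probabilistic edge-selection rule, where an ant prefers edges with more pheromone and shorter length. This is a sketch of the transition rule only, with made-up edge data, not a full ACO implementation:

```python
import random

def choose_edge(edges, alpha=1.0, beta=2.0):
    """Pick the next edge with probability proportional to
    pheromone^alpha * (1/length)^beta, as in ACO's transition rule.

    edges: list of (name, pheromone, length) tuples.
    """
    weights = [(p ** alpha) * ((1.0 / l) ** beta) for _, p, l in edges]
    r = random.random() * sum(weights)
    for (name, _, _), w in zip(edges, weights):
        r -= w
        if r <= 0:
            return name
    return edges[-1][0]  # guard against floating-point leftovers

random.seed(0)
# Equal pheromone, different lengths: the shorter edge wins most picks.
edges = [("short", 1.0, 1.0), ("long", 1.0, 4.0)]
picks = [choose_edge(edges) for _ in range(1000)]
print(picks.count("short") > picks.count("long"))  # → True
```

In the full algorithm, ants that complete good tours then deposit pheromone on their edges, which is the indirect communication (stigmergy) described above.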
Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a stochastic optimization algorithm based on the somewhat unpredictable flying patterns of bird flocks[12]. PSO is a search algorithm that uses multiple individuals, or particles, grouped in a swarm. Each of these particles represents a candidate solution to the optimization problem. In a PSO system, each particle adjusts its position according to its own experience and that of neighboring particles, trying to position itself in an optimal solution state. In the end, this means that each particle will continually try to reach an optimal solution while searching over a wide area, and the overall flock will converge to that same optimal solution.
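The position-adjustment idea can be sketched with the standard PSO velocity update, which pulls each particle toward its own best position and the swarm's best. The inertia and attraction coefficients below are common textbook values, not tuned ones:

```python
import random

def pso_minimize(f, dim=1, n_particles=10, iters=200, seed=1):
    """Minimal PSO sketch minimizing f over lists of length dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]           # each particle's own best position
    gbest = min(pbest, key=f)[:]          # swarm-wide best position
    w, c1, c2 = 0.7, 1.5, 1.5             # inertia, cognitive and social weights
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

# The flock converges on the minimum of (x - 3)^2:
best = pso_minimize(lambda x: (x[0] - 3.0) ** 2)
print(round(best[0], 2))
```

Each particle's trajectory blends its own experience (`pbest`) with the neighborhood's (`gbest`), which is exactly the social/cognitive balance described above.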
Routing-Wasp
By studying Polistes dominulus wasps, a dynamic task allocation model was created that successfully emulated the self-organized behavior of wasps[36]. The model divided the wasps in a hive into two different types, according to their respective tasks: either foraging or brood caring. The task assignment — or better yet, the decision to perform a task — is made by each individual for itself, based on its response threshold and the stimulus emitted by the brood. Stimuli are emitted by the tasks and affect the individuals' task selection decisions. Response thresholds represent an individual's will to perform certain tasks. Force is used in dominance contests, which allow the formation of a certain hierarchy within the colony. And, finally, Specialization refers to the aptitude of an individual to perform a certain task: the more an individual performs said task, the lower its response threshold will be, while its thresholds for the other tasks will increase. This means the more an individual performs a task, the more likely it is to perform that same task again in the future. Routing-Wasp, a derived algorithm, was developed[34] by applying the previous concepts to self-configurable factories — it was this algorithm that General Motors took advantage of in their assembly factories. Santos et al.[8] also derived from these principles and applied them in gaming. The algorithm, which they called WAIST, was applied to a Real-Time Strategy game and made responsible for choosing which 'factory' would spawn a requested unit, taking some of the micro-management away from the player.
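The response-threshold and specialization mechanics can be sketched as follows. The engagement rule P = s² / (s² + θ²) is the classic form used in insect task-allocation models, while the threshold update step and bounds are illustrative assumptions:

```python
def engage_probability(stimulus, threshold):
    """Response-threshold rule: P(engage) = s^2 / (s^2 + theta^2)."""
    return stimulus ** 2 / (stimulus ** 2 + threshold ** 2)

def update_thresholds(thresholds, performed_task, delta=0.1, floor=0.1, cap=10.0):
    """Specialization: performing a task lowers its own threshold (making a
    repeat more likely) and raises the thresholds of the other tasks.
    delta, floor and cap are illustrative values."""
    return {task: max(floor, t - delta) if task == performed_task
                  else min(cap, t + delta)
            for task, t in thresholds.items()}

thresholds = {"foraging": 1.0, "brood care": 1.0}
thresholds = update_thresholds(thresholds, "foraging")
print(thresholds)  # → {'foraging': 0.9, 'brood care': 1.1}
# At equal stimulus, foraging is now the more probable choice:
print(round(engage_probability(1.0, thresholds["foraging"]), 2))  # → 0.55
```

In a WAIST-like setting, each 'factory' would hold such thresholds per unit type, and the stimulus would come from pending production requests.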
2.4.4 Issues of a Swarm Intelligence Approach
Being a somewhat recent field of study, the use of these algorithms has mostly been confined to solving AI "benchmark" problems and comparing their results to those of previously accepted algorithms — this has enabled them to increase in popularity and gain interest from researchers. However, it also means that their use outside of that scope has been relatively limited. Specifically, in gaming, SI was used to develop AIs capable of learning, mostly in traditional games — such as Checkers[26], and other much older games like Go[21] (a 3000-year-old Chinese game) and Seega[20] (a very old Chess-like Egyptian game).
Despite the advantages of SI-based approaches, they are not completely without issues[40]. For starters, there is no definitive way of programming a swarm to specifically perform a certain task. The asynchronous nature of the swarm units' decision making increases the difficulty of an already hard problem. A possible solution for this problem would be to explore the behaviors of a near-infinite amount of different swarms, or to search that same space of possible swarms for an optimal one by means of some cost function — this last option would only be viable if a cost function could be defined, among other requirements.
Secondly, there is a good number of questions that require answering when dealing with these systems. How complex should each agent be? Should all agents be identical? Should they be able to learn or make logical inferences? How and what do they communicate? What should they know about the environment? And so on, and so forth. These questions may have multiple answers, depending both on who is answering and on the purpose of the system being built. A possible and reasonable approach is to start with low-complexity agents and progressively increase their complexity as needed. Although it doesn't necessarily answer all the questions raised, this approach is sufficiently systematic to provide good-enough results.
Lastly, SI systems are not absolutely reliable[40], as it isn't trivial to predict their behavior when faced with an unexpected event. There is also the issue of defining an adequate benchmark, suitable for SI testing. SI systems' performance shines when acting in dynamic environments, and thus on dynamically changing problems. Creating a benchmark for such an adaptive system implies that we would know what to expect from a generic adaptive system. How could we evaluate the performance of such a system (what would be the metrics)? All in all, there are multiple ways of being dynamic, but it could be possible to select various systems with similar properties in terms of how difficult it would be to solve their corresponding dynamic problem.
2.5 Artificial Immune System — A Defense Mechanism
The biological immune system is a good metaphor for anomaly detection systems in general. In 2002, Matzinger offered her views on what she called the Danger Theory (DT)[29], something that has become increasingly popular. The DT states that the biological immune response is triggered by a sense of danger, and not by the sensing of foreign (or "non-self") entities. There are still arguments concerning the validity of the theory from a biological standpoint; however, this knowledge suffices to develop Artificial Immune Systems. Building a bridge to the Strategy game scope, the danger triggering the response would be equivalent to an invasion by one or many of another player's units. DT has been used to develop AI algorithms for Spam Detection[9] as well as Intrusion Detection within a network[18]. The scope of these algorithms seems larger than the one present in strategy games; however, we feel it is natural for an AI-Player to react to fear and to sense danger, as there is some correlation to the human reaction.
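A danger-triggered response in the Strategy game scope could be sketched as follows: the trigger is the damage being suffered (the 'danger signal'), not the mere presence of foreign units. The event format and threshold are invented for illustration, not an algorithm from [29]:

```python
def immune_response(events, danger_threshold=5.0):
    """Danger-Theory-flavored trigger: mobilize only when accumulated
    damage (the danger signal) exceeds a threshold (toy illustration)."""
    danger = sum(e["damage"] for e in events)
    return "mobilize defenses" if danger > danger_threshold else "stand down"

# A foreign scout that does no damage raises no response...
print(immune_response([{"actor": "foreign", "damage": 0.0}]))  # → stand down
# ...but real damage to our territory triggers one.
print(immune_response([{"actor": "foreign", "damage": 8.0}]))  # → mobilize defenses
```

This mirrors the contrast with classic "non-self" detection, which would have reacted to the harmless scout as well.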
The reference to the immune system may seem a bit out of place, but if we consider the cells as a whole, even though they are devoid of real intelligence (being purely reactive), their reactions result in a complex "behavior" that benefits a greater being — and, consequently, all of those cells. The immune system is one of the most complex defense mechanisms, responsible for stopping attacks from both the outside and the inside — making it a worthwhile study when dealing with the military aspects of a swarm.
2.6 Discussion
We have covered several different aspects related to AI-Player development. The general conclusion we can draw is that an AI-Player is the composition of multiple techniques, ideas, and choices. With the increase in visibility that games have been getting, alongside their acceptance by larger groups, AI must take a step forward and improve the notion of intelligence present in games[27]. Hardware-wise, computers (and, generally speaking, gaming systems) are now more powerful than they were years ago, which allows the use of more processing power and enables algorithms to perform faster and/or at a higher level — e.g. a search algorithm can now go deeper in a search tree.
Still, we are faced with the same old issues, such as:
• How do we develop an adaptive system?
• What do we mean by believability?
• How intelligent do we want our AI-Player to be?
• How do we achieve consistent intelligent behavior?
• Should we use a centralized or decentralized approach?
All these questions remain unanswered. Rather, each implementation is the result of assumptions and more or less personal views on these subjects, and what is valid for one game, one context, may not be valid in another — more often than not, it really won't be. This need for specific solutions in each case and context makes it difficult, if not impossible, to use a generic approach adaptable to multiple contexts and games.
For our purpose, we intend to test the viability of applying Swarm Intelligence principles to an AI-
Player, hoping to improve the results obtained when compared to a conventional AI-Player. By nature,
SI systems are adaptive and, when correctly implemented, react in a believable (at worst, understandable)
manner. The resulting behavior may be considered intelligent at unit level, as opposed to commander
level, defining a decentralized approach. We believe these choices may reflect positively on our results.
Chapter 3
Solution
The most common uses for SI in games are learning AIs, organizing scheduled tasks, or path-finding.
However, using swarm-like behaviors to control the actual units in a Strategy game — similarly to a
decentralized AI — has not been documented. In the previous sections we presented the current state of
the art of AI in the gaming industry, as well as the most relevant developments in academic AI research.
Based on these, our work aims to combine the two — industry and academia — in order to
produce a better-performing AI for a TBS game.
More specifically, we aim to use the knowledge of Swarm Intelligence to improve the quality of the
decision making process of an AI. For this objective, we need a Strategy game with an interesting amount
of decision options, in order to validate the adequate behavior of our new AI — we chose Almansur.
3.1 Algorithm Design
Our algorithm’s primary goal is to improve the communication between the units of a swarm, in order
to reach a consensus — that is, a set of intentions, or intents, per unit that every unit agrees on. A unit
can thus be seen as a particle of the swarm, and in order to make an intent final, that intent needs to
somehow be approved by all other units.
In order to accomplish this, our algorithm was divided in two stages:
• Selfish phase — in which every unit, analyzing the surrounding environment, decides on the best
course of action for itself;
• Negotiation phase — in which every unit communicates its intention to every other unit for con-
sideration and evaluation. This allows each unit to reconsider its intention, and instead follow
another unit’s.
Figure 3.1 shows these two phases as idealized.
Figure 3.1: Conceptual design of the algorithm per swarm unit
3.1.1 Intent — Definition
An intent, or intention, is a structure that represents each unit’s idea of what its ideal action would
be. An action is an interaction with the environment that will be the cause of some desired outcome.
In terms of requirements, with regard to our algorithm, the intent must have a heuristic value and
the originating unit’s identification. These are sufficient conditions for our algorithm to produce some
results — even if not optimal in most cases.
However, both these parameters are completely dependent on the context implementation, espe-
cially the heuristic value.
Another relevant variable that an intent can have is a type — however, this is not a hard requirement.
An intent type allows considering multiple different intents over the same target — e.g. looking out of,
opening, or closing a window. Having a type is completely context dependent and, as previously stated,
it is not mandatory.
Considering studies such as the Danger Theory, referenced in Related Work, it became clear that
there could be a need to raise the level of importance of a certain intention — in practical terms,
when one unit detects a threat or a very valuable target. For this reason, we considered and included
the help flag. This feature allows any unit to artificially increase the value of its target, in order to call
the attention of others to it. But once again, the algorithm does not make the existence of this flag mandatory.
On a side note, all these optional parameters were included in our final implementation of the solution.
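As an illustration, an intent with the required and optional parameters described above could be sketched as follows (the field names are our own choices for this sketch, not the identifiers used in our actual implementation):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Intent:
    """A unit's candidate action. Only `heuristic` and `unit` are
    required by the algorithm; the rest is context-dependent."""
    heuristic: float              # required: value used to compare intents
    unit: str                     # required: identification of the originating unit
    type: Optional[str] = None    # optional: e.g. "attack" or "conquer"
    target: Optional[Any] = None  # optional: what the action is aimed at
    help: bool = False            # optional: signals a call for assistance
```

Any context that supplies a heuristic value and a unit identifier can run the algorithm; the optional fields only refine the decisions.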
3.1.2 Selfish Phase
Coming back to the algorithm itself, the first part is straightforward — each unit considers each
available target in the environment, and marks the intent with the highest heuristic value found as its
selfish intention.
As with the intent types and heuristic values, the perceptions of the world (referenced as available
targets above) are entirely context dependent.
In Algorithm 1 we present a simplified version of the first step of our algorithm — the selfish plan-
ning1. This pseudo-code references intent types, as we believe there are more cases that require them
than not.
Algorithm 1 Simplified take on the selfish cycle step
1: input: perceptions
2: output: selfishIntents ← map[unit, intent]
3:
4: selfishIntents ← new map[unit, intent]
5:
6: for all unit in swarm do
7:   selfishIntent ← null
8:
9:   for all target in perceptions.availableTargets do
10:    for all type in context.intentTypes do
11:      intent ← new intent(type, target, unit)
12:      if selfishIntent == null or intent.heuristic > selfishIntent.heuristic then
13:        selfishIntent ← intent
14:      end if
15:    end for
16:  end for
17:
18:  selfishIntents[unit] ← selfishIntent
19: end for
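The pseudo-code above can also be rendered as a short executable sketch (the function and parameter names here are illustrative, not taken from our implementation):

```python
def selfish_phase(swarm, available_targets, intent_types, heuristic):
    """Selfish phase: each unit independently picks the intent with the
    highest heuristic value among every (target, type) combination.

    heuristic(unit, target, intent_type) -> float is context-dependent.
    Returns a map unit -> (value, intent_type, target), the selfish intents.
    """
    selfish_intents = {}
    for unit in swarm:
        best = None  # reset per unit, so intents never leak between units
        for target in available_targets:
            for intent_type in intent_types:
                value = heuristic(unit, target, intent_type)
                if best is None or value > best[0]:
                    best = (value, intent_type, target)
        selfish_intents[unit] = best
    return selfish_intents

# Toy usage with a trivial heuristic that only looks at the target:
value_of = {"farm": 1.0, "castle": 5.0}
h = lambda unit, target, intent_type: value_of[target]
intents = selfish_phase(["u1", "u2"], ["farm", "castle"], ["conquer"], h)
```

Note that, without further constraints, both units above select the same target, an issue we come back to in the integration notes.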
Selfish Phase — Analysis
Considering we have three nested loops, the complexity of the algorithm can be calculated based on the
number of times each of those loops runs. In the worst case scenario:
1This and the examples that follow are a simple representation of the actual process — split up for explanation purposes. The
complete algorithm is present in the appendix, chapter Complete Algorithm, and is nothing more than these examples combined.
• Main Loop — Unit Cycle — runs exactly U (← swarm.length) times — or N in the usual nomen-
clature;
• First Inner Loop — Target Cycle — runs a number of times equal to the number of targets for each
unit — which is a constant, so t times;
• Second Inner Loop — Intent Cycle — runs a number of times equal to the number of intent types
— which is constant, so i times.
This allows us to conclude that this phase of the algorithm will have a number of cycles equal to
a constant (c = i × t) — which depends largely on the context — times the number of units (N) — making
it of complexity O(N).
3.1.3 Negotiation Phase
Immediately after the first phase concludes, the second one begins. In this negotiation phase, each unit
divulges its own intent to the rest of the swarm — consequently, each unit also receives every
other unit’s intention. This allows for a reconsideration step.
Reconsideration — Definition
By reconsideration we mean that a unit rejects its own selfish intent, and instead decides to follow
another unit’s intention.
A reconsideration is only valid if, by the end of the cycle, every unit is pointing to either:
• its own selfish intention;
• another unit’s intention, where that intention’s originating unit still intends to follow it.
A unit is not required to keep its own intent just because other units have reconsidered toward it —
which can leave those units in an invalid state. Our algorithm is also responsible for recovering from an
invalid state. This is done through a rollback, which we will explain later.
Negotiation Phase — Division
As made clear by the reconsideration definition, the negotiation phase can be further divided into two
phases — reconsideration, and validation / rollback.
These two phases are run in succession, and together they allow the swarm to reach a consensus.
A consensus is reached when there are no units reconsidering and the end state is valid for all units. While
any unit is still reconsidering, the cycle continues. In Algorithm 2 we can see the algorithm for the
negotiation phase.
Algorithm 2 Negotiation algorithm
1: input: selfishIntents ← map[unit, intent]
2: output: consensusIntents ← map[unit, intent]
3:
4: initStepIntents ← selfishIntents
5: finalStepIntents ← new map[unit, intent]
6: ▷ Reconsideration
7: while true do
8:   hasReconsidered ← false
9:
10:  for all unit in swarm do
11:    myIntent ← initStepIntents[unit]
12:
13:    for all intent in initStepIntents do
14:      isValid ← finalStepIntents[intent.unit] == null or finalStepIntents[intent.unit] == intent
15:
16:      if (intent.unit == unit) or (not isValid) then
17:        continue
18:      end if
19:
20:      myHeuristic ← myIntent.heuristicFor(unit)
21:      newHeuristic ← intent.heuristicFor(unit)
22:
23:      if newHeuristic > myHeuristic then
24:        myIntent ← intent
25:        hasReconsidered ← true
26:      end if
27:    end for
28:
29:    finalStepIntents[unit] ← myIntent
30:  end for
31:
32:  if not hasReconsidered then
33:    break
34:  end if
35:  ▷ Validation / Rollback
36:  for all unit in swarm do
37:    intent ← finalStepIntents[unit]
38:    parent ← intent.parent
39:
40:    isValid ← (parent == unit) or (finalStepIntents[parent] == intent)
41:
42:    if not isValid then
43:      previousIntent ← initStepIntents[unit]
44:      previousParent ← previousIntent.parent
45:
46:      isPreviousValid ← (previousParent == unit) or (finalStepIntents[previousParent] == previousIntent)
47:      if isPreviousValid then
48:        finalStepIntents[unit] ← initStepIntents[unit]
49:      else
50:        finalStepIntents[unit] ← selfishIntents[unit]
51:      end if
52:    end if
53:  end for
54:
55:  initStepIntents ← finalStepIntents
56: end while
57:
58: consensusIntents ← finalStepIntents
Even though there is a while(true) in the algorithm, it is there only to ensure the algorithm
completes exactly when a consensus is reached — every cycle of reconsideration is specifically designed
to work towards that goal.
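To make the reconsideration step concrete, a single pass of it can be sketched as follows (the validation / rollback step is omitted, and all names are illustrative):

```python
def reconsider_once(swarm, intents, heuristic_for):
    """One reconsideration pass of the negotiation phase: each unit
    adopts another unit's intent whenever that intent scores higher
    for it than the one it currently holds.

    intents: map unit -> intent; heuristic_for(intent, unit) -> float.
    Returns (new intent map, whether any unit reconsidered).
    """
    new_intents = {}
    reconsidered = False
    for unit in swarm:
        my_intent = intents[unit]
        for other_unit, intent in intents.items():
            if other_unit == unit:
                continue  # a unit never reconsiders against itself
            if heuristic_for(intent, unit) > heuristic_for(my_intent, unit):
                my_intent = intent
                reconsidered = True
        new_intents[unit] = my_intent
    return new_intents, reconsidered

# Toy usage: each intent is (origin unit, value); u1 adopts u2's stronger intent.
intents = {"u1": ("u1", 1.0), "u2": ("u2", 4.0)}
hf = lambda intent, unit: intent[1]
new, changed = reconsider_once(["u1", "u2"], intents, hf)
```

Running such passes until `changed` is false, and validating between passes, is what drives the swarm to consensus.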
Negotiation Phase — Analysis
Noting that the number of units directly reflects on the number of intentions, in the worst case scenario,
these algorithms will run:
• Reconsideration — number of units in the swarm times the number of intentions — O(U × I) or
O(N²). Every unit listens to every other unit in order to understand whether it can or should do something
about their intention;
• Validation / Rollback — once per unit in the swarm — O(U) or O(N). Having a valid state is a necessary
condition before proceeding — any unit found with an invalid intent falls
back to its original intent.
3.1.4 Algorithm Analysis
Algorithm analysis is usually done in terms of time complexity or resource allocation amount. However,
this algorithm was developed directly on top of the previous game’s AI, which made it especially hard
to profile — forcing us to step away from the conventional time complexity analysis for now. Resource
allocation was also an analysis without much potential, as most of the data is related to the game itself
and the algorithm doesn’t really look into it all that much.
In the end, we performed a simpler worst-case scenario analysis based on the number of iterations
of the algorithm itself; our conclusions are summed up in Table 3.1.
Phase                              Complexity
1    Selfish                       O(N)
2    Negotiation
2.1    Reconsideration             O(N × N)
2.2    Validation / Rollback       O(N)
Table 3.1: Algorithm complexity for each of the phases
In order to consider the complexity of the algorithm as a whole, we can treat it as a sequence of
the three steps, so its complexity is the sum of the three — O(N) + O(N²) + O(N), or O(2N + N²).
Simplifying, we can state that the complexity of the algorithm is, in fact, simply O(N²).
3.2 TBS Game Environment — Almansur
Almansur is a browser-based Massively Multi-player Online Turn-Based Strategy game with a strong focus on
its military component — this includes troop movement and recruitment, as well as diplomacy between
multiple commanders. As in most other Strategy games, another key aspect of the game is building and up-
grading facilities. These are required for resource generation, unit recruitment, and faster movement around
the map, and they even increase the defensive strength of a player’s land.
The game is played in simultaneous turns, which means every player issues orders and passes the
turn before any events actually occur. The number of turns in each game varies, but the game may
end before the last turn if certain conditions are met — such as a player or alliance reaching a threshold
population or holding a certain percentage of the land. This means that a player may win as a solo
player or as part of an alliance in a joint victory. Player skill is ranked with an ELO-like system2.
There are multiple game types, in various settings, but the overall behavior and victory conditions
are the same. Games can be distinguished by two characteristics — whether new players can join mid-game
(Static vs. Dynamic Game), and whether the setting is real (Historical vs. Fantasy Game):
• Static Game — a set number of players must queue in before the game can begin. Each player
chooses a faction, controlling a predefined piece of land, and works his way up from there;
• Dynamic Game — a set number of players is required before the game can start, but more players
may join as the game progresses. The map, for this type of game, is generated using AI algorithms;
• Historical Game — historical games are always static in Almansur. These games have predefined
conditions to reflect places and races or cultures from historical events;
• Fantasy Game — fantasy games can be static or dynamic, and allow each player to choose from
fantasy races, each with its own special characteristics.
Figure 3.2: Almansur — Map example of an historical game
2ELO rating system — http://en.wikipedia.org/wiki/Elo_rating_system — Last Accessed on 7 January 2014
Relevant to military decisions is the environment. The world map in Almansur is constructed with
hexagonal tiles, each with a specific type — based on forest density, swampiness, and mountain-like-
ness — allowing the definition of plains, swamps, forests, mountains, etc. Different units and races have
different marching capabilities in each terrain type.
The commands given to each army fall under many different categories — unit creation, joining and
splitting of multiple armies, army movement, the order to execute after the movement (Battle, Rest,
Train, or Conquest), and the speed at which the movement should be done (Slow, Cautious, Normal,
Forced). This set of actions makes the state-space quite complex, with a high probability of error and/or
sub-optimal behavior. Apart from the decisions directly made by the AI, there is still a subset of environ-
ment factors to take into account — such as the type of terrain in which battles take place, the type of terrain
the army will cross, as well as the experience and morale of the army. These factors are, more often
than not, the defining reason for the success or failure of a plan — with consequences that range from
reaching a land early and ambushing the opponent, to reaching it late and being destroyed.
Each army is divided into units — in other words, unit types (archer, cavalry, militia, etc.). Depending
on the race played, different units are available within each type, with some degree of attribute variation
between them. The composition of each army is important for the battle phase. A battle occurs every
time two armies are in the same tile and at least one of them issues a battle command. A battle is a
sequence with the following steps: Ranged ; Charge; Shock ; Melee; and Pursuit . Each unit will be
more relevant in one context than another — e.g. archers are more important for the ranged phase than
the melee phase.
All these decisions make the game quite complex, even when considering the military aspect alone.
For this reason, this game is interesting enough for us to apply our ideas to it. We have also been
granted access to the code for the currently implemented AI, allowing us to work our way up from there.
3.2.1 AI in Almansur
Almansur is implemented as a Multi-Agent system — even the NPC representing the population is an
agent. The AI-Player implemented[5] is also considered a multi-agent system, with three agents
— MilitaryAgent ; EconomicAgent ; and StrategicAgent . The objective goals were, for this reason, divided
into three aspects, following a divide-and-conquer approach, creating a simplified goal for each of the three
agents.
To interact with the game, an AIController was implemented, responsible for issuing the commands
resulting from the planning of the three agents stated above. This module has to interact with the game in
the very same way a human player would, thus ensuring no cheating can occur.
Figure 3.3: Almansur — Current AI implementation
Focusing on the military agent, it is in turn divided into three different modules — command ; facilities;
and recruitment . These are responsible, respectively, for army movement, military facility upgrading,
and unit recruitment. Our objective lies in the command component, which in this architecture was
implemented using scripts — that is, specific reactions to specific events. This approach, though
simple, has proven useful for its intended purpose — improving the experience by replacing play-
ers that abandon games. However, it still somewhat favors players whose lands are adjacent to those
of the AI, since they can easily exploit this fact, overwhelm the AI, and greatly increase their land value
and resource income.
3.3 Solution Implementation
Our solution was built on top of the current AI in Almansur, developed in the work of Barata et al.[5]. The
general AI architecture is briefly explained in the previous section; we modified their military agent
alone — more concretely, the command component of the military agent. Alterations to other modules
were considered an upgrade (or simply a modification) to the old AI, and those modules remain equivalent in both
the old and new AIs.
The main objective was to shift the responsibility of the thought process (that is, the planning phase) to a
lower level — to each unit. This change allows each unit to make its optimal local decision, affecting
decisions aimed at battle events, the order of territory conquering, and even the movement speed when
performing each action.
To our knowledge, there are no SI algorithms based on the military aspects of social organisms —
and even less documentation exists on their potential application to games. This proved to be the first
great challenge of the development cycle of this work — the design of an algorithm with embedded SI
knowledge that would be adequate for solving the problem at hand.
The remainder of this section reflects the decisions behind our algorithm’s definition and implementa-
tion, as well as a short analysis of its efficiency, finishing with its integration into our testing environment
— Almansur.
3.3.1 Algorithm Additional Notes
Order matters
The order of reconsideration is the same order in which the selfish intents are decided. The first unit (unit-1)
to pick will always pick the most valuable intent available to it. The only reason this could
not happen is if a nearby unit (unit-2) has an intent of close (or equal) value, and requires assistance —
raising its awareness and, consequently, its value. This situation would mean that unit-2 would have
the most valuable intent in sight of unit-1. These are the possible outcomes of this situation:
• Unit-3 (in sight of unit-2) has an intent of close or equal value and requires assistance — in this
case, unit-2 would follow unit-3’s intent and leave unit-1 in an invalid state. A rollback would ensue
and unit-1 would revert to its selfish intent.
• Unit-2 keeps its intent — in this case, unit-1 would remain valid, and so would unit-2. Any other
units would follow intents different from unit-1’s or unit-2’s.
Even though this is not an optimal solution — optimal being unit-2 following unit-1, if need be — this
choice reflects the nature of uncertainty in choices. People, much like animals, often do what is right for
the collective rather than for themselves. If unit-1 follows its own selfish intent and unit-2 does the same
(that is, follows its own selfish intent), then unit-2 is likely to be slaughtered, resulting in a loss for the
colony. This mechanism makes it possible for stronger units to protect the weaker ones.
3.3.2 Algorithm Implementation and Heuristic Development
The integration of our algorithm with the context of our problem followed the initial concept of the algo-
rithm itself. A couple of details were immediately evident:
• Type definition — every possible action in the game needed to be translated into an understand-
able type;
• Heuristic function — every possible action needed to have a quantifiable value for comparison.
Everything a player could do, every action he could perform over his army, needed to be translated
into AI logic. Part of this effort had already been done in the AI in place, but we felt it deserved an
overhaul. For this reason, we dropped the existing Actions for our new concept, Intents, complete with a
new Intent Manager — since our environment, Almansur, is a TBS, we saw no reason to create a
complicated structure for each unit, and settled for a structure that would hold all the intents while they
waited for further processing. In the end, we ended up with only four types of actions — defending,
attacking, conquering, and no action. These were intended to provide enough diversity for our needs.
Regarding the heuristic, we ended up implementing two — one for value, and one for danger. These
heuristic functions were common to all units, and represented the way they perceived the environment
— they are represented in the algorithms as perceptions. In order to perform the calculations only once for
every unit, we chose to implement these via influence maps. Each position, a territory unit in the map,
had an associated danger level — taking into consideration the number and strength of the ene-
mies on and surrounding it — and a value — taking into consideration the amount of resources available,
per type, in it.
We can say that danger was more volatile than terrain value. Enemy units tend to move, instead of
idling in certain territories, which in turn makes their threat somewhat dynamic. Value, however, is
static: resources are gathered in one territory, and they don’t necessarily enrich the neighboring
territories.
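A minimal sketch of such influence maps follows; the decay factor and the flat per-tile resource value are our simplifications for illustration, not Almansur's actual formulas:

```python
def influence_maps(tiles, neighbors, enemy_strength, resources, decay=0.5):
    """Build the two per-territory perceptions described above:
    a danger map (enemy strength on a tile plus a decayed contribution
    from adjacent tiles) and a value map (resources on the tile).

    neighbors: map tile -> list of adjacent tiles (hexagonal in Almansur).
    """
    danger, value = {}, {}
    for tile in tiles:
        own = enemy_strength.get(tile, 0.0)
        spill = sum(enemy_strength.get(n, 0.0) for n in neighbors.get(tile, []))
        danger[tile] = own + decay * spill
        value[tile] = resources.get(tile, 0.0)
    return danger, value

# Toy usage: two adjacent tiles, enemies on "a", resources on "b".
tiles = ["a", "b"]
neighbors = {"a": ["b"], "b": ["a"]}
danger, value = influence_maps(tiles, neighbors, {"a": 10.0}, {"b": 3.0})
```

Computing both maps once per turn, rather than once per unit, is what keeps the perception cost independent of the swarm size.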
Finally, we needed a way to improve communication and signal emergencies — a call for help
or assistance. For this purpose, each intent carries a help package, which contains the amount of fear
the source unit has when performing that intent — the more fearful a unit is, the more help it needs. An
example is when a unit notices an enemy army that is far stronger than itself.
3.3.3 Integration Additional Notes
One unit, one target — two units, two targets
At first, we allowed every unit to freely choose whatever target it wanted for a selfish intent. The scope
of the play — the number of units versus the amount of space — led to the conclusion that, most of the time,
units would choose the same target. This is an issue for two reasons. First, the swarm always converged —
too fast, too inefficiently. Second, communication was pointless, since most of the time everyone wanted,
and stated, the same thing.
For these reasons, we decided to try to maximize the area of effect of our swarm, by making sure
that every unit had a unique target. This meant no two units would leave the selfish phase of our algo-
rithm with the same target for their intent. Results proved our initial hypothesis, and our swarm spread more
evenly around the map, greatly improving our conquering and danger-sensing abilities.
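One simple way to enforce this rule is a greedy pass during the selfish phase that removes each chosen target from the pool. This is a sketch under our own assumptions; the actual implementation also takes each unit's adequacy for the task into account when ordering the units:

```python
def assign_unique_targets(swarm, targets, heuristic):
    """Greedy unique assignment for the selfish phase: each unit, in
    order, takes the best target still available, so no two units
    leave the phase with the same target."""
    remaining = list(targets)
    assignment = {}
    for unit in swarm:
        if not remaining:
            break  # more units than targets: the rest get no target
        best = max(remaining, key=lambda t: heuristic(unit, t))
        assignment[unit] = best
        remaining.remove(best)
    return assignment

# Toy usage: the first unit takes the castle, the second falls back to the farm.
h = lambda unit, target: {"castle": 5, "farm": 1}[target]
result = assign_unique_targets(["u1", "u2"], ["castle", "farm"], h)
```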
Sufficiency vs Efficiency
In our initial implementation, we allowed any unit to target any territory — even if one of our units
was already there. After implementing our one unit, one target mechanic, this became an issue. Units that were on
a territory but did not have a high rating in the hierarchy were forced to choose other targets, which, more
often than not, meant going to the opposite end of our territory — correctly so, given the resource
distribution around the map.
This was generating a lot of internal turbulence within the swarm — much like ants seem to behave
inside their colonies — but the results were definitely not good. From this turbulence we only
got slower response times to threats, as the units spent a lot of time aimlessly walking around already
conquered territory.
The solution was to give priority to any unit that is already on a territory, such that other units
would only go to that target if the first unit requested assistance. This was the most efficient solution.
On the other hand, the unit at the position might not have enough strength to finalize its intention —
sufficiency. In this case, it would request assistance from nearby units.
Units can not help themselves
The help factor is quite important for the reconsideration process, in that it allows units to compensate for
one another and work together towards one same goal. Regardless, we cannot leave unmentioned that
this factor only affects units other than the source of the intent. This means that if unit-1 has an intent with
the help factor activated, the intent will have a greater heuristic value (multiplied by the help factor) than
normal for every other unit, but it will remain the same (unaffected by the help factor) for unit-1 during the
reconsideration phase.
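This asymmetry can be expressed directly in the heuristic lookup. In the sketch below, the multiplier of 2.0 is an illustrative placeholder, not the tuned fear-based value from our implementation:

```python
def heuristic_for(intent, evaluating_unit, help_factor=2.0):
    """Heuristic value of an intent as perceived by evaluating_unit.

    The help flag boosts the value for every unit except the intent's
    own source, so a unit cannot use it to help itself.
    intent: map with keys "unit", "heuristic", "help".
    """
    value = intent["heuristic"]
    if intent["help"] and evaluating_unit != intent["unit"]:
        value *= help_factor
    return value

# The source unit sees the plain value; every other unit sees the boosted one.
intent = {"unit": "u1", "heuristic": 10.0, "help": True}
```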
3.4 Testing methodology and data collection
Evaluating an algorithm is not an easy task. We could have done simulations and more complex analyses
of time and resource allocation, but we believe testing the final integration is more interesting. For this
reason, our tests can be divided into two components:
• Duels with the old AI — multiple duels in different scenarios; these resulted in fast games, allowing for quick
feedback;
• Multi-player games — between 15-20 players, both real and AIs; new players could enter in the
middle of the game, and turns lasted one day.
The original AI was, from the beginning of development, a benchmark for our AI. In the early
stages of development, we used the duel system for debugging our new AI. In later stages, duels were
used to fine-tune all parameters before taking the AI into real-player tests.
We only took part in one multi-player scenario with real players, because of the game and turn
lengths. However, we had three new AIs in it, alongside two old AIs and ten real players — allowing us to
retrieve some interesting data from the game.
As for metrics, we tried to use most of the information available from the game itself:
• Victory Points (VPs) — these represent the score in the game — the player with the most VPs wins;
• Territory Owned — being an important part of the game and of the military display, the evolution
of territories owned during the game can be an interesting metric;
• Army Power — more than the ability to conquer, the ability to keep one’s own is an important task
for the military agent — this metric should reflect that quality.
A final, additional, and important metric was our reconsideration rate — that is, the number of reconsid-
erations per number of actions taken.
Chapter 4
Experimental Results
In order to evaluate our solution’s adequacy to solve the problem, we put it in play in different
scenarios against different types of opponents. In this section we will go through our findings, explaining
their relevance towards the validation of our solution. Following our initial statement, our findings
will be divided into two categories:
• Duels between AIs;
• Multi-player matches.
It is relevant to state that the evaluation of our developed algorithm is only possible alongside its
implementation within the new AI. For this reason, a good performance from said AI can be seen as a
good indicator for our solution.
4.1 Static Scenario Test - Duels
During development, the new AI was matched against the old AI in various games. This was
the most easily accessible benchmark for our implementation. Our main objectives with these matches were:
• Debugging the implementation of the algorithm;
• Identifying issues with the implementation that could be directly linked to flaws in the algorithm
itself;
• Asserting the correct evaluation of the perceptions of value and danger per territory;
• Improving the ability to conquer multiple territories per turn;
• Assessing the adaptability of our algorithm to a controlled environment, benchmarking against the old
AI.
Almansur’s duels are scenarios that place two players on equal footing, on a symmetric map. Each
game had 24 turns, and turns were processed after every player ended their respective turn — since
both players were AIs in this scenario, turns were actually processed fast enough to play multiple games,
making it easier to test and iterate on the implementation.
Due to the deterministic nature of the game and the AIs’ implementation, replaying the same match-
up will generate the same actions from both players, producing the exact same result. For this reason,
the analysis presented in this section is based on the last iteration of tests made in this context.
4.1.1 Static Scenario Test Analysis
As seen in Figure 4.1, victory points steadily increase for both players; however, there is a noticeable
positive difference in favor of the new AI.
Figure 4.1: Graphic with the evolution of Victory Points for both players in the duel
Even though constant at first, there are a couple of moments when the victory-points line abruptly
changes — around turns 12-14 and 20-22. As can be seen in Figure 4.2, the evolution of victory points
is directly connected to the evolution of territory victory points.
Figure 4.2: Graphic with the evolution of Territory Victory Points for both players in the duel
In Figure 4.3, we can see that the number of territories conquered in the turns when the abrupt
changes were detected (Figures 4.1 and 4.2) is not especially large —
which, in turn, means that the few territories conquered were quite valuable.
Figure 4.3: Graphic with the evolution of Territories Conquered for both players in the duel
Having conquered all interesting territories on its side of the map, in the final stages of the game the
new AI entered the old AI’s territory and started conquering it. This was the cause for a few battles, as
can be seen in Figure 4.4.
Figure 4.4: Graphic with the evolution of Battle Victory Points for both players in the duel
In order to fully analyze our solution, we need to take note of the number of reconsiderations per turn
— this is represented in Figure 4.5.
Figure 4.5: Graphic with the number of reconsiderations and intentions per turn.
By looking at this graphic together with the previous ones, we can see that the moments when more
reconsiderations took place are connected to some of the most noticeable shifts in the previous graphics’
lines. Comparing with Figure 4.4, we can see:
• 9th turn — reconsideration for a battle;
• 19th turn — reconsideration for a comeback in battle victory points;
• 21st turn — reconsideration for a great battle win, causing the greatest difference in battle victory
points since the beginning of the duel.
It is also relevant to mention that, during the whole game, there was no need for a second reconsideration
iteration, as the swarm always reached consensus after the first reconsideration iteration. Considering
all the turns, a selfish intention was reconsidered approximately 21% of the time.
However, not all reconsiderations return a positive outcome — but they may return a
less negative one. For example, at turn 14 we can see a losing battle which had some reconsiderations
at its base.
4.1.2 Static Scenario Test Conclusion
Being the easiest accessible data source, through the development process these duels provided the
best setting for fine tuning our implementation. These allowed for the discovery of a few shortcomings,
and consequent upgrades to our implementation itself — the following are examples of these upgrades:
• Units of a swarm should be ordered by adequacy for their tasks — otherwise, the strongest
unit in the swarm could be forced to pick a less valuable target — which could lead to the demise
of the whole swarm.
• Each target should only be picked by one unit — otherwise, multiple units could (and would)
point at the same target if it had great value, without having to communicate — invalidating the
reconsideration process.
• A territory that already holds one of our swarm's units should not be a target — otherwise, a
more suitable unit could pick it as a target, forcing the unit already at that location to move. Most of
the time would then be spent traveling around the map instead of actually pursuing objectives
(conquering or battling the enemy).
• Great risks should only be considered together with great reward (value) — otherwise, units
would be reckless and attempt to conquer (or battle) any enemy territory in sight.
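The rules above can be captured in a short greedy assignment sketch. The code below is illustrative only: `Unit`, the `(name, value, risk)` territory tuples, and the risk threshold are hypothetical stand-ins for the actual Almansur data model.

```python
from collections import namedtuple

# Hypothetical stand-in for a swarm unit; the real data model differs.
Unit = namedtuple("Unit", ["name", "strength"])

def assign_targets(units, territories, occupied):
    """Greedy assignment following the rules above: strongest unit picks
    first, each target is taken at most once, territories we already hold
    are skipped, and high risk is only accepted for high value."""
    assignments = {}
    taken = set()
    # Rule 1: order units by adequacy (here, raw strength).
    for unit in sorted(units, key=lambda u: u.strength, reverse=True):
        best_name, best_score = None, float("-inf")
        for name, value, risk in territories:
            if name in taken or name in occupied:
                continue  # Rules 2 and 3: unique targets, skip held territory
            if risk > unit.strength and value < 2 * risk:
                continue  # Rule 4: great risk only with great reward
            score = value - risk
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None:
            assignments[unit.name] = best_name
            taken.add(best_name)
    return assignments
```

With two units and two free territories, the strongest unit claims the highest-scoring target and the weaker one falls back to the next best, instead of both piling onto the same territory.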
These concepts became evident after the first few duels between the AIs. They led to implementation
decisions such as the inclusion of influence maps for value and danger. After these upgrades, the new
AI's behavior was greatly improved, allowing it to easily out-duel the old AI.
From the figures present in this section, it is possible to conclude that our AI implementation was
more successful in these closed scenarios than the old AI.
4.2 Dynamic Test
Our solution had to be tested in a more complex environment in order to surface more interesting metrics
and potential shortcomings. One of the advantages of these dynamic scenarios is the possibility of adding
players to the game while it is already running. For this reason, a scenario was created with a total of
14 players, distributed as follows:
• 9 active Human players — 8 joined at the start of the game, and 1 mid game;
• 3 players with New AI — 2 joined at the start of the game, and 1 mid game;
• 2 players with Old AI — both joined at the start of the game.
At the time the data was analyzed, this game had completed 16 turns during the course of 3 weeks,
processing 1 turn per day except weekends. Our objectives for this scenario were:
• Identifying flaws in the implementation that could be directly linked to flaws in the algorithm itself;
• Assessing the capability of our solution, when compared with that of the old AI implementation;
• Assessing the adaptability of our solution to a dynamic environment, with multiple opponents and
threats.
4.2.1 Dynamic Test Analysis
In Figure 4.6¹ we can see the evolution of average victory points among the 3 types of players. Right
after the second turn, there is a clear separation between the line of the old AI and the other two. It
is also clear that our solution is able to stay on par with the human players until the sixth turn.
Figure 4.6: Graphic with the average evolution of Victory Points for the three types of players
After turn 6, players have a large enough territory that they start to meet — meaning there is
no more neutral, unconquered territory between two player areas — and players start to fight for one
another's territories. In Figure 4.7 we can see the beginning of a shift in territory-related victory points
in favor of the human players. It is also noteworthy that the new AI manages to have its line above the
human's at one point.

¹See Appendix B (Additional Graphics) for graphics containing discretized data from all 14 players, for this and other
interesting data that will be summarized in the following sections.
Figure 4.7: Graphic with the average evolution of Territory Victory Points for the three types of players
The visible growth in Figure 4.7 implies that, in the early game, players are not conquering each
other's territories, but rather those of neutral units — units not controlled by any player. This can be
seen as an expansion period. It is reinforced by Figure 4.8, where we can see more or less stable lines
for the human players and the new AI — if we ignore the first encounter in turn 2.
Figure 4.8: Graphic with the average evolution of Battle Victory Points for the three types of players — lines are affected by battle events, especially visible in symmetric changes.
The human line in this figure can be seen declining slowly, which reflects the limited interaction
between the human players themselves. The old AI line, however, suffers big losses early in the game,
likely caused by attempting to conquer a fortress without sufficient strength — a fortress is a fortified
territory that requires a lot of manpower to conquer. It is also clear that said fortress was neutral, as we
do not see any reflection of that drop in the human or new AI lines.
Also visible in Figure 4.8 is the aggressive nature of our AIs when compared to the human players.
Their lack of better judgment and over-estimation of their own capabilities is rather evident. Despite
differing in scale in terms of battle-related victory points, Figures 4.9 and 4.10 show that both the new
and old AIs' military strength behaves in a similar way after the initial failed confrontations.
Figure 4.9: Graphic with the average evolution of Army Power for the three types of players.
Figure 4.10: Graphic with the average evolution of Army Size for the three types of players.
However, the new AI is able to maintain the same level of strength and size for a longer period of
time, while the old AI keeps getting weaker and weaker.
Finally, another interesting metric is the number of intentions that lead to a reconsideration. In this
game, that percentage was 13%, and its average distribution throughout the game is represented in
Figure 4.11. During the 16 turns analyzed, our AIs never had more than 3 units in a swarm, which
makes it hard to draw firm conclusions from this parameter. The most evident detail in this figure is
that the more units have to make a decision, the more likely a reconsideration cycle is to be triggered
— in order to reach a consensus — as would be expected.
Figure 4.11: Graphic with the average number of intentions and reconsiderations per turn — this is an average of the three AIs in the game.
Still related to this metric, the only connection found between it and our other metrics can be shown
in Figure 4.12. In this figure, we can see that a reconsideration takes place right before a battle event
— an unsuccessful one, though. Despite the loss, this connection reflects a case of interaction between
the units of the swarm, supporting each other in their decisions.
Figure 4.12: Correlation between battle victory points and reconsideration count on the first iteration of the reconsideration cycle of the algorithm.
4.2.2 Dynamic Test Conclusion
The greatest accomplishment for the new AI was being able to keep up with the human player during
the first few turns of the game.
Considering territory victory points, the new AI displays a good perception of value and a good prior-
itization of conquest targets. Despite the initial setback against one of the human players, the new AI
was able to recover and stand equal to the human players in terms of this variable. The old AI was not
able to reach such a level at any point in the game, falling behind in the first couple of turns — displaying
its inability to prioritize and to evaluate both target and self worth.
Our greatest validation, though, should come from the comparison between the new and old AIs.
Noting that the AIs were similar in every aspect except the military agent responsible for issuing
commands — conquering, attacking, and defending — the difference in victory points observed comes
as a huge accomplishment.
Both types of AI, in this game, were playing without the aid of a complex diplomatic agent. This left
them unable to properly interact with one another, or with the players, at this level — making it
impossible to form alliances, and forcing each AI to achieve its results on its own.
With respect to military stability, the new AI was able to come back from its losses and to avoid further
decline after an initial struggle. The old AI, on the other hand, was unable to keep its strength from
decreasing turn after turn. Although this allows us to conclude that the new AI is likely better at picking
its fights — having a better perception of danger, and a better judgment of its own ability — there is
clearly also room for improvement in the agent responsible for recruiting additional military units.
From the correlation graphic (Figure 4.12) it was possible to find one connecting point with the battle-
related victory points, but a single connection is not enough to be entirely meaningful. We believe it
is necessary to perform more complex tests, with a greater number of players and of units per swarm,
in order to find additional relationships between the relevant metrics.
4.3 Summary - Result Significance
From the tests we ran, we found that the new AI was superior to the old one. Some of the defining
characteristics that allowed our AI to perform better than the old one were:
• Improved perception of danger;
• Improved perception of value;
• Improved prioritization skill;
• Improved adaptability, making it less predictable;
• Improved distribution of units in territories, covering more ground.
Both in the duel matches and in the dynamic match, the new AI showed it was able to evaluate
territories more accurately than the old AI. In the dynamic match, its evaluation was even shown to
surpass the human players' at one point — Figure 4.7.
Also during the dynamic match, the new AI was able to evade danger — and losses — for a greater
period than the old AI. In the duels, we saw the new AI lose territories and quickly attempt to conquer
them again (and succeed in doing so).
The combination of Influence Maps used for both value and danger attributes was a stepping stone
for these results, allowing each unit of the swarm to correctly evaluate territories before attempting to
conquer them.
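As a rough illustration of how a unit might combine the two maps when evaluating territories, consider the sketch below. The function and parameter names are hypothetical, not the thesis's actual interface.

```python
def territory_scores(value_map, danger_map, own_strength, risk_weight=1.0):
    """Combine the value and danger influence maps into a single score per
    territory; territories whose danger exceeds the unit's own strength are
    discarded outright instead of merely penalized."""
    scores = {}
    for territory, value in value_map.items():
        danger = danger_map.get(territory, 0.0)
        if danger > own_strength:
            continue  # do not even consider fights the unit cannot win
        scores[territory] = value - risk_weight * danger
    return scores
```

A unit would then simply pick the highest-scoring surviving territory, which is how the danger map keeps reckless attacks out of the candidate set before value is ever weighed.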
Although the duels took place in a static scenario, the constant movement of units from both players
can be seen as a dynamic change in the environment. By being aware of the enemy units' movement
and of how their presence influences its safety, the new AI was able to react faster and protect itself
better than the old AI.
4.3.1 Final Remarks
Considering that the AIs were the same, with the exception of the implemented algorithm connected to
the military AI, we can state that the military decision process present in the new AI is definitely better
than the old one — which, in turn, is a good indication that our algorithm (and implementation) is an
improvement to said process.
That said, we believe the tests we ran, however promising, were not sufficient for a definitive
conclusion regarding the algorithm itself, as it is highly dependent on the context. Even in this context,
we believe further testing is required, as the number of units per swarm was too small for a completely
reliable claim.
Chapter 5
Conclusions
In this work, we implemented an algorithm based on swarm intelligence concepts that was responsible
for an artificial player's decision process in a strategy game. Our main objective was to derive said
algorithm from current swarm intelligence knowledge and apply it in a context different from the usual
ones.
Our implementation was built on top of an already existing AI in the game Almansur. The original
version was adapted and had its military decision process replaced by our new algorithm, ensuring most
of the implementation was shared between both AIs. This decision allowed us to benchmark the modified
military decision process without too much influence from the other components — economic and
strategic (Figure 3.3).
This work was based on some already proven ideas:
• Heuristic functions based on influence maps are not new;
• Decentralized approaches for game AIs are not new.
This work was also an experiment in testing theories in a context different from the usual ones (as
previously mentioned):
• The use of influence maps to apply the Danger Theory, originating from immune system studies, in
the context of game artificial intelligence;
• The definition of semantics understandable by a decentralized system, or swarm;
• The design of an algorithm capable of defining rules for communication, in order to reach a beneficial
consensus.
In this sense, the final solution presented is a mixture between the conventional AI and SI concepts.
Given the results presented in the previous chapter, it is possible to state that our solution shows some
very promising results. The final implementation displays some of the benefits of an SI system, such as
adaptability, unpredictability (which is a good thing in a game AI) and the associated emergence —
providing results that rival those of a human (in our case, in terms of conquering territories).
We do not feel, however, that these results are sufficient to declare this work a success. At most, we
can reiterate our claim and state that the results are promising. In order to fully commit to the success
or failure of this solution, our implementation needs to go through more tests against users of different
skill levels.
Considering that the algorithm's base has an agnostic, context-free nature, it needs to be implemented
in contexts other than Almansur; otherwise, we can only conclude that this solution works for this
particular case.
Finally, considering the positive results so far, this work will likely remain a part of the Almansur
game, allowing for extended testing and potentially some further development in this area of study.
5.1 Future Work
Regarding Almansur specifically, there are a few changes that could be implemented in order to improve
results. These would benefit both the old and new AIs.
• Implement a Diplomatic Agent — the most relevant issue found in the resulting AI is its inability
to communicate with other players. Frequently, when players' territories begin to overlap, in-game
personal messages are sent to question the opponents and check whether they are active before
attacking. It is common to avoid being attacked simply by replying, or to form alliances with the
surrounding players in pursuit of an alliance victory. The current AIs are unable to deal with these
situations.
• Communication between modules — the current communication is not optimized. For ex-
ample, when evaluating the value of a territory, every resource is considered equal to every other.
Depending on the needs of the player, the economic agent should be able to increase the value of
a resource that is required, or decrease the value of one that is in excess;
• Responsibility for generating/maintaining Influence Maps — in the current implementation,
there is no specific class responsible for generating the influence maps. These, depending on
their nature, should be generated by specific agents. For instance, the Danger Influence Map could
be generated by the Strategy agent (or by a new Diplomatic agent), and the Value Influence Map
could be generated by the Economic agent. This would allow for the definition of clear objectives
and improve the communication between all the AI agents (or components).
• Propagate Value Influence in the Map — when an enemy is present at a specific location, its
strength defines the danger value at that position. That danger is then propagated to nearby
territories in order to take into account the possible movements of that enemy unit. The Value,
however, does not propagate this way. If it did, it could allow for the definition of optimal paths to a
target — allowing the AI to conquer every territory along the way.
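A sketch of the distance-decayed spreading used for the danger map, which (per the suggestion above) could equally be applied to value. The decay factor, radius, and adjacency representation are illustrative assumptions, not the game's actual parameters.

```python
from collections import deque

def propagate_influence(sources, neighbors, decay=0.5, radius=3):
    """Spread influence outward from source territories, halving it per
    step and keeping the strongest contribution at each territory.
    `sources` maps territory -> influence; `neighbors` maps territory ->
    list of adjacent territories."""
    influence = dict(sources)
    frontier = deque((territory, 0) for territory in sources)
    while frontier:
        territory, dist = frontier.popleft()
        if dist >= radius:
            continue  # stop spreading beyond the chosen radius
        spread = influence[territory] * decay
        for adjacent in neighbors[territory]:
            if spread > influence.get(adjacent, 0.0):
                influence[adjacent] = spread
                frontier.append((adjacent, dist + 1))
    return influence
```

On a chain of territories a–b–c–d with an enemy of strength 8 at a, the influence seen at b, c, and d decays to 4, 2, and 1 respectively.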
As for the algorithm itself, its continued presence in this environment could reveal flaws in its core and,
through repetition, allow some of the logic behind it to be fine-tuned. In theory, however, the algorithm
would benefit more from being implemented in a different context, providing further insight into its uses
and its shortcomings. Ultimately, only different implementations would allow for a final confirmation of
this algorithm's worth on its own.
Appendix A
Complete Algorithm
Algorithm 3 Complete developed algorithm - part 1
 1: input: perceptions                    ▷ Territories, influence map info, units in range, etc.
 2: output: consensusIntents ← map[unit, intent]
 3:
 4: selfishIntents ← new map[unit, intent]
 5:
 6: for all unit in swarm do
 7:     for all territory in reachableTerritories do
 8:         for all intent in intentTypes do
 9:             if intent.heuristic > selfishIntent.heuristic then
10:                 selfishIntent ← intent
11:             end if
12:         end for
13:     end for
14:
15:     selfishIntents[unit] ← selfishIntent
16: end for
17:
18: initStepIntents ← selfishIntents
19:                                        ▷ Reconsideration
20: while true do
21:     hasReconsidered ← false
22:
23:     for all unit in swarm do
24:         myIntent ← initStepIntents[unit]
25:
26:         for all intent in initStepIntents do
27:             isValid ← finalStepIntents[intent.unit] ≠ null and finalStepIntents[intent.unit] ≠ intent
28:
29:             if (intent.unit = unit) or (not isValid) then
30:                 continue
31:             end if
Algorithm 4 Complete developed algorithm - part 2
32:             myHeuristic ← myIntent.heuristicFor(unit)
33:             newHeuristic ← intent.heuristicFor(unit)
34:
35:             if newHeuristic > myHeuristic then
36:                 myIntent ← intent
37:                 hasReconsidered ← true
38:             end if
39:         end for
40:
41:         finalStepIntents[unit] ← myIntent
42:     end for
43:
44:     if not hasReconsidered then
45:         break
46:     end if
47:                                        ▷ Validation / Rollback
48:     for all unit in swarm do
49:         intent ← finalStepIntents[unit]
50:         parent ← intent.parent
51:
52:         isValid ← (parent = unit) or (finalStepIntents[parent] = intent)
53:
54:         if not isValid then
55:             previousIntent ← initStepIntents[unit]
56:             previousParent ← previousIntent.parent
57:
58:             isPreviousValid ← (previousParent = unit) or (finalStepIntents[previousParent] = previousIntent)
59:             if isPreviousValid then
60:                 finalStepIntents[unit] ← initStepIntents[unit]
61:             else
62:                 finalStepIntents[unit] ← selfishIntents[unit]
63:             end if
64:         end if
65:     end for
66:     initStepIntents ← finalStepIntents
67: end while
68:
69: consensusIntents ← finalStepIntents
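For readers who prefer runnable code, the following is a simplified, executable Python sketch of the reconsideration loop of Algorithms 3 and 4. The `Intent` fields and the `heuristic_for` discount are illustrative assumptions; the actual implementation's data model and heuristics differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Intent:
    unit: str         # unit this intent belongs to
    target: str       # territory the unit intends to act on
    heuristic: float  # base value of the intent
    parent: str       # unit whose intent is being supported (itself if selfish)

    def heuristic_for(self, unit):
        # Assumption: supporting another unit's intent is worth a fraction
        # of its base heuristic; a unit's own intent keeps its full value.
        return self.heuristic if self.parent == unit else 0.8 * self.heuristic

def consensus_intents(selfish_intents, max_iterations=10):
    """Iterate reconsideration until no unit changes its intent, rolling a
    unit back to its selfish intent when the parent it supports moved on."""
    init_step = dict(selfish_intents)
    final_step = dict(selfish_intents)
    for _ in range(max_iterations):
        reconsidered = False
        for unit, my_intent in init_step.items():
            for other_unit, intent in init_step.items():
                if other_unit == unit:
                    continue
                if intent.heuristic_for(unit) > my_intent.heuristic_for(unit):
                    # Adopt the other unit's more valuable intent as our own.
                    my_intent = Intent(unit, intent.target,
                                       intent.heuristic, parent=other_unit)
                    reconsidered = True
            final_step[unit] = my_intent
        if not reconsidered:
            break  # consensus reached
        # Validation / rollback: a supporting intent is only valid while
        # the parent still pursues the same target.
        for unit, intent in final_step.items():
            if intent.parent != unit and final_step[intent.parent].target != intent.target:
                final_step[unit] = selfish_intents[unit]
        init_step = dict(final_step)
    return final_step
```

With unit A selfishly targeting t1 (heuristic 5) and unit B targeting t2 (heuristic 2), B adopts A's more valuable intent and the swarm converges on t1.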
Appendix B
Additional Graphics
B.1 Victory Points in Multi-player Match
Figure B.1: Graphic of the Victory Points evolution throughout the whole game for all players
B.2 Territory Victory Points in Multi-player Match
Figure B.3: Graphic of the Territory Victory Points evolution throughout the whole game for all players
Figure B.4: Graphic of the Territory Victory Points evolution throughout the whole game for all AIs
B.3 Territory Owned in Multi-player Match
Figure B.5: Graphic of the Territory Owned evolution throughout the whole game for all players
B.4 Battle Victory Points in Multi-player Match
Figure B.7: Graphic of the Battle Victory Points evolution throughout the whole game for all players
Figure B.8: Graphic of the Battle Victory Points evolution throughout the whole game for all AIs
B.5 Army Size in Multi-player Match
Figure B.9: Graphic of the Army Size evolution throughout the whole game for all players
B.6 Army Power in Multi-player Match
Figure B.11: Graphic of the Army Power evolution throughout the whole game for all players