Swarm Intelligence in Strategy Games
Auguste Antoine Cunha
Thesis to obtain the Master of Science Degree in
Information Systems and Computer Engineering
Supervisors: Prof. Pedro Alexandre Simões dos Santos, Prof. Carlos António Roque Martinho
Examination Committee
Chairperson: Prof. Mário Jorge Costa Gaspar da Silva
Supervisor: Prof. Pedro Alexandre Simões dos Santos
Member of the Committee: Prof. César Figueiredo Pimentel
May 2015
Acknowledgments
I want to thank my family for supporting me and helping me see reason in finishing this chapter of my
life.
I would also like to thank my coordinators Pedro Santos and Carlos Martinho for the insight they
provided throughout the development of this thesis.
I also want to thank Ivo Capelo and Pedro Engana for helping with the formatting of this thesis,
facilitating my work with LaTeX.
A word of appreciation towards my colleagues at Miniclip for the moral support they have provided.
I’d like to thank the Almansur community for helping me test the work I’ve done with their game.
And finally, a great big thank you to all other people that have been a part of my academic life, as
they were also part of why I reached this far.
Resumo
Desenvolver um jogador inteligente para um jogo não é tarefa fácil. Cada Jogador Artificial Inteligente
é criado especificamente para o seu contexto e, por essa razão, não é facilmente reutilizável. No
entanto, alguns dos desenvolvimentos a mais baixo nível possuem maior significância tanto para jogos
como para outras áreas — como é o caso de algoritmos de procura, optimização de caminhos (pathing),
ou optimização geral.
Neste trabalho, desenhámos e implementámos um algoritmo que combina conceitos de Inteligência
de Enxame (Swarm Intelligence) com os mecanismos de decisão tradicionais utilizados em jogadores
artificiais inteligentes — especificamente aqueles usados em Jogos de Estratégia. O nosso principal
objectivo era, portanto, averiguar a adequação do conhecimento actual em Inteligência de Enxame aos
requisitos do Jogador Artificial Inteligente, seguida do desenvolvimento do algoritmo de teste em si.
O conceito básico passou pelo afastamento da comum solução centralizada e pela aproximação a uma
solução descentralizada, complementada pela aplicação de algumas noções de Inteligência de Enxame
actualmente documentadas. O algoritmo resultante era responsável pelo método de comunicação entre
as unidades de um Jogador Artificial Inteligente. Um Jogador Artificial Inteligente de implementação
centralizada e scriptada foi usado como referência para a nossa solução baseada em Inteligência de
Enxame.
Este trabalho é assim uma tentativa de resposta aos problemas resultantes de Jogadores Artificiais
previsíveis — um problema comum de implementações scriptadas — e de melhoria da sua capacidade
de adaptação — tirando partido do comportamento emergente resultante dos conceitos de Inteligência
de Enxame.
Palavras-chave: Jogador Artificial Inteligente, Jogos de Estratégia, Inteligência de Enxame, Inteligência Descentralizada, Algoritmo, Comunicação, Previsibilidade, Adaptabilidade, Comportamento Emergente
Abstract
Developing an intelligent player for a game is no easy task. Each Artificial Intelligent Player is created
specifically for its context, with very little reusability. However, some lower-level developments have
great significance both in games and in other areas — such as search, pathing, or optimization algorithms.
In this work, we designed and implemented an algorithm that combines Swarm Intelligence concepts
with the traditional decision mechanisms of modern Artificial Intelligent Players — specifically those used
in Strategy Games. Our main objective was to assess the adequacy of current Swarm Intelligence knowledge
to the requirements of an Artificial Intelligent Player, followed by the development of the test algorithm
itself. The basic concept was to move our implementation away from the common centralized solution
toward a decentralized one, complemented by some of the currently documented Swarm Intelligence
notions. The resulting algorithm was responsible in particular for the means of communication between
the units of an Artificial Intelligent Player. A centralized, scripted Artificial Intelligence was used as the
benchmark for our Swarm Intelligence based solution.
This work is thus an attempt to address the problems caused by predictable Artificial Players — a
common issue with scripted implementations — and to improve their adaptability by taking advantage
of the emergent behavior resulting from Swarm Intelligence concepts.
Keywords: Artificial Intelligent Player, Strategy Games, Swarm Intelligence, Decentralized Intelligence, Algorithm, Communication, Predictability, Adaptability, Emergent Behavior
Contents
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Resumo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
1 Introduction 2
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Problem Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Document outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Related Work 6
2.1 Artificial Intelligence in Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.1 Game AI in Commercial Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.2 Industry vs Academy AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.3 AI in Games — in Short . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Artificial Intelligence in Strategy Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.1 Strategy Games — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 Strategy Games and Game AI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.3 AI-Player in Strategy Games . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.4 AI in Strategy Games — in Short . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Implementing an AI-Player . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.1 Centralized vs Decentralized Approach . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.2 Human-Like Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Swarm Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.1 Emergence — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.2 Swarm Intelligence — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.3 Algorithms in Swarm Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.4 Issues of a Swarm Intelligence Approach . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 Artificial Immune System — A Defense Mechanism . . . . . . . . . . . . . . . . . . . . . . 19
2.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Solution 22
3.1 Algorithm Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 Intent — Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 Selfish Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.1.3 Negotiation Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1.4 Algorithm Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 TBS Game Environment — Almansur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.1 AI in Almansur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Solution Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.1 Algorithm Additional Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.2 Algorithm Implementation and Heuristic Development . . . . . . . . . . . . . . . . 31
3.3.3 Integration Additional Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Testing methodology and data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4 Experimental Results 35
4.1 Static Scenario Test - Duels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.1.1 Static Scenario Test Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.1.2 Static Scenario Test Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2 Dynamic Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.2.1 Dynamic Test Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.2.2 Dynamic Test Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.3 Summary - Result Significance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.3.1 Final Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5 Conclusions 47
5.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
A Complete Algorithm 50
B Additional Graphics 52
B.1 Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
B.2 Territory Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . 54
B.3 Territory Owned in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
B.4 Battle Victory Points in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
B.5 Army Size in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
B.6 Army Power in Multi-player Match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Bibliography 66
List of Figures
3.1 Conceptual design of the algorithm per swarm unit . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Almansur — Map example of an historical game . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Almansur — Current AI implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4.1 Graphic with the evolution of Victory Points for both players in the duel . . . . . . . . . . . 36
4.2 Graphic with the evolution of Territory Victory Points for both players in the duel . . . . . . 37
4.3 Graphic with the evolution of Territories Conquered for both players in the duel . . . . . . 37
4.4 Graphic with the evolution of Battle Victory Points for both players in the duel . . . . . . . 38
4.5 Graphic with the number of reconsiderations and intentions per turn. . . . . . . . . . . . . 38
4.6 Graphic with the average evolution of Victory Points for the three types of players . . . . . 40
4.7 Graphic with the average evolution of Territory Victory Points for the three types of players 41
4.8 Graphic with the average evolution of Battle Victory Points for the three types of players
— lines are affected by battle events, especially visible on symmetric changes. . . . . . . 41
4.9 Graphic with the average evolution of Army Power for the three types of players. . . . . . 42
4.10 Graphic with the average evolution of Army Size for the three types of players. . . . . . . 43
4.11 Graphic with the average number of intentions and reconsiderations per turn — this is an
average of the three AIs in the game. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.12 Correlation between battle victory points and reconsideration count on the first iteration
reconsideration cycle of the algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
B.1 Graphic of the Victory Points evolution throughout all the game for all the players . . . . . 52
B.2 Graphic of the Victory Points evolution throughout all the game for all the AIs . . . . . . . 53
B.3 Graphic of the Territory Victory Points evolution throughout all the game for all the players 54
B.4 Graphic of the Territory Victory Points evolution throughout all the game for all the AIs . . 55
B.5 Graphic of the Territory Owned evolution throughout all the game for all the players . . . . 56
B.6 Graphic of the Territory Owned evolution throughout all the game for all the AIs . . . . . . 57
B.7 Graphic of the Battle Victory Points evolution throughout all the game for all the players . 58
B.8 Graphic of the Battle Victory Points evolution throughout all the game for all the AIs . . . . 59
B.9 Graphic of the Army Size evolution throughout all the game for all the players . . . . . . . 60
B.10 Graphic of the Army Size evolution throughout all the game for all the AIs . . . . . . . . . 61
B.11 Graphic of the Army Power evolution throughout all the game for all the players . . . . . . 62
B.12 Graphic of the Army Power evolution throughout all the game for all the AIs . . . . . . . . 63
Chapter 1
Introduction
1.1 Motivation
Alongside all the technological advancements, we have seen video game environments become
beautiful — with the most recent demonstration of technological prowess by Square Enix 1 — and immersive
— with the appearance of Virtual Reality (VR) and Augmented Reality (AR) headsets and glasses. With
this technology accessible to all, video games are expected to have the best graphics possible.
However, the more beautiful a game is, the more prone players are to notice flaws in other aspects of
the game. The damage these issues cause in game reviews can be much worse than the lack of a
very high graphical level. Linear level design, weak storytelling, and unrealistic Artificial Intelligence (AI)
are examples of commonly noticed flaws — for example, The Order: 1886, a beautiful game of a linear
nature, was not very well received by the critics2.
Despite the worldwide economic crisis and the above-mentioned potential flaws, the video game
industry is still alive and well[1]. In 2012, over 20 billion dollars were spent in the U.S. alone[2]. The
Strategy genre is also one of the most successful, with around 24.9% of all computer games sold in the
U.S. in 2012 being Strategy Games (SG)[2], making any and all improvements worthwhile. Being generally
AI-heavy, SG are an interesting case study for AI enthusiasts and researchers. And seeing how
far the industry has come in the computer-generated graphics department, it is worthwhile to study ways
to improve the quality of games in other areas — potentially improving the revenue of this industry on its
own merit, instead of by looks alone.
Swarm Intelligence (SI) is the sub-field of AI that studies how a group of AI agents may work in a
self-organized and coordinated way, without the use of centralized control. The algorithms developed
in this area are derived from studies of Nature, often of insect colonies (such as ants, bees, wasps, or
termites). Perfected by nature, these algorithms have become an interesting part of AI study, giving new
insights into the way people approach each problem, and even changing the way we perceive 'intelligence'.
1Square Enix's DirectX 12 demo — http://www.eurogamer.net/articles/2015-04-30-square-enix-challenges-the-uncanny-valley-in-directx-12-demo — Gamer Network, Last Accessed on 01 May 2015
2Forbes take on The Order:1886 — http://www.forbes.com/sites/insertcoin/2015/02/19/the-order-1886-is-a-beautiful-failed-experiment-in-cinematic-gaming/ — Forbes, Last Accessed on 01 May 2015
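As an aside, the core mechanism behind the most famous of these insect-derived algorithms — ant colony optimization — can be illustrated in a few lines. The sketch below is purely our own toy example (it is not part of the thesis's algorithm, and the parameter values are arbitrary): ants walk a graph from a start to a goal node, shorter routes deposit proportionally more pheromone, pheromone evaporates each round, and later ants probabilistically favor stronger trails.

```python
import random

def ant_colony_shortest_path(graph, start, goal, n_ants=50, n_iters=30,
                             evaporation=0.5, deposit=1.0):
    """Toy ant-colony search: graph maps node -> list of neighbor nodes."""
    # every directed edge starts with the same pheromone level
    pheromone = {(a, b): 1.0 for a in graph for b in graph[a]}

    def walk():
        path, node = [start], start
        while node != goal:
            options = [n for n in graph[node] if n not in path]
            if not options:
                return None  # dead end; this ant's walk is discarded
            weights = [pheromone[(node, n)] for n in options]
            node = random.choices(options, weights=weights)[0]
            path.append(node)
        return path

    best = None
    for _ in range(n_iters):
        paths = [p for p in (walk() for _ in range(n_ants)) if p]
        for edge in pheromone:
            pheromone[edge] *= (1 - evaporation)  # old trails fade
        for p in paths:
            for a, b in zip(p, p[1:]):
                pheromone[(a, b)] += deposit / len(p)  # shorter paths reinforce more
            if best is None or len(p) < len(best):
                best = p
    return best
```

No ant knows the whole map; the shortest route emerges from local pheromone updates alone — the decentralized coordination described above.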
The use of SI-based algorithms has already begun to help with real-world problems[3]. General
Motors Corp. used an algorithm based on wasps to reduce the idle time of the painting machines at
their Fort Wayne Assembly Plant. Another company, Cemex (Cementos Mexicanos), gave their truck
drivers the power to act on the full information available — through real-time GPS location signals
from every truck and massive telecommunications throughout the company. This allowed them to
improve their on-time delivery rate from 35% to an impressive 98%. Railways (like the Japanese bullet
express trains), similarly to what currently happens on the Internet, use swarm algorithms to direct traffic
and ensure punctuality. In the end, by applying SI appropriately to a business, it was possible to solve
previously known issues, resulting in an increase in revenue.
However, the use of SI in computer games is limited, and usually only applied to under-the-hood
features — uses not directly visible to the player. For this reason, a question is left floating — how can
we take advantage of these field developments in gaming AI?
1.2 Problem Description
Even though we understand that AIs may come in many forms, possibly even as part of the scenery —
a use of AI to improve the immersiveness of the game — we will focus on SG and on AIs that are
responsible for the decisions of an opposing player (as if they were another human player in the same
game). As players become more and more aware of the lack of good AI in some games —
either by seeing constant bad AI decisions or AIs that simply let them win — they are becoming more
demanding in that respect[4]. Our objective will always be to develop AIs capable of entertaining the
player — which can have multiple interpretations or definitions, as different people are entertained by
different things (e.g. seeking a challenge vs. always having to win). In this work we will try to do just that, as
we will develop a new AI to control the units of an army in the multi-player online Turn-Based Strategy
Game Almansur3. For this reason, and as stated before, our study will focus on AIs used in SG,
namely Turn-Based Strategy (TBS) Games.
Going one step further, our goal will be to test the viability of using the current knowledge of Swarm
Intelligence (SI), adapting it to improve the decision process of an AI in Almansur. With this approach,
we will attempt to solve some of the issues that traditional AI faces in a TBS environment. More
concretely, we will develop and apply an algorithm based on our SI research to the military decision process
of the current AI of Almansur[5] — a scripted and purely reactive AI. We expect that the use of an SI-based
AI will improve the quality of the AI itself, improving its adaptability to unpredictable situations and constant
changes in the environment.
3Almansur — http://www.almansur.net/ — Almansur LDA, Last Accessed on 27 November 2013
With this work, we aim to answer the question left in the previous section, bringing some of the values
of SI to a concrete case of video game AI, putting it in charge of the military decisions in a TBS
environment.
As a final note, we'd like to reinforce that our goal is to improve the quality of an existing AI by
applying a purpose-developed SI algorithm to the decision process of said AI. To validate our goal, we will
run multiple scenarios between the two AIs, as well as with real players, to validate our application. We
expect to see a more responsive, more adaptive, more competitive and — generally speaking —
more intelligent AI, followed by an improvement in the Game Experience for the player.
In order to improve its quality, we implemented a new algorithm that combines the knowledge usually
used by AI players with SI concepts to improve the communication between the units of a decentralized
multi-agent system. By applying the concepts of SI to this process, we are able to improve
its general performance and produce optimized solutions for specific problems. This is not without risk,
however, as there are still quite a few uncertainties related to SI in general, which we hope to
overcome.
1.3 Document outline
The remainder of this document is divided into four parts.
The section Related Work will cover the current state of the art when it comes to AI in the gaming
industry. We will introduce the concepts relevant to our work, as well as the most frequently used
techniques when designing an AI to play a game. At the end of the section, a brief discussion will be held,
with the intent of creating bridges between the related work and our intended purposes.
Building up to our proposed solution, we will offer a brief review of the game we will work on —
Almansur — focusing on the aspect most relevant to our work: military planning.
The following section, Solution, will detail the process we followed to develop our SI algorithm, as
well as its application in Almansur. We will then perform an algorithmic complexity analysis of our solution.
In the end, we'll describe the methodology behind the evaluation process / experimental procedure.
Data Analysis will contain the results of our experiments with real players and the old non-swarm
AI. We will analyze the detailed data and discuss the findings' relevance towards our objectives.
In the end, Conclusions will summarize the reasoning behind our decisions, followed by a
connection between findings and objectives. Finally, we will discuss potential future work that could
follow from the results and findings we have gathered.
Chapter 2
Related Work
Nowadays, Artificial Intelligence (AI) can be found virtually anywhere. In order to understand why there
is a need to improve the AI in games, it is relevant to understand how it manifests inside a game. Also,
we will want to know: why use games to test AI theories or algorithms?
Even though some AI uses may be the same across genres — e.g. using path-finding algorithms —
some others are genre-specific or, at the very least, genre-intensive (that is, more used in
one genre than in another).
In this section, we will talk about the many forms AI takes, how it improves game quality, and why we
should be wary of bad executions. As we go further in, we will increase the focus on the Strategy genre
and on the AI uses within it. We will emphasize the AI meant to control a player or opponent, and
lastly, we will introduce the concept of Swarm Intelligence.
2.1 Artificial Intelligence in Games
Computer-generated behaviors can be divided into two components[24] — Game AI and Game Physics.
Respectively, these are responsible for the living and the dead parts of the game. Game AI refers to
entities (human or otherwise) that react to the presence of the player, displaying intelligent or intentional
behavior. On the other hand, Game Physics refers to the multiple aspects that do not have a behavior
derived from intentions, but rather behave the same way in every iteration (given the same starting
conditions) — e.g. falling rocks, gravity, and the flow of a river. In other words, Game Physics is responsible
for the purely causal behaviors in a game.
When we think about it, the AI found in the video game industry can be quite overwhelming — it can
be responsible for multiple behaviors and mechanics, and even some aesthetic aspects (like the flock-like
movement of background birds in a scenery). This means AI can be present in a somewhat
"silent" form, almost unnoticed or taken for granted by the player. Such is the case of the algorithms used
in path-finding, which help the player better navigate his avatar in the virtual game world, and which produce
optimal pathing for a Non-Player Character (NPC). The use of AI can have even more subtle results,
like increasing the engagement factor of a game — e.g. when the AI is responsible for the behavior of
an NPC animal that follows the player around the game map, often making the player more attached to
both the virtual pet and the game. Studies from multiple sources (big game companies, indie studios, and
universities) have also resulted in some progress within the narrative-driven game genre, using
AI to convey their narrative[7] — e.g. the narrator in Bastion1 is one of the most remarkable narrators
in gaming and, more recently, the narrator in The Stanley Parable2 was also very well received by
the gaming community.
2.1.1 Game AI in Commercial Games
In this work, we will focus on Game AI.
Game AI produces the part of a game’s behavior that players can best understand by "read-
ing" the behavior as if it results from the pursuit of goals given some knowledge. Creating a
sense of aliveness (...) the sense that there is an entity living within the computer that has its
own life independently of the player and cares about how the player’s actions impact this life.
(Michael Mateas[24])
Our ambition is always to improve the experience of play. And as we consider Game AI to be
our presence within the game after it is released, the quality of the interaction between the player and
the Game AI must be as polished as possible — thus improving the interaction between Player and
Game. The most usual (and noticeable) Game AI manifestations in games are the supporting role and
the opponent role.
• Supporting Role — when the AI is responsible for aiding the main character (controlled by the
player) in his quest. This may also translate to allies in multi-player games — meaning AI and
player stand on equal footing.
• Opponent Role — when the AI is responsible for (pretending to) block the player's progress. This
may also translate to enemies in multi-player games — as in the supporting role, on equal
footing and with the same abilities as the player.
Each type of AI has its own inherent potential issues — a supporting AI that is more of a hindrance
than an aid is just as bad as an opponent AI that is too predictable (often scripted) or simply impossible to beat
(clearly cheating or otherwise).
1Second Person: Behind Bastion’s Unique Narrative — http://www.1up.com/features/second-person-bastion-narrative —Posted by Agnello, A. on September 2011, Accessed on 27 November 2013
2The Stanley Parable Calls Shenanigans on Narrative-Driven Design — http://www.vg247.com/2013/10/25/the-stanley-parable-calls-shenanigans-on-narrative-driven-design — Posted by Brenna, H. on October 2013, Last Accessed on 27 November 2013
If you do want to talk about poor enemy AI in shooters you have to think about what makes a
fun game. Brilliant AI is no fun to play against, if they keep you suppressed, never miss, and
flank you without warning players feel the game is unfair.3
(Metro.co.uk reader DarKerR)
In the end, no matter how hard it may seem to please the player and implement acceptable AI
behavior, it is not impossible. When this feat is achieved, the results are very impressive, both with respect to
player experience and to game sales — e.g. the development of Elizabeth, the helpful NPC of BioShock
Infinite, is a recent success case when it comes to an AI in a support role4.
2.1.2 Industry vs Academy AI
The AI techniques in commercial games are often simplistic in comparison to the ones developed and
used in academic research or in other industrial applications[32]. This fact should not be understood as
AI in games being poorly done, as we could tell from the previously given examples. Even back in the 2000s,
when there was a big funneling of effort and resources into graphical fidelity, there was a set of well-established
techniques widely used by game developers — e.g. Fuzzy State Machines, the A* path-finding
algorithm, and Craig Reynolds' BOIDS flocking.
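To give a concrete flavor of one of these established techniques, Reynolds' flocking boils down to three local steering rules — cohesion, separation, and alignment. The following is our own minimal 2-D sketch, not code from any of the games or SDKs discussed here; the neighborhood radius and rule weights are arbitrary illustrative values.

```python
def boids_step(positions, velocities, r=2.0, w_coh=0.01, w_sep=0.05, w_ali=0.05):
    """One update of Reynolds' three flocking rules on 2-D points.
    positions/velocities are lists of [x, y]; returns new lists."""
    new_pos, new_vel = [], []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        # neighbors within radius r (excluding the boid itself)
        nbrs = [j for j, q in enumerate(positions)
                if j != i and (q[0]-p[0])**2 + (q[1]-p[1])**2 < r*r]
        vx, vy = v
        if nbrs:
            # cohesion: steer toward the neighbors' center of mass
            cx = sum(positions[j][0] for j in nbrs) / len(nbrs)
            cy = sum(positions[j][1] for j in nbrs) / len(nbrs)
            vx += w_coh * (cx - p[0]); vy += w_coh * (cy - p[1])
            # separation: steer away from nearby neighbors
            vx += w_sep * sum(p[0] - positions[j][0] for j in nbrs)
            vy += w_sep * sum(p[1] - positions[j][1] for j in nbrs)
            # alignment: match the neighbors' average velocity
            ax = sum(velocities[j][0] for j in nbrs) / len(nbrs)
            ay = sum(velocities[j][1] for j in nbrs) / len(nbrs)
            vx += w_ali * (ax - vx); vy += w_ali * (ay - vy)
        new_vel.append([vx, vy])
        new_pos.append([p[0] + vx, p[1] + vy])
    return new_pos, new_vel
```

Each boid reacts only to its local neighbors, yet a coherent flock emerges — the same decentralized principle that underlies Swarm Intelligence.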
On a side note, we find it relevant to point out that these techniques were (and still are) so useful in the
industry that some thought of creating Software Development Kits (SDKs) with generic
implementations of AI components, with the intention of lowering games' development times[32].
Not many people ended up using these SDKs because of their lack of flexibility — they were not usable
without a great deal of effort from the developer, often requiring specific solutions for each problem to
solve. Ultimately, these SDKs did not solve any problem.
Despite these early issues, nowadays the average household computer has much higher specifications.
This, together with the development and improvement of the Graphics Processing Unit (GPU),
made it possible to allocate a processing unit dedicated to graphics, freeing (some of) the other cores
for general processing. All these advancements make it possible to further develop and use more
demanding AI techniques inside our games, without hurting performance or the experience. Also,
the current graphical level is so high that a game is now expected to have some awesome or innovative
gameplay mechanic, and/or noticeably good AI.
3Gamers have never had it so good — http://metro.co.uk/2015/05/02/gamers-have-never-had-it-so-good-readers-feature-5177285/ — Posted on 02 May 2015 on Metro.co.uk by reader DarKerR, Last Accessed on 02 May 2015
4BioShock Infinite: The Revolutionary AI Behind Elizabeth — http://uk.ign.com/videos/2013/03/01/bioshock-infinite-the-revolutionary-ai-behind-elizabeth — IGN UK, Posted on 1 March 2013, Last Accessed on 27 November 2013
2.1.3 AI in Games — in Short
Even context dependent animation and audio use AI.
(Charles Weddle[15])
To sum up: in gaming, AI is used for anything from improving the aesthetic feel to competing
against the player in some way. Some (most) games even use multiple types of AI, blending them seamlessly.
However, our primary focus is on Strategy Games, especially Turn-Based Strategy Games, where the list of
relevant uses for AI shortens a bit.
2.2 Artificial Intelligence in Strategy Games
Since the dawn of gaming, a clear distinction has been made across genres. At the risk of over-simplifying
things a bit, we can say we have Action, Adventure, Strategy, Simulation, Puzzle, Platform, etc., and all
of these have multiple sub-genres. Our focus of study is the Strategy genre. Within this genre, it is possible
to identify two big sub-genres (again sub-divided into multiple others) — Turn-Based Strategy (TBS) and
Real-Time Strategy (RTS) games. According to Fairclough et al.[32], the main distinguishing point between
the two sub-genres is the time available for planning. RTS, as the name suggests, forces decision making
to happen in real-time, while TBS has a softer (sometimes nonexistent) time constraint. This means that it
is possible to allow a longer period of time for planning in a TBS than in an RTS game, which, in principle,
should mean that a decision in a TBS is the result of a more thorough and careful plan.
2.2.1 Strategy Games — Definition
The main characteristic of Strategy games, both TBS and RTS, is the ability to command. A player may
control one or multiple units through indirect control, only expressing his (or her) desire. The selected
unit will make its way, through the shortest known path, to the designated location and will perform the
action selected by the player at that location. Some games may have a tiled map, which simplifies
the pathing and, in some way, limits the unpredictability of the path-finding algorithms — this is more
common in TBS — while others have a more open field and require stronger algorithms — in opposition,
more common in RTS. There is a set of know-how skills that a player must have to be successful, which
is mostly common to the two sub-genres of Strategy games[22]. Those skills have a direct or
indirect connection to the actions a player can perform within the game, and they all fall back on the
player's ability to command his (her) hero or army. They are:
• Resource Management — refers to the knowledge required to decide which resources to search/produce
and how to spend them in buildings or units.
• Decision Making Under Uncertainty — refers to the knowledge required to perform actions
without absolute certainty of the outcome, e.g. when exploring in fog of war5.
• Spatial and Temporal Reasoning — refers to the knowledge required to understand the nature
of the environment and to perform actions where and when it is most favorable.
• Collaboration — refers to the knowledge required to play while supporting or being supported by
some other player (that may also be an AI).
• Opponent Modeling / Learning — refers to the ability to learn from one game to another, increasing the performance against the same opponent or tactic.
• Adversarial Planning — refers to the knowledge required to predict the future intentions and
actions of an opponent and then planning appropriate responses.
This set of required skills results in a strong need for parallel thinking, taking an enormous amount of detail into account. Each skill, even when considered separately, can easily be seen as a complex case-study. All of them together make it hard to develop an intelligent AI capable of exploring all these factors at the same time.
2.2.2 Strategy Games and Game AI
It is very easy to view RTS and TBS games as a simplified take on military simulations — multiple players (commanders) order their troops to gain access to resources scattered around the map, which in itself sets up the economy required to invest in more units and defeat the opponent. A natural step would be considering the use of these games as a simulated scenario to develop AI concepts and algorithms[22], since a mis-implementation within a game is less harmful than one in a real-world scenario. Taking this notion one step further, the very nature of the RTS sub-genre, and the constraints by which games of this sub-genre are bound, make them an ideal testing ground for real-time decision-making AI agents, systems and algorithms. Analogously, the TBS sub-genre and its relatively time-constraint-free environment make TBS games ideal for testing planning-centered AI agents, systems and algorithms.
The complex environment, associated with the multiple unit and terrain types, and complemented by
the dynamic involvement of all entities within a Strategy game, result in a large variety of AI research
opportunities — often more complex than in other genres.
As a side note, it's important to remember that some AI present in Strategy games is similar to that of other genres. Even in Strategy games, AI can be responsible for the story-telling (adapting to the player's actions), the tutorial (introducing the game at a pace the player can follow), or the occurrence of random events (to keep the player engaged), just to name a few. All of these can be more or less obvious, that is, more or less noticed by the player, and more or less easily identifiable as AI. Other things are even taken for granted, like the pathing (path-finding) algorithms for units in franchises like Age of Empires6 or Starcraft7 — these quality-of-life features are expected to be present in any game with indirect control over units or avatars, and that expectancy makes people forget their real value.
5Definition of "fog of war" — http://en.wikipedia.org/wiki/Fog_of_war — Last Accessed on 7 January 2014
However, the most studied AI field related to Strategy games is that of agents (or even systems) responsible for playing the role of an artificial player — opponent or ally. A central unit, a commander, takes it upon itself to plan the strategy required to defeat its adversaries, giving out orders to its underlings — all the units under its command. This means that an AI responsible for this kind of behavior must take into account all the know-how skills referred to at the end of the previous section, as they are the central point of playing games within this genre — and they are the same for a human player.
2.2.3 AI-Player in Strategy Games
Since very early in gaming, there has been the need for an artificially intelligent player — e.g. single-player strategy games required opponents, or an AI might fill in for a disconnected player (in an online multiplayer game).
In the Strategy genre, an AI-Player is responsible for making the same decisions a human player has to make — commanding the various units at its disposal to achieve its own end goals. Knowledge on how to design an AI-Player went through many stages, and many techniques were tested. In older games, an artificial player that cheated was a very common practice. This could mean, for example, instantly generating the resources or units required to counter another player's attack. While this technique is quite appealing efficiency-wise and presents decent results — that is, the challenge presented to the player is adequate — its flawed execution may cause a feeling of injustice in the player, for ruining the illusion of being challenged by an equally skilled opponent. This means the technique could lead to a negative experience and, therefore, should be avoided.
Developers took a step forward when they started to consider the strategic knowledge required from human players and began to introduce such notions in the AI-Player. However, the development of an AI with strategic knowledge, capable of coherent planning against a human player, is a complex process. This complexity is partially due to the fact that a good AI will not resort to the same strategy every game[32]. A human player would notice the repetition after a few games and would work to counter it, instead of learning from his own mistakes. In short, this would lead to an exploitation of that AI strategy by the player, and it would, ultimately, leave the player bored or uninterested in the game — exactly the opposite of the intended purpose. Such solutions would also fail to adapt to the different player decisions and would eventually fail to perform at the desired level.
6Microsoft Studios ©Age of Empires — http://www.ageofempires.com/ — Last Accessed on 2 January 2014
7Blizzard Entertainment ©Starcraft — http://us.blizzard.com/en-us/%20/games/sc/ — Last Accessed on 2 January 2014
In the AI research field, and for a long time, the ultimate goal has been to produce an AI capable of challenging a human player in the same way another human player would[10]. One of the most common research topics of AI for TBS is the development of AI-Players capable of defeating a Human Player (or another AI-Player) — notice that defeating and challenging are different concepts.
• Defeat — playing to win, leaving no room for mistakes.
• Challenge — playing on par with another, offering an even challenge (or the illusion of it).
The second, challenging, offers the player the opportunity to learn from mistakes and still come out ahead. It is both a learning tool and a companion through each game. The first is a pure adversary, a harsh, hard wall to climb.
The ability to defeat was the direction chosen when developing AIs capable of playing Chess — which, after all, is a TBS. Some AIs even ended up being able to beat human world champions. However, in the end, Potisartra et al.[10] concluded that when developing an AI-Player, we do not always want an AI capable of defeating the human player. For a game to succeed, both commercially and as entertainment, the AI cannot be too hard to beat and, in an optimal scenario, the AI should adapt its skill according to the skill of the person playing. The reason for this is quite simple as well — if an AI is too aggressive or simply too good at the game, the player will be frustrated for not being able to beat it, just as much as he will end up bored if the AI is too soft or too bad at it. In simpler terms, the perfect AI is one that provides an adequate learning environment to the player, as well as a challenging environment once the player has learned "enough".
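This adaptive-skill idea can be sketched as a simple feedback rule that nudges the AI's strength after each game toward an even match. The skill scale, step size, and target win rate below are hypothetical illustration values, not from any cited work:

```python
def adjust_ai_skill(ai_skill, player_won, step=0.05):
    """Nudge a hypothetical AI skill value in [0, 1] after each game.

    If the player won, the AI gets slightly stronger; if the player lost,
    slightly weaker, so the match tends toward an even challenge.
    """
    delta = step if player_won else -step
    return min(1.0, max(0.0, ai_skill + delta))

skill = 0.5
for player_won in [True, True, True, False]:  # player wins three, loses one
    skill = adjust_ai_skill(skill, player_won)
print(round(skill, 2))  # → 0.6
```

In a real game, the update signal could be richer than win/loss (e.g. margin of victory), but the feedback loop stays the same.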
2.2.4 AI in Strategy Games — in Short
Our reasoning up to this point raises different issues and different areas of interest. A perfect AI must adapt itself both to game changes (that is, changes in game state) and to the player's learning curve and current ability. We are also saying that the AI is partially responsible for the player's engagement (or entertainment) and learning, and that it directly changes the environment in which each game takes place — at least, it should feel like a different opponent each game. These areas of interest have been studied separately, but there isn't much common ground between them.
As it is often sufficient from the planning point of view, most of the studies in that area focus on single-agent planning, without considering adversaries who actively try to prevent the agent from achieving its goal or who have goals that may conflict with its own[16] — this is usually done on a play-by-play perspective, limiting the computation required and optimizing for the immediate result. Considering adversarial planning greatly increases the size of the set of possible states, which means greater complexity in the planning and decision algorithms. In the field of adversarial planning, one of the biggest developments was the minimax game tree8 applied to chess and checkers — producing AI systems able to challenge and beat human experts. However, these games have a relatively small branching factor, allowing some algorithms to look far ahead without great detriment to the algorithm's speed, producing winning or beneficial strategies much sooner than would be foreseeable for a human player. In Strategy games the branching of the game tree is far greater, as there are many more factors to take into account: the existence of fog-of-war; more complex movements and actions; multiple adversaries; the possibility of allies; a higher number of units (which can reach thousands); multiple unit types; and others. These are all factors that players have to take into account while playing, which means that an AI that did not take them into account would not serve its purpose well enough.
In their work, Sailer et al.[16] note that, to tackle these problems and improve results, it is common to divide the AI-Player into a set of goal-driven agents. This means that the complex AI would have several components, each responsible for the achievement of a sub-goal, in order to perform more efficiently — Divide-and-Conquer tactics, something that has been known to help throughout the whole Artificial Intelligence field. Resource gathering, scouting, and effective targeting are examples of sub-goals. In the end, this set of agents would need to combine its results to ultimately achieve the original AI-Player's goal. For example, the knowledge gained from scouting the map needs to be passed to the resource manager and to the army manager, in order to further plan their next actions.
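The decomposition described above can be sketched as a toy set of sub-goal managers whose results are combined each turn. The manager names, world representation, and decision rules are illustrative assumptions, not Sailer et al.'s system:

```python
class ScoutManager:
    def report(self, world):
        # Scouting sub-goal: return the enemy units sighted this turn.
        return [u for u in world["units"] if u["owner"] == "enemy"]

class ResourceManager:
    def plan(self, intel):
        # Economy sub-goal: avoid gathering at threatened sites (toy rule).
        threatened = {u["pos"] for u in intel}
        return [p for p in ("mine_a", "mine_b") if p not in threatened]

class ArmyManager:
    def plan(self, intel):
        # Military sub-goal: move toward the first sighted enemy, if any.
        return intel[0]["pos"] if intel else "patrol"

def ai_player_turn(world):
    # The combined result: scouting intel feeds both other managers.
    intel = ScoutManager().report(world)
    return {"gather": ResourceManager().plan(intel),
            "attack": ArmyManager().plan(intel)}

world = {"units": [{"owner": "enemy", "pos": "mine_b"}]}
print(ai_player_turn(world))  # → {'gather': ['mine_a'], 'attack': 'mine_b'}
```

The point is the information flow: each agent solves a smaller problem, and the AI-Player's decision is the composition of their outputs.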
2.3 Implementing an AI-Player
The development of an AI for Strategy Games isn't an easy task. One of the reasons why is their dependence on the underlying game world implementations — which present hard variations from game to game within the genres. This also means that the development of the AI is very dependent on having the game world up and running beforehand.
(Forbus et al.[28])
2.3.1 Centralized vs Decentralized Approach
There are generally two ways of implementing an AI-Player for a Strategy Game — using a centralized
or a decentralized approach.
We have mentioned the concept of a manager, and we have said units hold some intelligence (like the know-how of path-finding). In some way, this means we are dealing with a multi-agent system. Moreover, we are dealing with a centralized multi-agent system, as we have a controlling unit and multiple controlled units. In a centralized approach[30] — the most common in Strategy games — a central unit, usually god-like (unseen and all-seeing), holds most of the knowledge and conveys its intentions to the other units. The units receive said intentions (e.g. attack this, defend that, or move there) and execute the appropriate action to fulfill that intention. In this case, the lower units need to be less 'intelligent' than the controlling unit. One of the main characteristics of the centralized approach is the discrepancy between the levels of intelligence required from each AI. This is the most common approach for AI-Players in Strategy games, since it requires less effort from the developers — they may use the same unit intelligence for all AI-Players and all human players — and it is less complicated. This approach can be seen as a direct interpretation of player interaction, as the controlling unit can be seen as the 'player'.
8Minimax Decision Theory — http://en.wikipedia.org/wiki/Minimax — Last Accessed on 7 January 2014
In opposition, the decentralized approach[30] endows each unit with a more complex thought process, allowing units to perform local planning and to communicate, exchanging requests and intentions at unit level. This more complex thought process is always built on top of the know-how the units already had — such as the previously mentioned path-finding skills and the like.
A correct implementation of a decentralized approach allows quicker reactions to unpredictable localized events. The downside, though, is the ill-suited nature of the decentralized approach for actions that require high coordination — like strategy execution ("I go this way, you go that way"). This means units are able to request aid, or even request the execution of an action from another unit, but due to the nature of this kind of control (or this lack of full control), a unit is free to decide whether or not to help another unit. In order to solve this issue, it is necessary to pair each request with a certain priority value, to help each receiving unit plan its own action. Also, each unit may have a level of altruism/selfishness, making it more/less prone to help its companions.
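The request/priority/altruism mechanism could be reduced to a single decision rule per unit. The weighting below is a hypothetical model invented for illustration, not an algorithm from the cited work:

```python
def accepts_request(priority, own_task_value, altruism):
    """Decide whether a unit honors a companion's aid request.

    altruism in [0, 1] is the weight the unit gives to others' needs
    versus its own current task (hypothetical model).
    """
    return priority * altruism > own_task_value * (1.0 - altruism)

# A selfish unit (altruism 0.2) ignores a medium-priority call for help...
print(accepts_request(priority=5.0, own_task_value=3.0, altruism=0.2))  # → False
# ...while an altruistic unit (0.8) abandons its own task to assist.
print(accepts_request(priority=5.0, own_task_value=3.0, altruism=0.8))  # → True
```

Tuning the altruism parameter per unit is one way to get the heterogeneous, personality-like behavior discussed later in this chapter.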
There can be different levels of centralization/decentralization. There may be multiple agents acting towards the same ultimate goal (that is, winning) but still having different sub-goals to achieve — e.g. dividing the responsibilities of different types of planning among different entities, such as economic, military, or diplomatic. Each objective-driven AI can be centralized or decentralized within the same system.
The advantages/disadvantages of each approach need to be weighed before actually developing an AI-Player, and there is no right answer, as it should be completely intention-dependent — referring back to the intention of challenging vs. defeating the player. Despite the differences between these approaches, they are both still related to the implementation of an AI-Player. It makes sense that, in a game with multiple players (a mix of human and AI players), we would want them all to play with an approximately human-like level of intelligence, since this would create some balance and improve the player experience.
2.3.2 Human-Like Intelligence
There have been many attempts to bring AI closer to a more human-like intelligence. Qualitative Reasoning (QR) may come to aid in that regard. QR is an area that studies ways to transform quantitative information into a qualitative description. This means QR offers the possibility for an AI to understand quantitative information through qualitative notions, representing a stronger link between machine and human knowledge — for example, being able to logically attribute tags to numbers in the right context; e.g. 100 may be high in one context, but low in another, or even mean a small army in a third context. In their paper, Forbus et al.[28] suggest that QR may offer a valuable contribution to the video game industry, limiting the dependency on the internal world/environment implementation. They believe that by using QR systems it is possible to achieve better opponents, advisors and other NPCs. According to the authors, the use of QR may bring potential advantages such as: more human-like behavior; better communication of intent; better path-finding; and more reusable strategy libraries. We dare even say that the use of a QR approach may improve the extensibility of a developed AI.
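The context-dependent tagging that QR enables might look like the following sketch, where the same number maps to different qualitative labels in different contexts. The threshold tables are invented for illustration and are not Forbus et al.'s representation:

```python
def qualify(value, context_thresholds):
    """Map a raw number to a qualitative tag using context-specific
    thresholds, given as a sorted list of (upper_bound, tag) pairs."""
    for upper, tag in context_thresholds:
        if value <= upper:
            return tag
    # Above every bound: fall back to the highest tag.
    return context_thresholds[-1][1]

# The same number reads differently depending on context:
gold_ctx = [(50, "low"), (200, "high")]
army_ctx = [(500, "small army"), (5000, "large army")]
print(qualify(100, gold_ctx))  # → high
print(qualify(100, army_ctx))  # → small army
```

An AI reasoning over tags like "small army" rather than raw counts is one concrete sense in which QR decouples strategy logic from the game world's internal numbers.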
QR may help decision making, but it doesn't fully allow an AI to decide when and how to attack an enemy. The process behind these decisions often lies in either using influence maps or using predefined strategies (usually based on scripted behaviors). Using predefined strategies will not allow much adaptivity, however. The AI-Player may feel and act differently in the first few games, but, with repetition, its strategies will start to seem predictable, ruining the experience. One way to improve on this is to give some personality[28] to the AI-Player, modifying its behavior slightly — e.g. tweaking its levels of aggressiveness. Also, the more strategies there are to choose from, the more unpredictable the AI.
Influence maps[35], however, are a different matter. Influence maps are an abstract representation of the environment in a way an AI can understand. This technique allows the attribution of values to the environment that represent abstract concepts — for example, it is possible to attribute a quantification of how valuable it is to attack, defend, or move to a certain position, improving the subsequent planning process. In contrast to predefined strategies, the use of this technique allows the development of an adaptive AI, as it will react differently according to the opposition it faces. The use of influence maps also allows a better comparison between different possible actions, which is likewise desired.
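A minimal influence map over a tiled grid could look like this: each unit projects influence that decays with distance, and the per-cell sum of friendly minus enemy influence quantifies how favorable each position is. The linear decay with Manhattan distance is an illustrative choice, not the formulation of [35]:

```python
def influence_map(width, height, units):
    """Sum each unit's influence over a small tiled grid.

    units: list of (x, y, strength) tuples; positive strength for friendly
    units, negative for enemies. Influence decays with Manhattan distance.
    """
    grid = [[0.0] * width for _ in range(height)]
    for (ux, uy, strength) in units:
        for y in range(height):
            for x in range(width):
                dist = abs(x - ux) + abs(y - uy)
                grid[y][x] += strength / (1 + dist)
    return grid

# One friendly unit (strength +4) at x=0 and one enemy (-4) at x=2:
grid = influence_map(3, 1, [(0, 0, 4.0), (2, 0, -4.0)])
print([round(v, 2) for v in grid[0]])  # → [2.67, 0.0, -2.67]
```

Comparing cell values then directly supports decisions such as "attack here" (strongly negative cells mark enemy-dominated ground) or "retreat there" (positive cells are safe).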
Finally, the last topic in human-like behavior we would like to discuss is related to Believability. At the risk of over-simplifying the question at hand, it is possible to separate AI believability into two groups[39] — AI that correctly simulates a human player, and AI that correctly acts like the character it is playing. However, in Strategy games, the AI-Player controls not only a player, but also the individual units that a player normally controls. The concept of believability must be extended to support this case. Should an AI be considered believable if the AI-Player uses human-like strategies? Or should it be considered believable if its units perform logical (intelligent) actions after being given orders? Or maybe it should be a mix of both, to different extents? This brings us back to the discussion between centralized and decentralized multi-agent systems, and it is an answer we cannot provide due to its highly subjective nature.
It seems that AI has progressed to the point where it cannot be considered to be a binary
concept. Rather, in practice, the term refers to a spectrum of ideas ranging from a simple
system that can perform only basic tasks to a fully adaptive system that is able to solve highly
complex problems by using techniques that reflect the nature of human intelligence.
(Johnson et al.[33])
More than reflecting the nature of human intelligence, AI began to reflect a larger scope of Natural Intelligence. Although there were impressive achievements when trying to imitate human intelligence, studies have started to diverge towards other manifestations of natural intelligence. Some of those studies gave rise to new AI fields with 'nature-based' theories as their background, such as Swarm Intelligence.
2.4 Swarm Intelligence
It is a well-known fact that Man has learned a lot from studying natural systems. This is also true for Computer Science[38], as studies of natural systems inspired the development of such algorithmic models as artificial neural networks, evolutionary computation, swarm intelligence, artificial immune systems, and fuzzy state machines. These breakthroughs in computer science respectively model biological neural networks, evolution, the swarm behavior of social organisms, natural immune systems, and human thinking processes.
2.4.1 Emergence — Definition
From the studies on social organisms, it became evident that their ability to perform complex tasks had the interactions between the individuals of the swarm at its core. This means that the complexity was not innate within any of the individuals, but rather present when analyzing their behavior as a whole. The interaction in these biological swarm systems may be direct — through the natural senses of touch, smell, hearing or sight — or indirect — through changes in the environment. The ability to perform complex tasks as a result of individual independent labor is called emergence, and it is not easy to predict or deduce the complex resulting behavior from observing the simple behavior of the individuals. Engelbrecht et al.[38] define emergence as the process of deriving new and coherent structures, patterns and properties (or behaviors) in a complex system — structures, patterns and properties (or behaviors) that come to be without the presence of a central commanding unit delegating tasks to the individuals.
2.4.2 Swarm Intelligence — Definition
Swarm Intelligence (SI) is the terminology used to describe the problem-solving behavior that emerges from the interaction between agents within a swarm (or colony), in the same way that Computational Swarm Intelligence is the terminology used for the algorithmic representations that model that same emergent behavior.
SI studies ways to implement collective behavior resulting from decentralized and self-organized systems. Its main inspiration comes from social insects: even though they are small and have limited sensory and cognitive skills, these insects are able to form colonies and work together in order to persevere. This perseverance often implies the need to perform complex tasks as a group, such as food foraging, brood clustering, and the construction and maintenance of the nest. The reason why this is so impressive is that these insects are able to perform all these tasks without a central unit controlling or defining the objectives, or assigning each member of the colony to a specific task.
We can exemplify this concept with ants. Like other insects, ants have several built-in mechanisms that allow them to be so productive[25]. By defining a clear objective, like foraging for food or building a nest, the colony's units understand their purpose. They are committed to the greater good; even if they may seem to wander aimlessly on their own, they are continuously searching for ways to serve the colony. Ants live in an empowering culture, which means that each ant (or each colony unit) is allowed to try and experiment with as many possibilities as possible to reach a goal, without adverse consequences for failure — this empowerment may result in finding a better supply of food, for example. And, finally, ants possess an automatic communication system that they simply cannot turn off — any ant that follows is always benefited by the information gathered by previous ants. This allows them to efficiently search for an optimal or near-optimal solution for their goal.
2.4.3 Algorithms in Swarm Intelligence
Many are the SI-based algorithms, and many are their applications in various areas. These algorithms have proven very efficient in solving AI problems that range from optimization to clustering. We will focus on algorithms that have had some application in gaming.
The knowledge gained from studying ants has proven a great aid in the development of algorithms such as Ant Colony Optimization[11], and its usefulness falls under various categories — e.g. routing problems (path-finding for distribution), assignment problems (distributing tasks to workers, given some constraints), scheduling problems (allocation of resources over time), or subset problems (selecting items from a set that, together, form a solution).
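The ant-inspired routing behavior can be illustrated with Ant Colony Optimization's probabilistic edge-selection rule, where an ant prefers edges with more pheromone and shorter length. This is a sketch of the transition rule only, with made-up edge data, not a full ACO implementation:

```python
import random

def choose_edge(edges, alpha=1.0, beta=2.0):
    """Pick the next edge with probability proportional to
    pheromone^alpha * (1/length)^beta, as in ACO's transition rule.

    edges: list of (name, pheromone, length) tuples.
    """
    weights = [(p ** alpha) * ((1.0 / l) ** beta) for _, p, l in edges]
    r = random.random() * sum(weights)
    for (name, _, _), w in zip(edges, weights):
        r -= w
        if r <= 0:
            return name
    return edges[-1][0]  # guard against floating-point leftovers

random.seed(0)
# Equal pheromone, different lengths: the shorter edge wins most picks.
edges = [("short", 1.0, 1.0), ("long", 1.0, 4.0)]
picks = [choose_edge(edges) for _ in range(1000)]
print(picks.count("short") > picks.count("long"))  # → True
```

In the full algorithm, ants that complete good tours then deposit pheromone on their edges, which is the indirect communication (stigmergy) described above.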
Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a stochastic optimization algorithm based on the somewhat unpredictable flying patterns of bird flocks[12]. PSO is a search algorithm that uses multiple individuals, or particles, grouped in a swarm. Each of these particles represents a candidate solution to the optimization problem. In a PSO system, each particle adjusts its position according to its own experience and that of neighboring particles, trying to position itself in an optimal solution state. In the end, this means that each particle will continually try to reach an optimal solution while searching over a wide area, and the overall flock will converge to that same optimal solution.
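The position-adjustment idea can be sketched with the standard PSO velocity update, which pulls each particle toward its own best position and the swarm's best. The inertia and attraction coefficients below are common textbook values, not tuned ones:

```python
import random

def pso_minimize(f, dim=1, n_particles=10, iters=200, seed=1):
    """Minimal PSO sketch minimizing f over lists of length dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]           # each particle's own best position
    gbest = min(pbest, key=f)[:]          # swarm-wide best position
    w, c1, c2 = 0.7, 1.5, 1.5             # inertia, cognitive and social weights
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            if f(pos[i]) < f(pbest[i]):
                pbest[i] = pos[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

# The flock converges on the minimum of (x - 3)^2:
best = pso_minimize(lambda x: (x[0] - 3.0) ** 2)
print(round(best[0], 2))
```

Each particle's trajectory blends its own experience (`pbest`) with the neighborhood's (`gbest`), which is exactly the social/cognitive balance described above.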
Routing-Wasp
By studying Polistes dominulus wasps, a dynamic task allocation model was created that successfully emulated the self-organized behavior of wasps[36]. The model divided the wasps in a hive into two different types, according to their respective tasks: either foraging or brood caring. The task assignment — or better yet, the decision to perform a task — is made by each individual for itself, based on its response threshold and the stimulus emitted by the brood. Stimuli are emitted by the tasks and affect the individuals' task selection decisions. Response thresholds represent an individual's will to perform certain tasks. Force is used in dominance contests, which allow the formation of a certain hierarchy within the colony. And, finally, Specialization refers to the aptitude of an individual to perform a certain task: the more an individual performs said task, the lower its response threshold will be, while its thresholds for the other tasks will increase. This means the more an individual performs a task, the more likely it is to perform that same task again in the future. Routing-Wasp, a derived algorithm, was developed[34] by applying the previous concepts to self-configurable factories — it was this algorithm that General Motors took advantage of in their assembly factories. Santos et al.[8] also derived from these principles and applied them in gaming. The algorithm, which they called WAIST, was applied to a Real-Time Strategy game and made responsible for choosing which 'factory' would spawn a requested unit, taking some of the micro-management away from the player.
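The response-threshold and specialization mechanics can be sketched as follows. The engagement rule P = s² / (s² + θ²) is the classic form used in insect task-allocation models, while the threshold update step and bounds are illustrative assumptions:

```python
def engage_probability(stimulus, threshold):
    """Response-threshold rule: P(engage) = s^2 / (s^2 + theta^2)."""
    return stimulus ** 2 / (stimulus ** 2 + threshold ** 2)

def update_thresholds(thresholds, performed_task, delta=0.1, floor=0.1, cap=10.0):
    """Specialization: performing a task lowers its own threshold (making a
    repeat more likely) and raises the thresholds of the other tasks.
    delta, floor and cap are illustrative values."""
    return {task: max(floor, t - delta) if task == performed_task
                  else min(cap, t + delta)
            for task, t in thresholds.items()}

thresholds = {"foraging": 1.0, "brood care": 1.0}
thresholds = update_thresholds(thresholds, "foraging")
print(thresholds)  # → {'foraging': 0.9, 'brood care': 1.1}
# At equal stimulus, foraging is now the more probable choice:
print(round(engage_probability(1.0, thresholds["foraging"]), 2))  # → 0.55
```

In a WAIST-like setting, each 'factory' would hold such thresholds per unit type, and the stimulus would come from pending production requests.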
2.4.4 Issues of a Swarm Intelligence Approach
Being a somewhat recent field of study, the use of these algorithms has mostly been confined to solving AI "benchmark" problems and comparing their results to those of previously accepted algorithms — this has enabled them to increase in popularity and gain interest from researchers. However, it also means that their use outside of that scope has been relatively limited. Specifically, in gaming, SI was used to develop AIs capable of learning, mostly in traditional games — such as Checkers[26], and other much older games like Go[21] (a 3000-year-old Chinese game) and Seega[20] (a very old Chess-like Egyptian game).
Despite the advantages of SI-based approaches, they are not completely without issues[40]. For starters, there is no definitive way of programming a swarm to specifically perform a certain task. The asynchronous nature of the swarm units' decision making increases the difficulty of an already hard problem. A possible solution for this problem would be to explore the behaviors of a near-infinite amount of different swarms, or to search that same space of possible swarms for an optimal one by means of some cost function — this last option would only be viable if a cost function could be defined, among other requirements.
Secondly, there is a good number of questions that require answering when dealing with these systems. How complex should each agent be? Should all agents be identical? Should they be able to learn or make logical inferences? How and what do they communicate? What should they know about the environment? And so on, and so forth. These questions may have multiple answers, depending both on who is answering and on the purpose of the system being built. A possible and reasonable approach is to start with low-complexity agents and progressively increase their complexity as needed. Although it doesn't necessarily answer all the questions raised, this approach is sufficiently systematic to provide good-enough results.
Lastly, SI systems are not absolutely reliable[40], as it isn't trivial to predict their behavior when faced with an unexpected event. There is also the issue of defining an adequate benchmark, suitable for SI testing. SI systems' performance shines when acting in dynamic environments, and thus on dynamically changing problems. Creating a benchmark for such an adaptive system implies that we would know what to expect from a generic adaptive system. How could we evaluate the performance of such a system (what would be the metrics)? All in all, there are multiple ways of being dynamic, but it could be possible to select various systems with similar properties in terms of how difficult it would be to solve their corresponding dynamic problem.
2.5 Artificial Immune System — A Defense Mechanism
The biological immune system is a good metaphor for anomaly detection systems in general. In 2002, Matzinger offered her views on what she called the Danger Theory (DT)[29], something that has become increasingly popular. The DT states that the biological immune response is triggered by a sense of danger, and not by the sensing of foreign (or "non-self") entities. There are still arguments concerning the validity of the theory from a biological standpoint; however, this knowledge suffices to develop Artificial Immune Systems. Building a bridge to the Strategy game scope, the danger triggering the response would be equivalent to an invasion by one or many of another player's units. DT has been used to develop AI algorithms for Spam Detection[9] as well as Intrusion Detection within a network[18]. The scope of these algorithms seems larger than the one present in strategy games; however, we feel it is natural for an AI-Player to react to fear and to sense danger, as there is some correlation to the human reaction.
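A danger-triggered response in the Strategy game scope could be sketched as follows: the trigger is the damage being suffered (the 'danger signal'), not the mere presence of foreign units. The event format and threshold are invented for illustration, not an algorithm from [29]:

```python
def immune_response(events, danger_threshold=5.0):
    """Danger-Theory-flavored trigger: mobilize only when accumulated
    damage (the danger signal) exceeds a threshold (toy illustration)."""
    danger = sum(e["damage"] for e in events)
    return "mobilize defenses" if danger > danger_threshold else "stand down"

# A foreign scout that does no damage raises no response...
print(immune_response([{"actor": "foreign", "damage": 0.0}]))  # → stand down
# ...but real damage to our territory triggers one.
print(immune_response([{"actor": "foreign", "damage": 8.0}]))  # → mobilize defenses
```

This mirrors the contrast with classic "non-self" detection, which would have reacted to the harmless scout as well.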
The reference to the immune system may seem a bit out of place, but if we consider the cells as a whole, even though they are devoid of real intelligence (being purely reactive), their reactions result in a complex "behavior" that benefits a greater being — and, consequently, all of those cells. The immune system is one of the most complex defense mechanisms, responsible for stopping attacks from both the outside and the inside — making it a worthwhile study when dealing with the military aspects of a swarm.
2.6 Discussion
We have covered several different aspects related to AI-Player development. The general conclusion we can draw is that an AI-Player is the composition of multiple techniques, ideas, and choices. With the increase in visibility that games have been getting, alongside their acceptance by larger groups, AI must take a step forward and improve the notion of intelligence present in games[27]. Hardware-wise, computers (and, generally speaking, gaming systems) are now more powerful than they were years ago, which allows the use of more processing power and enables algorithms to perform faster and/or at a higher level — e.g. a search algorithm can now go deeper in a search tree.
Still, we are faced with the same old issues, such as:
• How do we develop an adaptive system?
• What do we mean by believability?
• How intelligent do we want our AI-Player to be?
• How do we achieve consistent intelligent behavior?
• Should we use a centralized or decentralized approach?
All these questions remain unanswered. Rather, each implementation is the result of assumptions and more or less personal views on these subjects, and what is valid for one game, one context, may not be valid in another — more often than not, it really won't be. This need for specific solutions in each case and context makes it difficult, if not impossible, to use a generic approach adaptable to multiple contexts and games.
For our purpose, we intend to test the viability of applying Swarm Intelligence principles to an AI-
Player, hoping to improve the results obtained when compared to a conventional AI-Player. By nature,
SI systems are adaptive and, when correctly implemented, react in a believable (at worst, understandable)
manner. The resulting behavior may be considered intelligent at unit level, as opposed to commander
level, defining a decentralized approach. We believe these choices may reflect positively on our results.
Chapter 3
Solution
The most common uses for SI in games are learning AIs, organizing scheduled tasks, or path-finding.
However, using swarm-like behaviors to control the actual units in a Strategy game — similarly to a
decentralized AI — has not been documented. In the previous sections we presented the current state of
the art of AI in the gaming industry, as well as the most relevant developments in academic AI research.
Based on these, our work aims to combine the two — industry and academia — in order to
produce a better-performing AI for a TBS game.
More specifically, we aim to use the knowledge of Swarm Intelligence to improve the quality of the
decision making process of an AI. For this objective, we need a Strategy game with an interesting amount
of decision options, in order to validate the adequate behavior of our new AI — we chose Almansur.
3.1 Algorithm Design
Our algorithm’s primary goal is to improve the communication between the units of a swarm, in order
to reach a consensus — that is, a set of intentions, or intents, per unit that every unit agrees on. A unit
can thus be seen as a particle of the swarm, and in order to make an intent final, that intent needs to
somehow be approved by all other units.
In order to accomplish this, our algorithm was divided in two stages:
• Selfish phase — in which every unit, analyzing the surrounding environment, decides on the best
course of action for itself;
• Negotiation phase — in which every unit communicates its intention to every other unit for con-
sideration and evaluation. This allows each unit to reconsider its intention, and instead follow
another unit’s.
Figure 3.1 shows these two phases as idealized.
Figure 3.1: Conceptual design of the algorithm per swarm unit
3.1.1 Intent — Definition
An intent, or intention, is a structure that represents each unit’s idea of what its ideal action would
be. An action is an interaction with the environment that will be the cause of some desired outcome.
In terms of requirements, with regard to our algorithm, the intent must have a heuristic value and
the originating unit’s identification. These are sufficient conditions for our algorithm to produce some
results — even if not optimal in most cases.
However, both these parameters are completely dependent on the context implementation, espe-
cially the heuristic value.
Another relevant variable that an intent can have is a type — however, this is not a hard requirement.
An intent type allows considering multiple different intents over the same target — e.g. looking out of,
opening, or closing a window. Having a type is completely context dependent and, as previously stated,
it is not mandatory.
Considering studies such as the Danger Theory, referenced in Related Work, it became clear that
there could be a need to raise the level of importance of a certain intention — in practical terms,
when one unit detects a threat or a very valuable target. For this reason, we considered and included
the help flag. This feature allows any unit to artificially increase the value of its target, in order to call
the attention of others to it. But once again, the algorithm does not make the existence of this flag mandatory.
On a side note, all these optional parameters were included in our final implementation of the solution.
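As an illustration, an intent with the required and optional parameters described above could be sketched as follows (the field names are our own choices for this sketch, not the identifiers used in our actual implementation):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Intent:
    """A unit's candidate action. Only `heuristic` and `unit` are
    required by the algorithm; the rest is context-dependent."""
    heuristic: float              # required: value used to compare intents
    unit: str                     # required: identification of the originating unit
    type: Optional[str] = None    # optional: e.g. "attack" or "conquer"
    target: Optional[Any] = None  # optional: what the action is aimed at
    help: bool = False            # optional: signals a call for assistance
```

Any context that supplies a heuristic value and a unit identifier can run the algorithm; the optional fields only refine the decisions.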
3.1.2 Selfish Phase
Coming back to the algorithm itself, the first part is straightforward — each unit considers each
available target in the environment, and marks the intent with the highest heuristic value found as its
selfish intention.
As with the intent types and heuristic values, the perceptions of the world (referenced as available
targets above) are entirely context dependent.
In Algorithm 1 we present a simplified version of the first step of our algorithm — the selfish plan-
ning1. This pseudo-code references intent types, as we believe there are more cases that require them
than not.
Algorithm 1 Simplified take on the selfish cycle step
1: input: perceptions
2: output: selfishIntents ← map[unit, intent]
3:
4: selfishIntents ← new map[unit, intent]
5:
6: for all unit in swarm do
7:   selfishIntent ← null
8:
9:   for all target in perceptions.availableTargets do
10:    for all type in context.intentTypes do
11:      intent ← new intent(type, target, unit)
12:      if selfishIntent == null or intent.heuristic > selfishIntent.heuristic then
13:        selfishIntent ← intent
14:      end if
15:    end for
16:  end for
17:
18:  selfishIntents[unit] ← selfishIntent
19: end for
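The pseudo-code above can also be rendered as a short executable sketch (the function and parameter names here are illustrative, not taken from our implementation):

```python
def selfish_phase(swarm, available_targets, intent_types, heuristic):
    """Selfish phase: each unit independently picks the intent with the
    highest heuristic value among every (target, type) combination.

    heuristic(unit, target, intent_type) -> float is context-dependent.
    Returns a map unit -> (value, intent_type, target), the selfish intents.
    """
    selfish_intents = {}
    for unit in swarm:
        best = None  # reset per unit, so intents never leak between units
        for target in available_targets:
            for intent_type in intent_types:
                value = heuristic(unit, target, intent_type)
                if best is None or value > best[0]:
                    best = (value, intent_type, target)
        selfish_intents[unit] = best
    return selfish_intents

# Toy usage with a trivial heuristic that only looks at the target:
value_of = {"farm": 1.0, "castle": 5.0}
h = lambda unit, target, intent_type: value_of[target]
intents = selfish_phase(["u1", "u2"], ["farm", "castle"], ["conquer"], h)
```

Note that, without further constraints, both units above select the same target, an issue we come back to in the integration notes.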
Selfish Phase — Analysis
Considering we have three nested loops, the complexity of the algorithm can be calculated based on the
number of times each of those loops runs. In the worst case scenario:
1This and the examples that follow are a simple representation of the actual process — split up for explanation purposes. The
complete algorithm is present in the appendix, chapter Complete Algorithm, and is nothing more than these examples combined.
• Main Loop — Unit Cycle — runs exactly U (← swarm.length) times — or N in the usual nomen-
clature;
• First Inner Loop — Target Cycle — runs a number of times equal to the number of targets for each
unit — which is a constant, so t times;
• Second Inner Loop — Intent Cycle — runs a number of times equal to the number of intent types
— which is constant, so i times.
This allows us to conclude that this phase of the algorithm will have a number of cycles equal to
a constant (c = i × t) — which depends largely on the context — times the number of units (N) — making
it of complexity O(N).
3.1.3 Negotiation Phase
Immediately after the first phase concludes, the second one begins. In this negotiation phase, each unit
divulges its own intent to the rest of the swarm — consequently, each unit also receives every
other unit’s intention. This allows for a reconsideration step.
Reconsideration — Definition
By reconsideration we mean that a unit rejects its own selfish intent, and instead decides to follow
another unit’s intention.
A reconsideration is only valid if, by the end of the cycle, every unit is pointing to either:
• its own selfish intention;
• another unit’s intention, where that intention’s originating unit still intends to follow it.
A unit is not required to keep its own intent just because other units have reconsidered toward it —
which can leave those units in an invalid state. Our algorithm is also responsible for recovering from an
invalid state. This is done through a rollback, which we will explain later.
Negotiation Phase — Division
As made clear by the reconsideration definition, the negotiation phase can be further divided into two
phases — reconsideration, and validation / rollback.
These two phases are run in succession, and together they allow the swarm to reach a consensus.
A consensus is reached when there are no units reconsidering and the end state is valid for all units. While
any unit is still reconsidering, the cycle continues. In Algorithm 2 we can see the algorithm for the
negotiation phase.
Algorithm 2 Negotiation algorithm
1: input: selfishIntents ← map[unit, intent]
2: output: consensusIntents ← map[unit, intent]
3:
4: initStepIntents ← selfishIntents
5: finalStepIntents ← new map[unit, intent]
6: ▷ Reconsideration
7: while true do
8:   hasReconsidered ← false
9:
10:  for all unit in swarm do
11:    myIntent ← initStepIntents[unit]
12:
13:    for all intent in initStepIntents do
14:      isValid ← finalStepIntents[intent.unit] == null or finalStepIntents[intent.unit] == intent
15:
16:      if (intent.unit == unit) or (not isValid) then
17:        continue
18:      end if
19:
20:      myHeuristic ← myIntent.heuristicFor(unit)
21:      newHeuristic ← intent.heuristicFor(unit)
22:
23:      if newHeuristic > myHeuristic then
24:        myIntent ← intent
25:        hasReconsidered ← true
26:      end if
27:    end for
28:
29:    finalStepIntents[unit] ← myIntent
30:  end for
31:
32:  if not hasReconsidered then
33:    break
34:  end if
35:  ▷ Validation / Rollback
36:  for all unit in swarm do
37:    intent ← finalStepIntents[unit]
38:    parent ← intent.parent
39:
40:    isValid ← (parent == unit) or (finalStepIntents[parent] == intent)
41:
42:    if not isValid then
43:      previousIntent ← initStepIntents[unit]
44:      previousParent ← previousIntent.parent
45:
46:      isPreviousValid ← (previousParent == unit) or (finalStepIntents[previousParent] == previousIntent)
47:      if isPreviousValid then
48:        finalStepIntents[unit] ← initStepIntents[unit]
49:      else
50:        finalStepIntents[unit] ← selfishIntents[unit]
51:      end if
52:    end if
53:  end for
54:
55:  initStepIntents ← finalStepIntents
56: end while
57:
58: consensusIntents ← finalStepIntents
Even though there is a while(true) in the algorithm, it is there only to ensure the algorithm
completes exactly when a consensus is reached — every cycle of reconsideration is specifically designed
to work towards that goal.
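To make the reconsideration step concrete, a single pass of it can be sketched as follows (the validation / rollback step is omitted, and all names are illustrative):

```python
def reconsider_once(swarm, intents, heuristic_for):
    """One reconsideration pass of the negotiation phase: each unit
    adopts another unit's intent whenever that intent scores higher
    for it than the one it currently holds.

    intents: map unit -> intent; heuristic_for(intent, unit) -> float.
    Returns (new intent map, whether any unit reconsidered).
    """
    new_intents = {}
    reconsidered = False
    for unit in swarm:
        my_intent = intents[unit]
        for other_unit, intent in intents.items():
            if other_unit == unit:
                continue  # a unit never reconsiders against itself
            if heuristic_for(intent, unit) > heuristic_for(my_intent, unit):
                my_intent = intent
                reconsidered = True
        new_intents[unit] = my_intent
    return new_intents, reconsidered

# Toy usage: each intent is (origin unit, value); u1 adopts u2's stronger intent.
intents = {"u1": ("u1", 1.0), "u2": ("u2", 4.0)}
hf = lambda intent, unit: intent[1]
new, changed = reconsider_once(["u1", "u2"], intents, hf)
```

Running such passes until `changed` is false, and validating between passes, is what drives the swarm to consensus.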
Negotiation Phase — Analysis
Noting that the number of units directly reflects on the number of intentions, in the worst case scenario,
these algorithms will run:
• Reconsideration — number of units in the swarm times the number of intentions — O(U × I) or
O(N²). Every unit listens to every other unit in order to understand whether it can or should do something
about their intention;
• Validation / Rollback — once per unit in the swarm — O(U) or O(N). Having a valid state is a necessary
condition before proceeding — any unit found with an invalid intent falls
back to its original intent.
3.1.4 Algorithm Analysis
Algorithm analysis is usually done in terms of time complexity or resource allocation amount. However,
this algorithm was developed directly on top of the previous game’s AI, which made it especially hard
to profile — forcing us to step away from the conventional time complexity analysis for now. Resource
allocation was also an analysis without much potential, as most of the data is related to the game itself
and the algorithm doesn’t really look into it all that much.
In the end, we performed a simpler worst-case scenario analysis based on the number of iterations
of the algorithm itself; our conclusions are summed up in Table 3.1.
Phase                              Complexity
1    Selfish                       O(N)
2    Negotiation
2.1    Reconsideration             O(N × N)
2.2    Validation / Rollback       O(N)
Table 3.1: Algorithm complexity for each of the phases
In order to consider the complexity of the algorithm as a whole, we can treat it as a sequence of
the three steps, so its complexity is the sum of the three — O(N) + O(N²) + O(N), or O(2N + N²).
Simplifying, we can state that the complexity of the algorithm is, in fact, simply O(N²).
3.2 TBS Game Environment — Almansur
Almansur is a browser-based Massively Multi-player Online Turn-Based Strategy game with a strong focus on
its military component — this includes troop movement and recruitment, as well as diplomacy between
multiple commanders. As in most other Strategy games, another key aspect of the game is building and up-
grading facilities. These are required for resource generation, unit recruitment, and faster movement around
the map, and they even increase the defensive strength of a player’s land.
The game is played in simultaneous turns, which means every player issues orders and passes the
turn before any events actually occur. The number of turns in each game varies, but the game may
end before the last turn if certain conditions are met — such as a player or alliance reaching a threshold
population or holding a certain percentage of the land. This means that a player may win as a solo
player or as part of an alliance in a joint victory. Player skill is ranked with an ELO-like system2.
There are multiple game types, in various settings, but the overall behavior and victory conditions
are the same. Games can be distinguished by two characteristics — whether new players can join mid-game
(Static vs. Dynamic Game), and whether the setting is real (Historical vs. Fantasy Game):
• Static Game — a set number of players must queue in before the game can begin. Each player
chooses a faction, controlling a predefined piece of land, and works his way up from there;
• Dynamic Game — a set number of players is required before the game can start, but more players
may join as the game progresses. The map, for this type of game, is generated using AI algorithms;
• Historical Game — historical games are always static in Almansur. These games have predefined
conditions to reflect places and races or cultures from historical events;
• Fantasy Game — fantasy games can be static or dynamic, and allow each player to choose from
fantasy races, each with its own special characteristics.
Figure 3.2: Almansur — Map example of an historical game
2ELO rating system — http://en.wikipedia.org/wiki/Elo_rating_system — Last Accessed on 7 January 2014
Relevant to military decisions is the environment. The world map in Almansur is constructed with
hexagonal tiles, each with a specific type — based on forest density, swampiness, and mountain-like-
ness — allowing the definition of plains, swamps, forests, mountains, etc. Different units and races have
different marching capabilities in each terrain type.
The commands given to each army fall under many different categories — unit creation, joining and
splitting of multiple armies, army movement, the order to execute after the movement (Battle, Rest,
Train, or Conquest), and the speed at which the movement should be done (Slow, Cautious, Normal,
Forced). This set of actions makes the state-space quite complex, with a high probability of error and/or
sub-optimal behavior. Apart from the decisions directly made by the AI, there is still a subset of environ-
ment factors to take into account — such as the type of terrain in which battles take place, the type of terrain
the army will cross, as well as the experience and morale of the army. These factors are, more often
than not, the defining reason for the success or failure of a plan — with consequences that range from
reaching a land early and ambushing the opponent, to reaching it late and being destroyed.
Each army is divided into units — in other words, unit types (archer, cavalry, militia, etc.). Depending
on the race played, different units are available within each type, with some degree of attribute variation
between them. The composition of each army is important for the battle phase. A battle occurs every
time two armies are in the same tile and at least one of them issues a battle command. A battle is a
sequence with the following steps: Ranged ; Charge; Shock ; Melee; and Pursuit . Each unit will be
more relevant in one context than another — e.g. archers are more important for the ranged phase than
the melee phase.
All these decisions make the game quite complex, even when considering the military aspect alone.
For this reason, this game is interesting enough for us to apply our ideas to it. We have also been
granted access to the code for the currently implemented AI, allowing us to work our way up from there.
3.2.1 AI in Almansur
Almansur is implemented as a Multi-Agent system — even the NPC representing the population is an
agent. The AI-Player implemented[5] is also considered a multi-agent system, with three agents
— MilitaryAgent ; EconomicAgent ; and StrategicAgent . The objective goals were, for this reason, divided
into three aspects, following a divide-and-conquer approach, creating a simplified goal for each of the three
agents.
To interact with the game, an AIController was implemented, responsible for issuing the commands
resulting from the planning of the three agents stated above. This module has to interact with the game in
the very same way a human player would, thus ensuring no cheating can occur.
Figure 3.3: Almansur — Current AI implementation
Focusing on the military agent, it is in turn divided into three different modules — command ; facilities;
and recruitment . These are responsible, respectively, for army movement, military facility upgrading,
and unit recruitment. Our objective lies in the command component, which in this architecture was
implemented using scripts — that is, specific reactions to specific events. This approach, though
simple, has proven useful for its intended purpose — improving the experience by replacing play-
ers that abandon games. However, it still somewhat favors players whose lands are adjacent to those
of the AI, since they can easily exploit this fact, overwhelm the AI, and greatly increase their land value
and resource income.
3.3 Solution Implementation
Our solution was built on top of the current AI in Almansur, developed in the work of Barata et al.[5]. The
general AI architecture is briefly explained in the previous section; we modified their military agent
alone — more concretely, the command component of the military agent. Alterations to other modules
were considered an upgrade (or simply a modification) to the old AI, and those modules remain equivalent in both
the old and new AIs.
The main objective was to shift the responsibility of the thought process (that is, the planning phase) to a
lower level — to each unit. This change allows each unit to make its optimal local decision, affecting
decisions aimed at battle events, the order of territory conquering, and even the movement speed when
performing each action.
To our knowledge, there are no SI algorithms based on the military aspects of social organisms —
and even less documentation exists on their potential application to games. This proved to be the first
great challenge of the development cycle of this work — the design of an algorithm with embedded SI
knowledge that would be adequate for solving the problem at hand.
The remainder of this section reflects the decisions behind our algorithm’s definition and implementa-
tion, as well as a short analysis of its efficiency, finishing with its integration into our testing environment
— Almansur.
3.3.1 Algorithm Additional Notes
Order matters
The order of reconsideration is the same order in which the selfish intents are decided. The first unit (unit-1)
to pick will always pick the most valuable intent available to it. The only reason this could
not happen is if a nearby unit (unit-2) has an intent of close (or equal) value, and requires assistance —
raising its awareness and, consequently, its value. This situation would mean that unit-2 would have
the most valuable intent in sight of unit-1. These are the possible outcomes of this situation:
• Unit-3 (in sight of unit-2) has an intent of close or equal value and requires assistance — in this
case, unit-2 would follow unit-3’s intent and leave unit-1 in an invalid state. A rollback would ensue
and unit-1 would revert to its selfish intent.
• Unit-2 keeps its intent — in this case, unit-1 would remain valid, and so would unit-2. Any other
units would follow intents different from unit-1’s or unit-2’s.
Even though this is not an optimal solution — optimal being unit-2 following unit-1, if need be — this
choice reflects the nature of uncertainty in choices. People, much like animals, often do what is right for
the collective rather than for themselves. If unit-1 follows its own selfish intent and unit-2 does the same
(that is, follows its own selfish intent), then unit-2 is likely to be slaughtered, resulting in a loss for the
colony. This mechanism makes it possible for stronger units to protect the weaker ones.
3.3.2 Algorithm Implementation and Heuristic Development
The integration of our algorithm with the context of our problem followed the initial concept of the algo-
rithm itself. A couple of details were immediately evident:
• Type definition — every possible action in the game needed to be translated into an understand-
able type;
• Heuristic function — every possible action needed to have a quantifiable value for comparison.
Everything a player could do, every action he could perform over his army, needed to be translated
into AI logic. Part of this effort had already been done in the AI in place, but we felt it deserved an
overhaul. For this reason, we dropped the existing Actions for our new concept, Intents, complete with a
new Intent Manager — since our environment, Almansur, is a TBS, we saw no reason to create a
complicated structure for each unit, and settled for a structure that would hold all the intents while they
waited for further processing. In the end, we ended up with only four types of actions — defending,
attacking, conquering, and no action. These were intended to provide enough diversity for our needs.
Regarding the heuristic, we ended up implementing two — one for value, and one for danger. These
heuristic functions were common to all units, and represented the way they perceived the environment
— they are represented in the algorithms as perceptions. In order to perform the calculations only once for
every unit, we chose to implement these via influence maps. Each position, a territory unit in the map,
had an associated danger level — taking into consideration the number and strength of the ene-
mies on and surrounding it — and a value — taking into consideration the amount of resources available,
per type, in it.
We can say that danger was more volatile than terrain value. Enemy units tend to move, instead of
idling in certain territories, which in turn makes their threat somewhat dynamic. Value, however, is
static: resources are gathered in one territory, and they don’t necessarily enrich the neighboring
territories.
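A minimal sketch of such influence maps follows; the decay factor and the flat per-tile resource value are our simplifications for illustration, not Almansur's actual formulas:

```python
def influence_maps(tiles, neighbors, enemy_strength, resources, decay=0.5):
    """Build the two per-territory perceptions described above:
    a danger map (enemy strength on a tile plus a decayed contribution
    from adjacent tiles) and a value map (resources on the tile).

    neighbors: map tile -> list of adjacent tiles (hexagonal in Almansur).
    """
    danger, value = {}, {}
    for tile in tiles:
        own = enemy_strength.get(tile, 0.0)
        spill = sum(enemy_strength.get(n, 0.0) for n in neighbors.get(tile, []))
        danger[tile] = own + decay * spill
        value[tile] = resources.get(tile, 0.0)
    return danger, value

# Toy usage: two adjacent tiles, enemies on "a", resources on "b".
tiles = ["a", "b"]
neighbors = {"a": ["b"], "b": ["a"]}
danger, value = influence_maps(tiles, neighbors, {"a": 10.0}, {"b": 3.0})
```

Computing both maps once per turn, rather than once per unit, is what keeps the perception cost independent of the swarm size.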
Finally, we needed a way to improve communication and signal emergencies — a call for help
or assistance. For this purpose, each intent carries a help package, which contains the amount of fear
the source unit has when performing that intent — the more fearful a unit is, the more help it needs. An
example is when a unit notices an enemy army that is far stronger than itself.
3.3.3 Integration Additional Notes
One unit, one target — two units, two targets
At first, we allowed every unit to freely choose whatever target it wanted for a selfish intent. The scope
of the play — the number of units versus the amount of space — led to the conclusion that, most of the time,
units would choose the same target. This is an issue for two reasons. First, the swarm always converged —
too fast, too inefficiently. Second, communication was pointless, since most of the time everyone wanted,
and stated, the same thing.
For these reasons, we decided to try to maximize the area of effect of our swarm, by making sure
that every unit had a unique target. This meant no two units would leave the selfish phase of our algo-
rithm with the same target for their intent. Results proved our initial hypothesis, and our swarm spread more
evenly around the map, greatly improving our conquering and danger-sensing abilities.
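One simple way to enforce this rule is a greedy pass during the selfish phase that removes each chosen target from the pool. This is a sketch under our own assumptions; the actual implementation also takes each unit's adequacy for the task into account when ordering the units:

```python
def assign_unique_targets(swarm, targets, heuristic):
    """Greedy unique assignment for the selfish phase: each unit, in
    order, takes the best target still available, so no two units
    leave the phase with the same target."""
    remaining = list(targets)
    assignment = {}
    for unit in swarm:
        if not remaining:
            break  # more units than targets: the rest get no target
        best = max(remaining, key=lambda t: heuristic(unit, t))
        assignment[unit] = best
        remaining.remove(best)
    return assignment

# Toy usage: the first unit takes the castle, the second falls back to the farm.
h = lambda unit, target: {"castle": 5, "farm": 1}[target]
result = assign_unique_targets(["u1", "u2"], ["castle", "farm"], h)
```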
Sufficiency vs Efficiency
In our initial implementation, we allowed any unit to target any territory — even if one of our units
was already there. After implementing our one unit, one target mechanic, this became an issue. Units that were on
a territory but did not have a high rating in the hierarchy were forced to choose other targets, which, more
often than not, meant going to the opposite end of our territory — correctly so, given the resource
distribution around the map.
This was generating a lot of internal turbulence within the swarm — much like ants seem to behave
inside their colonies — but the results were definitely not good. From this turbulence we only
got slower response times to threats, as the units spent a lot of time aimlessly walking around already
conquered territory.
The solution was to give priority to any unit that is already on a territory, such that other units
would only go to that target if the first unit requested assistance. This was the most efficient solution.
On the other hand, the unit at the position might not have enough strength to finalize its intention —
sufficiency. In this case, it would request assistance from nearby units.
Units can not help themselves
The help factor is quite important for the reconsideration process, in that it allows units to compensate for
one another and work together towards one same goal. Regardless, we cannot leave unmentioned that
this factor only affects units other than the source of the intent. This means that if unit-1 has an intent with
the help factor activated, the intent will have a greater heuristic value (multiplied by the help factor) than
normal for every other unit, but it will remain the same (unaffected by the help factor) for unit-1 during the
reconsideration phase.
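This asymmetry can be expressed directly in the heuristic lookup. In the sketch below, the multiplier of 2.0 is an illustrative placeholder, not the tuned fear-based value from our implementation:

```python
def heuristic_for(intent, evaluating_unit, help_factor=2.0):
    """Heuristic value of an intent as perceived by evaluating_unit.

    The help flag boosts the value for every unit except the intent's
    own source, so a unit cannot use it to help itself.
    intent: map with keys "unit", "heuristic", "help".
    """
    value = intent["heuristic"]
    if intent["help"] and evaluating_unit != intent["unit"]:
        value *= help_factor
    return value

# The source unit sees the plain value; every other unit sees the boosted one.
intent = {"unit": "u1", "heuristic": 10.0, "help": True}
```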
3.4 Testing methodology and data collection
Evaluating an algorithm is not an easy task. We could have done simulations and more complex analyses
of time and resource allocation, but we believe testing the final integration is more interesting. For this
reason, our tests can be divided into two components:
• Duels with the old AI — multiple duels in different scenarios; these resulted in fast games, allowing for quick
feedback;
• Multi-player games — between 15-20 players, both real and AIs; new players could enter in the
middle of the game, and turns lasted one day.
The original AI was, from the beginning of development, a benchmark for our AI. In the early
stages of development, we used the duel system for debugging our new AI. In later stages, duels were
used to fine-tune all parameters before taking the AI into real-player tests.
We only took part in one multi-player scenario with real players, because of the game and turn
lengths. However, we had three new AIs in it, alongside two old AIs and ten real players — allowing us to
retrieve some interesting data from the game.
As for metrics, we tried to use most of the information available from the game itself:
• Victory Points (VPs) — these represent the score in the game — the player with the most VPs wins;
• Territory Owned — being an important part of the game and of the military display, the evolution
of territories owned during the game can be an interesting metric;
• Army Power — more than the ability to conquer, the ability to keep one’s own is an important task
for the military agent — this metric should reflect that quality.
A final, additional, and important metric was our reconsideration rate — that is, the number of reconsid-
erations per number of actions taken.
Chapter 4
Experimental Results
In order to evaluate our solution’s adequacy to solve the problem, we put it in play in different
scenarios against different types of opponents. In this section we will go through our findings, explaining
their relevance towards the validation of our solution. Following our initial statement, our findings
will be divided into two categories:
• Duels between AIs;
• Multi-player matches.
It is relevant to state that the evaluation of our developed algorithm is only possible alongside its
implementation within the new AI. For this reason, a good performance from said AI can be seen as a
good indicator for our solution.
4.1 Static Scenario Test - Duels
During development, the new AI was matched against the old AI in various games. This was
the most easily accessible benchmark for our implementation. Our main objectives with these matches were:
• Debugging the implementation of the algorithm;
• Identifying issues with the implementation that could be directly linked to flaws in the algorithm
itself;
• Asserting the correct evaluation of the perceptions of value and danger per territory;
• Improving the ability to conquer multiple territories per turn;
• Assessing the adaptability of our algorithm to a controlled environment, benchmarking against the old
AI.
Almansur’s duels are scenarios that place two players on equal footing, on a symmetric map. Each
game had 24 turns, and turns were processed after every player ended their respective turn — since
both players were AIs in this scenario, turns were actually processed fast enough to play multiple games,
making it easier to test and iterate on the implementation.
Due to the deterministic nature of the game and the AIs’ implementation, replaying the same match-
up will generate the same actions from both players, producing the exact same result. For this reason,
the analysis presented in this section is based on the last iteration of tests made in this context.
4.1.1 Static Scenario Test Analysis
As seen in Figure 4.1, victory points steadily increase for both players; however, there is a noticeable
positive difference in favor of the new AI.
Figure 4.1: Graphic with the evolution of Victory Points for both players in the duel
Even though constant at first, there are a couple of moments when the victory-points line abruptly
changes — around turns 12-14 and 20-22. As can be seen in Figure 4.2, the evolution of victory points
is directly connected to the evolution of territory victory points.
Figure 4.2: Graphic with the evolution of Territory Victory Points for both players in the duel
In Figure 4.3, we can see that the number of territories conquered in the turns when the abrupt
changes were detected (Figures 4.1 and 4.2) is not especially large —
which, in turn, means that the few territories conquered were quite valuable.
Figure 4.3: Graphic with the evolution of Territories Conquered for both players in the duel
Having conquered all interesting territories on its side of the map, in the final stages of the game the
new AI entered the old AI’s territory and started conquering it. This was the cause for a few battles, as
can be seen in Figure 4.4.
Figure 4.4: Graphic with the evolution of Battle Victory Points for both players in the duel
In order to fully analyze our solution, we need to take note of the number of reconsiderations per turn
— this is represented in Figure 4.5.
Figure 4.5: Graphic with the number of reconsiderations and intentions per turn.
By looking at this graphic together with the previous ones, we can see that the moments when more
reconsiderations took place are connected to some of the most noticeable shifts in the previous graphics’
lines. Comparing with Figure 4.4, we can see:
• 9th turn — reconsideration for a battle;
• 19th turn — reconsideration for a comeback in battle victory points;
• 21st turn — reconsideration for a great battle win, causing the greatest difference in battle victory
points since the beginning of the duel.
It is also relevant to mention that, during the whole game, there was no need for a second reconsideration
iteration, as the swarm always reached consensus after the first reconsideration iteration. Considering
all the turns, a selfish intention was reconsidered approximately 21% of the time.
However, not all reconsiderations return a positive outcome — but they may return a
less negative one. For example, at turn 14 we can see a losing battle which had some reconsiderations
at its base.
4.1.2 Static Scenario Test Conclusion
Being the easiest accessible data source, through the development process these duels provided the
best setting for fine tuning our implementation. These allowed for the discovery of a few shortcomings,
and consequent upgrades to our implementation itself — the following are examples of these upgrades:
• Units of a swarm should be ordered by adequacy for their tasks — otherwise, the strongest
unit in the swarm could be forced to pick a less valuable target — which could lead to the demise
of the whole swarm.
• Each target should only be picked by one unit — otherwise, multiple units could (and would)
point at the same target if it had great value, without having to communicate — invalidating the
reconsideration process.
• A territory that already holds one of our swarm's units should not be a target — otherwise, a
more suitable unit could pick it as a target, forcing the unit already at that location to move. Most of
the time would then be spent traveling around the map instead of actually pursuing objectives
(conquering or battling the enemy).
• Great risks should only be considered together with great reward (value) — otherwise, units
would be reckless and attempt to conquer (or battle) any enemy territory in sight.
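The rules above can be captured in a short greedy assignment sketch. The code below is illustrative only: `Unit`, the `(name, value, risk)` territory tuples, and the risk threshold are hypothetical stand-ins for the actual Almansur data model.

```python
from collections import namedtuple

# Hypothetical stand-in for a swarm unit; the real data model differs.
Unit = namedtuple("Unit", ["name", "strength"])

def assign_targets(units, territories, occupied):
    """Greedy assignment following the rules above: strongest unit picks
    first, each target is taken at most once, territories we already hold
    are skipped, and high risk is only accepted for high value."""
    assignments = {}
    taken = set()
    # Rule 1: order units by adequacy (here, raw strength).
    for unit in sorted(units, key=lambda u: u.strength, reverse=True):
        best_name, best_score = None, float("-inf")
        for name, value, risk in territories:
            if name in taken or name in occupied:
                continue  # Rules 2 and 3: unique targets, skip held territory
            if risk > unit.strength and value < 2 * risk:
                continue  # Rule 4: great risk only with great reward
            score = value - risk
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None:
            assignments[unit.name] = best_name
            taken.add(best_name)
    return assignments
```

With two units and two free territories, the strongest unit claims the highest-scoring target and the weaker one falls back to the next best, instead of both piling onto the same territory.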
These concepts became evident after the first few duels between the AIs. They led to implementation
decisions such as the inclusion of influence maps for value and danger. After these upgrades, the new
AI's behavior was greatly improved, allowing it to easily out-duel the old AI.
From the figures present in this section, it is possible to conclude that our AI implementation was
more successful in these closed scenarios than the old AI.
4.2 Dynamic Test
Our solution had to be tested in a more complex environment in order to surface more interesting metrics
and potential shortcomings. One of the advantages of these dynamic scenarios is the possibility of adding
players to the game while it is already running. For this reason, a scenario was created with a total of
14 players, distributed as follows:
• 9 active Human players — 8 joined at the start of the game, and 1 mid game;
• 3 players with New AI — 2 joined at the start of the game, and 1 mid game;
• 2 players with Old AI — both joined at the start of the game.
At the time the data was analyzed, this game had completed 16 turns during the course of 3 weeks,
processing 1 turn per day except weekends. Our objectives for this scenario were:
• Identifying flaws in the implementation that could be directly linked to flaws in the algorithm itself;
• Assessing the capability of our solution, when compared with that of the old AI implementation;
• Assessing the adaptability of our solution to a dynamic environment, with multiple opponents and
threats.
4.2.1 Dynamic Test Analysis
In Figure 4.6¹ we can see the evolution of average victory points among the 3 types of players. Right
after the second turn, there is a clear separation between the line of the old AI and the other two. It
is also clear that our solution is able to stay on par with the human players until the sixth turn.
Figure 4.6: Graphic with the average evolution of Victory Points for the three types of players
After turn 6, players have a large enough territory that they start to meet — meaning there is
no more neutral, unconquered territory between two player areas — and players start to fight for one
another's territories. In Figure 4.7 we can see the beginning of a shift in territory-related victory points
in favor of the human players. It is also noteworthy that the new AI manages to have its line above the
human's at one point.

¹See Appendix B (Additional Graphics) for graphics containing discretized data from all 14 players, for this and other
interesting data that will be summarized in the following sections.
Figure 4.7: Graphic with the average evolution of Territory Victory Points for the three types of players
The visible growth in Figure 4.7 implies that, in the early game, players are not conquering each
other's territories, but rather those of neutral units — units not controlled by any player. This can be
seen as an expansion period. It is reinforced by Figure 4.8, where we can see more or less stable lines
for the human players and the new AI — if we ignore the first encounter in turn 2.
Figure 4.8: Graphic with the average evolution of Battle Victory Points for the three types of players — lines are affected by battle events, especially visible in symmetric changes.
The human line in this figure can be seen declining slowly, which reflects the limited interaction
between the human players themselves. The old AI line, however, suffers big losses early in the game,
likely caused by attempting to conquer a fortress without sufficient strength — a fortress is a fortified
territory that requires a lot of manpower to conquer. It is also clear that said fortress was neutral, as we
do not see any reflection of that drop in the human or new AI lines.
Also visible in Figure 4.8 is the aggressive nature of our AIs when compared to the human players.
Their lack of better judgment and over-estimation of their own capabilities is rather evident. Despite
differing in scale in terms of battle-related victory points, Figures 4.9 and 4.10 show that both the new
and old AIs' military strength behaves in a similar way after the initial failed confrontations.
Figure 4.9: Graphic with the average evolution of Army Power for the three types of players.
Figure 4.10: Graphic with the average evolution of Army Size for the three types of players.
However, the new AI is able to maintain the same level of strength and size for a longer period of
time, while the old AI keeps getting weaker and weaker.
Finally, another interesting metric is the number of intentions that lead to a reconsideration. In this
game, that percentage was 13%, and its average distribution throughout the game is represented in
Figure 4.11. During the 16 turns analyzed, our AIs never had more than 3 units in a swarm, which
makes it hard to draw firm conclusions from this parameter. The most evident detail in this figure is
that the more units have to make a decision, the more likely a reconsideration cycle is to be triggered
— in order to reach a consensus — as would be expected.
Figure 4.11: Graphic with the average number of intentions and reconsiderations per turn — this is an average of the three AIs in the game.
Still related to this metric, the only connection found between it and our other metrics can be shown
in Figure 4.12. In this figure, we can see that a reconsideration takes place right before a battle event
— an unsuccessful one, though. Despite the loss, this connection reflects a case of interaction between
the units of the swarm, supporting each other in their decisions.
Figure 4.12: Correlation between battle victory points and reconsideration count on the first iteration of the reconsideration cycle of the algorithm.
4.2.2 Dynamic Test Conclusion
The greatest accomplishment for the new AI was being able to keep up with the human player during
the first few turns of the game.
Considering territory victory points, the new AI displays a good perception of value and a good prior-
itization of conquest targets. Despite the initial setback against one of the human players, the new AI
was able to recover and stand equal to the human players in terms of this variable. The old AI was not
able to reach such a level at any point in the game, falling behind in the first couple of turns — displaying
its inability to prioritize and to evaluate both target and self worth.
Our greatest validation, though, should come from the comparison between the new and old AIs.
Noting that the AIs were similar in every aspect except the military agent responsible for issuing
commands — conquering, attacking, and defending — the difference in victory points observed comes
as a huge accomplishment.
Both types of AI, in this game, were playing without the aid of a complex diplomatic agent. This left
them unable to properly interact with one another, or with the players, at this level — making it
impossible to form alliances, and forcing each AI to achieve its results on its own.
With respect to military stability, the new AI was able to come back from its losses and to avoid further
decline after an initial struggle. The old AI, on the other hand, was unable to keep its strength from
decreasing turn after turn. Although this allows us to conclude that the new AI is likely better at picking
its fights — having a better perception of danger, and a better judgment of its own ability — there is
clearly also room for improvement in the agent responsible for recruiting additional military units.
From the correlation graphic (Figure 4.12) it was possible to find one connecting point with the battle-
related victory points, but a single connection is not enough to be entirely meaningful. We believe it
is necessary to perform more complex tests, with a greater number of players and of units per swarm,
in order to find additional relationships between the relevant metrics.
4.3 Summary - Result Significance
From the tests we ran, we found that the new AI was superior to the old one. Some of the defining
characteristics that allowed our AI to perform better than the old one were:
• Improved perception of danger;
• Improved perception of value;
• Improved prioritization skill;
• Improved adaptability, making it less predictable;
• Improved distribution of units in territories, covering more ground.
Both in the duel matches and in the dynamic match, the new AI showed it was able to evaluate
territories more accurately than the old AI. In the dynamic match, its evaluation was even shown to
surpass the human players' at one point — Figure 4.7.
Also during the dynamic match, the new AI was able to evade danger — and losses — for a greater
period than the old AI. In the duels, we saw the new AI lose territories and quickly attempt to conquer
them again (and succeed in doing so).
The combination of Influence Maps used for both value and danger attributes was a stepping stone
for these results, allowing each unit of the swarm to correctly evaluate territories before attempting to
conquer them.
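As a rough illustration of how a unit might combine the two maps when evaluating territories, consider the sketch below. The function and parameter names are hypothetical, not the thesis's actual interface.

```python
def territory_scores(value_map, danger_map, own_strength, risk_weight=1.0):
    """Combine the value and danger influence maps into a single score per
    territory; territories whose danger exceeds the unit's own strength are
    discarded outright instead of merely penalized."""
    scores = {}
    for territory, value in value_map.items():
        danger = danger_map.get(territory, 0.0)
        if danger > own_strength:
            continue  # do not even consider fights the unit cannot win
        scores[territory] = value - risk_weight * danger
    return scores
```

A unit would then simply pick the highest-scoring surviving territory, which is how the danger map keeps reckless attacks out of the candidate set before value is ever weighed.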
Although the duels took place in a static scenario, the constant movement of units from both players
can be seen as a dynamic change in the environment. By being aware of the enemy units' movement
and of how their presence influences its safety, the new AI was able to react faster and protect itself
better than the old AI.
4.3.1 Final Remarks
Considering that the AIs were the same, with the exception of the implemented algorithm connected to
the military AI, we can state that the military decision process present in the new AI is definitely better
than the old one — which, in turn, is a good indication that our algorithm (and implementation) is an
improvement to said process.
That said, we believe the tests we ran, however promising, were not sufficient for a definitive
conclusion regarding the algorithm itself, as it is highly dependent on the context. Even in this context,
we believe further testing is required, as the number of units per swarm was too small for a completely
reliable claim.
Chapter 5
Conclusions
In this work, we implemented an algorithm based on swarm intelligence concepts that was responsible
for an artificial player's decision process in a strategy game. Our main objective was to derive said
algorithm from current swarm intelligence knowledge and apply it in a context different from the usual
ones.
Our implementation was built on top of an already existing AI in the game Almansur. The original
version was adapted and had its military decision process replaced by our new algorithm, ensuring most
of the implementation was shared between both AIs. This decision allowed us to benchmark the modified
military decision process without too much influence from the other components — economic and
strategic (Figure 3.3).
This work was based on some already proven ideas:
• Heuristic functions based on influence maps are not new;
• Decentralized approaches for game AIs are not new.
This work was also an experiment in testing theories in a context different from the usual ones (as
previously mentioned):
• The use of influence maps to apply the Danger Theory, originating from immune system studies, in
the context of game artificial intelligence;
• The definition of semantics understandable by a decentralized system, or swarm;
• The design of an algorithm capable of defining rules for communication, in order to reach a beneficial
consensus.
In this sense, the final solution presented is a mixture between the conventional AI and SI concepts.
Given the results presented in the previous chapter, it is possible to state that our solution shows some
very promising results. The final implementation displays some of the benefits of an SI system, such as
adaptability, unpredictability (which is a good thing in a game AI) and the associated emergence —
providing results that rival those of a human (in our case, in terms of conquering territories).
We do not feel, however, that these results are sufficient to declare this work a success. At most, we
can reiterate our claim and state that the results are promising. In order to fully commit to the success
or failure of this solution, our implementation needs to go through more tests against users of different
skill levels.
Considering that the algorithm's base has an agnostic, context-free nature, it needs to be implemented
in contexts other than Almansur; otherwise, we can only conclude that this solution works for this
particular case.
Finally, considering the positive results so far, this work will likely remain a part of the Almansur
game, allowing for extended testing and potentially some further development in this area of study.
5.1 Future Work
Regarding Almansur specifically, there are a few changes that could be implemented in order to improve
results. These would benefit both the old and new AIs.
• Implement a Diplomatic Agent — the most relevant issue found in the resulting AI is its inability
to communicate with other players. Frequently, when players' territories begin to overlap, in-game
personal messages are sent to question the opponents and check whether they are active before
attacking. It is common to avoid being attacked simply by replying, or to form alliances with the
surrounding players in pursuit of an alliance victory. The current AIs are unable to deal with these
situations.
• Communication between modules — the current communication is not optimized. For ex-
ample, when evaluating the value of a territory, every resource is considered equal to every other.
Depending on the needs of the player, the economic agent should be able to increase the value of
a resource that is required, or decrease the value of one that is in excess;
• Responsibility for generating/maintaining Influence Maps — in the current implementation,
there is no specific class responsible for generating the influence maps. These, depending on
their nature, should be generated by specific agents. For instance, the Danger Influence Map could
be generated by the Strategy agent (or by a new Diplomatic agent), and the Value Influence Map
could be generated by the Economic agent. This would allow for the definition of clear objectives
and improve the communication between all the AI agents (or components).
• Propagate Value Influence in the Map — when an enemy is present at a specific location, its
strength defines the danger value at that position. That danger is then propagated to nearby
territories in order to take into account the possible movements of that enemy unit. The Value,
however, does not propagate this way. If it did, it could allow for the definition of optimal paths to a
target — allowing the AI to conquer every territory along the way.
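A sketch of the distance-decayed spreading used for the danger map, which (per the suggestion above) could equally be applied to value. The decay factor, radius, and adjacency representation are illustrative assumptions, not the game's actual parameters.

```python
from collections import deque

def propagate_influence(sources, neighbors, decay=0.5, radius=3):
    """Spread influence outward from source territories, halving it per
    step and keeping the strongest contribution at each territory.
    `sources` maps territory -> influence; `neighbors` maps territory ->
    list of adjacent territories."""
    influence = dict(sources)
    frontier = deque((territory, 0) for territory in sources)
    while frontier:
        territory, dist = frontier.popleft()
        if dist >= radius:
            continue  # stop spreading beyond the chosen radius
        spread = influence[territory] * decay
        for adjacent in neighbors[territory]:
            if spread > influence.get(adjacent, 0.0):
                influence[adjacent] = spread
                frontier.append((adjacent, dist + 1))
    return influence
```

On a chain of territories a–b–c–d with an enemy of strength 8 at a, the influence seen at b, c, and d decays to 4, 2, and 1 respectively.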
As for the algorithm itself, its continued presence in this environment could reveal flaws in its core and,
through repetition, allow some of the logic behind it to be fine-tuned. In theory, however, the algorithm
would benefit more from being implemented in a different context, providing further insight into its uses
and its shortcomings. Ultimately, only different implementations would allow for a final confirmation of
this algorithm's worth on its own.
Appendix A
Complete Algorithm
Algorithm 3 Complete developed algorithm - part 1
 1: input: perceptions                    ▷ Territories, influence map info, units in range, etc.
 2: output: consensusIntents ← map[unit, intent]
 3:
 4: selfishIntents ← new map[unit, intent]
 5:
 6: for all unit in swarm do
 7:     for all territory in reachableTerritories do
 8:         for all intent in intentTypes do
 9:             if intent.heuristic > selfishIntent.heuristic then
10:                 selfishIntent ← intent
11:             end if
12:         end for
13:     end for
14:
15:     selfishIntents[unit] ← selfishIntent
16: end for
17:
18: initStepIntents ← selfishIntents
19:                                        ▷ Reconsideration
20: while true do
21:     hasReconsidered ← false
22:
23:     for all unit in swarm do
24:         myIntent ← initStepIntents[unit]
25:
26:         for all intent in initStepIntents do
27:             isValid ← finalStepIntents[intent.unit] ≠ null and finalStepIntents[intent.unit] ≠ intent
28:
29:             if (intent.unit = unit) or (not isValid) then
30:                 continue
31:             end if
Algorithm 4 Complete developed algorithm - part 2
32:             myHeuristic ← myIntent.heuristicFor(unit)
33:             newHeuristic ← intent.heuristicFor(unit)
34:
35:             if newHeuristic > myHeuristic then
36:                 myIntent ← intent
37:                 hasReconsidered ← true
38:             end if
39:         end for
40:
41:         finalStepIntents[unit] ← myIntent
42:     end for
43:
44:     if not hasReconsidered then
45:         break
46:     end if
47:                                        ▷ Validation / Rollback
48:     for all unit in swarm do
49:         intent ← finalStepIntents[unit]
50:         parent ← intent.parent
51:
52:         isValid ← (parent = unit) or (finalStepIntents[parent] = intent)
53:
54:         if not isValid then
55:             previousIntent ← initStepIntents[unit]
56:             previousParent ← previousIntent.parent
57:
58:             isPreviousValid ← (previousParent = unit) or (finalStepIntents[previousParent] = previousIntent)
59:             if isPreviousValid then
60:                 finalStepIntents[unit] ← initStepIntents[unit]
61:             else
62:                 finalStepIntents[unit] ← selfishIntents[unit]
63:             end if
64:         end if
65:     end for
66:     initStepIntents ← finalStepIntents
67: end while
68:
69: consensusIntents ← finalStepIntents
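For readers who prefer runnable code, the following is a simplified, executable Python sketch of the reconsideration loop of Algorithms 3 and 4. The `Intent` fields and the `heuristic_for` discount are illustrative assumptions; the actual implementation's data model and heuristics differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Intent:
    unit: str         # unit this intent belongs to
    target: str       # territory the unit intends to act on
    heuristic: float  # base value of the intent
    parent: str       # unit whose intent is being supported (itself if selfish)

    def heuristic_for(self, unit):
        # Assumption: supporting another unit's intent is worth a fraction
        # of its base heuristic; a unit's own intent keeps its full value.
        return self.heuristic if self.parent == unit else 0.8 * self.heuristic

def consensus_intents(selfish_intents, max_iterations=10):
    """Iterate reconsideration until no unit changes its intent, rolling a
    unit back to its selfish intent when the parent it supports moved on."""
    init_step = dict(selfish_intents)
    final_step = dict(selfish_intents)
    for _ in range(max_iterations):
        reconsidered = False
        for unit, my_intent in init_step.items():
            for other_unit, intent in init_step.items():
                if other_unit == unit:
                    continue
                if intent.heuristic_for(unit) > my_intent.heuristic_for(unit):
                    # Adopt the other unit's more valuable intent as our own.
                    my_intent = Intent(unit, intent.target,
                                       intent.heuristic, parent=other_unit)
                    reconsidered = True
            final_step[unit] = my_intent
        if not reconsidered:
            break  # consensus reached
        # Validation / rollback: a supporting intent is only valid while
        # the parent still pursues the same target.
        for unit, intent in final_step.items():
            if intent.parent != unit and final_step[intent.parent].target != intent.target:
                final_step[unit] = selfish_intents[unit]
        init_step = dict(final_step)
    return final_step
```

With unit A selfishly targeting t1 (heuristic 5) and unit B targeting t2 (heuristic 2), B adopts A's more valuable intent and the swarm converges on t1.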
Appendix B
Additional Graphics
B.1 Victory Points in Multi-player Match
Figure B.1: Graphic of the Victory Points evolution throughout the whole game for all players
B.2 Territory Victory Points in Multi-player Match
Figure B.3: Graphic of the Territory Victory Points evolution throughout the whole game for all players
Figure B.4: Graphic of the Territory Victory Points evolution throughout the whole game for all AIs
B.3 Territory Owned in Multi-player Match
Figure B.5: Graphic of the Territory Owned evolution throughout the whole game for all players
B.4 Battle Victory Points in Multi-player Match
Figure B.7: Graphic of the Battle Victory Points evolution throughout the whole game for all players
Figure B.8: Graphic of the Battle Victory Points evolution throughout the whole game for all AIs
B.5 Army Size in Multi-player Match
Figure B.9: Graphic of the Army Size evolution throughout the whole game for all players
B.6 Army Power in Multi-player Match
Figure B.11: Graphic of the Army Power evolution throughout the whole game for all players