Hindawi Publishing Corporation
Game Theory
Volume 2013, Article ID 534875, 10 pages
http://dx.doi.org/10.1155/2013/534875
Research Article
Chess-Like Games May Have No Uniform Nash Equilibria Even in Mixed Strategies
Endre Boros, Vladimir Gurvich, and Emre Yamangil
RUTCOR, Rutgers University, 640 Bartholomew Road, Piscataway, NJ 08854-8003, USA
Correspondence should be addressed to Vladimir Gurvich; [email protected]
Received 2 February 2013; Accepted 22 April 2013
Academic Editor: Walter Briec
Copyright © 2013 Endre Boros et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recently, it was shown that Chess-like games may have no uniform (subgame perfect) Nash equilibria in pure positional strategies. Moreover, Nash equilibria may fail to exist already in two-person games in which all infinite plays are equivalent and ranked as the worst outcome by both players. In this paper, we extend this negative result further, providing examples that are uniform Nash equilibria free, even in mixed or independently mixed strategies. Additionally, in case of independently mixed strategies we consider two different definitions for effective payoff: the Markovian and the a priori realization.
1. Introduction
1.1. Nash-Solvability in Pure and Mixed Strategies: Main Results. There are two very important classes of the so-called uniformly Nash-solvable positional games with perfect information, for which a Nash equilibrium (NE) in pure stationary strategies, which are also independent of the initial position, exists for arbitrary payoffs. These two classes are the two-person zero-sum games and the $n$-person acyclic games.
However, when (directed) cycles are allowed and the game is not zero-sum, then a positional game with perfect information may have no uniform NE in pure stationary strategies. This may occur already in the special case of two players with all cycles equivalent and ranked as the worst outcome by both players. Such an example was recently constructed in [1].
Here we strengthen this result and show that for the same example no uniform NE exists even in mixed stationary strategies, not only in pure ones. Moreover, the same negative result holds for the so-called independently mixed strategies. In the latter case we consider two different definitions for the effective payoffs, based on Markovian and a priori realizations.
In the rest of the introduction we give precise definitions and explain the above result in more detail.
Remark 1. In contrast, for the case of a fixed initial position, Nash-solvability in pure positional strategies holds for the two-person case and remains an open problem for $n > 2$; see [1] for more details; see also [2–20] for different cases of Nash-solvability in pure strategies.
Furthermore, for a fixed initial position, the solvability in mixed strategies becomes trivial, due to the general result of Nash [21, 22]. Thus, our main example shows that Nash's theorem cannot be extended for positional games to the case of uniform equilibria. It is shown for the following four types of positional strategies: pure, mixed, and independently mixed, where in the last case we consider two types of effective payoffs, defined by Markovian and a priori realizations.
1.2. Positional Game Structures. Given a finite directed graph (digraph) $G = (V, E)$ in which loops and multiple arcs are allowed, a vertex $v \in V$ is interpreted as a position and a directed edge (arc) $e = (v, v') \in E$ as a move from $v$ to $v'$. A position of outdegree 0 (one with no moves) is called terminal. Let $V_T = \{a_1, \ldots, a_m\}$ be the set of all terminal positions. Let us also introduce a set of $n$ players $I = \{1, \ldots, n\}$ and a partition $D : V = V_1 \cup \cdots \cup V_n \cup V_T$, assuming that each player $i \in I$ is in control of all positions in $V_i$.
Figure 1: Two Chess-like game structures $\mathcal{G}_1$ and $\mathcal{G}_2$. In $\mathcal{G}_1$, there are 3 players controlling one position each (i.e., $\mathcal{G}_1$ is play-once), while in $\mathcal{G}_2$ there are only two players who alternate turns; hence, each of them controls 3 positions: $v_1, v_3, v_5$ are controlled by player 1 and $v_2, v_4, v_6$ by player 2. In each position $v_\ell$, the corresponding player has only two options: ($p$) to proceed to $v_{\ell+1}$ and ($t$) to terminate at $a_\ell$, where $\ell \in \{1, 2, 3\}$ ($v_4 = v_1$) in $\mathcal{G}_1$ and $\ell \in \{1, \ldots, 6\}$ ($v_7 = v_1$) in $\mathcal{G}_2$. To save space, we show only symbols $a_\ell$, while the corresponding vertex names are omitted.
An initial position $v_0 \in V$ may be fixed. The triplet $(G, D, v_0)$ or pair $\mathcal{G} = (G, D)$ is called a Chess-like positional game structure (or just a game structure, for short), initialized or noninitialized, respectively. By default, we assume that it is not initialized.
Two examples of (noninitialized) game structures $\mathcal{G}_1$ and $\mathcal{G}_2$ are given in Figure 1.
1.3. Plays, Outcomes, Preferences, and Payoffs. Given an initialized positional game structure $(G, D, v_0)$, a play is defined as a directed path that begins in $v_0$ and either ends in a terminal position $a \in V_T$ or is infinite. In this paper we assume that all infinite plays form one outcome $a_\infty$ (or $c$), in addition to the standard terminal outcomes of $V_T$. (In [5], this condition was referred to as AIPFOOT.)
A utility (or payoff) function is a mapping $u : I \times A \to \mathbb{R}$, whose value $u(i, a)$ is interpreted as a profit of the player $i \in I = \{1, \ldots, n\}$ in case of the outcome $a \in A = \{a_1, \ldots, a_m, a_\infty\}$.
A payoff is called zero-sum if $\sum_{i \in I} u(i, a) = 0$ for every $a \in A$. Two-person zero-sum games are important. For example, the standard Chess and Backgammon are two-person zero-sum games in which every infinite play is a draw, $u(1, a_\infty) = u(2, a_\infty) = 0$. It is easy to realize that $u(i, a_\infty) = 0$ can be assumed for all players $i \in I$ without any loss of generality.
Another important class of payoffs is defined by the condition $u(i, a_\infty) < u(i, a)$ for all $i \in I$ and $a \in V_T$; in other words, the infinite outcome $a_\infty$ is ranked as the worst one by all players. Several possible motivations for this assumption are discussed in [4, 5].
A quadruple $(G, D, v_0, u)$ and triplet $(G, D, u)$ will be called a Chess-like game, initialized and noninitialized, respectively.
Remark 2. From the other side, the Chess-like games can be viewed as the transition-free deterministic stochastic games with perfect information; see, for example, [8–11].
In these games, every nonterminal position is controlled by a player $i \in I$ and the local reward $r(i, e)$ is 0 for each player $i \in I$ and move $e$, unless $e = (v, v')$ is a terminal move, that is, $v' \in V_T$. Obviously, in the considered case all infinite plays are equivalent since the effective payoff is 0 for every such play. Furthermore, obviously, $a_\infty$ is the worst outcome for a player $i \in I$ if and only if $r(i, e) > 0$ for every terminal move $e$.
If $|I| = n = m = |A| = 2$, then the zero-sum Chess-like games turn into a subclass of the so-called simple stochastic games, which were introduced by Condon in [23].
1.4. Pure Positional Strategies. Given game structure $\mathcal{G} = (G, D)$, a (pure positional) strategy $x_i$ of a player $i \in I$ is a mapping $x_i : V_i \to E_i$ that assigns to each position $v \in V_i$ a move $(v, v')$ from this position.
The concept of mixed strategies will be considered in Section 1.10; till then only pure strategies are considered. Moreover, in this paper, we restrict the players to their positional (pure) strategies. In other words, the move $(v, v')$ of a player $i \in I$ in a position $v \in V_i$ depends only on the position $v$ itself, not on the preceding positions or moves.
Let $X_i$ be the set of all strategies of a player $i \in I$ and $X = \prod_{i \in I} X_i$ be the direct product of these sets. An element $x = \{x_1, \ldots, x_n\} \in X$ is called a strategy profile or situation.
1.5. Normal Forms. A positional game structure can be represented in the normal (or strategic) form.
Let us begin with the initialized case. Given a game structure $\mathcal{G} = (G, D, v_0)$ and a strategy profile $x \in X$, a play $p(x)$ is uniquely defined by the following rules: it begins in $v_0$ and in each position $v \in V_i$ proceeds with the arc $(v, v')$ determined by the strategy $x_i$. Obviously, $p(x)$ either ends in a terminal position $a \in V_T$ or $p(x)$ is infinite. In the latter case $p(x)$ is a lasso; that is, it consists of an initial part and a directed cycle (dicycle) repeated infinitely. This holds, because all players are restricted to their positional strategies. In either case, an outcome $a = a(x) \in A = \{a_1, \ldots, a_m, a_\infty\}$ is assigned to each strategy profile $x \in X$. Thus, a game form $g_{v_0} : X \to A$ is defined. It is called the normal form of the initialized positional game structure $\mathcal{G}$.
If the game structure $\mathcal{G} = (G, D)$ is not initialized, then we repeat the above construction for every initial position $v_0 \in V \setminus V_T$ to obtain a play $p = p(x, v_0)$, outcome $a = a(x, v_0)$, and mapping $g : X \times (V \setminus V_T) \to A$, which is the normal form of $\mathcal{G}$ in this case. In general we have $g(x, v_0) = g_{v_0}(x)$. For the (noninitialized) game structures in Figure 1 their normal forms are given in Figures 2 and 3.

Left subtable (player 3 plays t); rows are strategies of player 1, columns are strategies of player 2:
        t           p
t    a1 a2 a3    a1 a3 a3
p    a2 a2 a3    a3 a3 a3
Right subtable (player 3 plays p):
        t           p
t    a1 a2 a1    a1 a1 a1
p    a2 a2 a2    c c c
$U_1$: $u(1, a_2) > u(1, a_1) > u(1, a_3) > u(1, c)$
$u(2, a_3) > u(2, a_2) > u(2, a_1) > u(2, c)$
$u(3, a_1) > u(3, a_3) > u(3, a_2) > u(3, c)$
Figure 2: The normal form $g_1$ of the positional game structure $\mathcal{G}_1$ from Figure 1. Each player has only two strategies: to terminate ($t$) or proceed ($p$). Hence, $g_1$ is represented by a 2 × 2 × 2 table, each entry of which contains 3 terminals corresponding to the 3 potential initial positions $v_1, v_2, v_3$ of $\mathcal{G}_1$. The rows and columns are the strategies of the players 1 and 2, while two strategies of the player 3 are the left and right 2 × 2 subtables. The corresponding game $(g_1, u)$ has no uniform NE whenever a utility function $u : I \times A \to \mathbb{R}$ satisfies the constraints $U_1$ specified in the figure.

Given also a payoff $u : I \times A \to \mathbb{R}$, the pairs $(g_{v_0}, u)$ and $(g, u)$ define the games in the normal form, for the above two cases.
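The construction of the normal form can be illustrated on the two ring-shaped structures of Figure 1. The following Python sketch is only an illustration: the function names `outcome` and `normal_form`, the 0-based position indices, and the encoding of a profile as a tuple of letters `'t'`/`'p'` indexed by position are assumptions of this example, not notation from the paper.

```python
from itertools import product

def outcome(profile, start):
    """Outcome of the play that begins in 0-based position `start`.

    profile[k] is 't' (terminate in a_{k+1}) or 'p' (proceed to the next
    position of the dicycle). Returns a terminal index in 1..m, or 'c'
    when every position proceeds and the play cycles forever.
    """
    m = len(profile)
    for step in range(m):
        k = (start + step) % m
        if profile[k] == 't':
            return k + 1
    return 'c'

def normal_form(m):
    """Assign to every pure profile the outcomes for all m initial positions."""
    return {x: tuple(outcome(x, v) for v in range(m))
            for x in product('tp', repeat=m)}

g1 = normal_form(3)           # three players, one position each
g2 = normal_form(6)           # positions 0, 2, 4: player 1; positions 1, 3, 5: player 2
print(g1[('t', 'p', 'p')])    # (1, 1, 1)
print(g1[('p', 'p', 'p')])    # ('c', 'c', 'c')
```

For `m = 3` the dictionary corresponds to the 2 × 2 × 2 table of Figure 2; for `m = 6` each entry lists six outcomes, one per nonterminal initial position.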
Of course, these games can be also represented by the corresponding real-valued mappings:
\[ f_{v_0} : I \times X \to \mathbb{R}, \qquad f : I \times X \times (V \setminus V_T) \to \mathbb{R}, \tag{1} \]
where $f_{v_0}(i, x) = f(i, x, v_0) = u(i, g_{v_0}(x)) = u(i, g(x, v_0))$ for all $i \in I$, $x \in X$, $v_0 \in V \setminus V_T$.
Remark 3. Yet, it seems convenient to separate the game form $g$ and utility function $u$.
By this approach, $g$ "takes responsibility for structural properties" of the game $(g, u)$, that is, the properties that hold for any $u$.
1.6. Nash Equilibria in Pure Strategies. The concept of Nash equilibria is defined standardly [21, 22] for the normal form games.
First, let us consider the initialized case. Given $g_{v_0} : X \to A$ and $u : I \times A \to \mathbb{R}$, a situation $x \in X$ is called a Nash equilibrium (NE) in the normal form game $(g_{v_0}, u)$ if $f_{v_0}(i, x) \geq f_{v_0}(i, x')$ for each player $i \in I$ and every strategy profile $x' \in X$ that can differ from $x$ only in the $i$th component. In other words, no player $i \in I$ can profit by choosing a new strategy if all opponents keep their old strategies.
In the noninitialized case, the similar property is required for each $v_0 \in V \setminus V_T$. Given a payoff $f : I \times X \times (V \setminus V_T) \to \mathbb{R}$, a strategy profile $x \in X$ is called a uniform NE if $f(i, x, v_0) \geq f(i, x', v_0)$ for each $i \in I$, every $x'$ defined as above, and for all $v_0 \in V \setminus V_T$, too.
Remark 4. In the literature, the last concept is frequently called a subgame perfect NE rather than a uniform NE. This name is justified when the digraph $G = (V, E)$ is acyclic and each vertex $v \in V$ can be reached from $v_0$. Indeed, in this case $(G, D, v, u)$ is a subgame of $(G, D, v_0, u)$ for each $v \in V$. However, if $G$ has a dicycle then any two of its vertices $v'$ and $v''$ can be reached one from the other; that is, $(G, D, v', u)$ is a subgame of $(G, D, v'', u)$ and vice versa. Thus, the name uniform (or ergodic) NE seems more accurate.
1.7. Uniformly Best Responses. Again, let us start with the initialized case. Given the normal form $f_{v_0} : I \times X \to \mathbb{R}$ of an initialized Chess-like game, a player $i \in I$, and a pair of strategy profiles $x, x'$ such that $x'$ may differ from $x$ only in the $i$th component, we say that $x'$ improves $x$ (for the player $i$) if $f_{v_0}(i, x) < f_{v_0}(i, x')$. Let us underline that the inequality is strict. Furthermore, by this definition, a situation $x \in X$ is an NE if and only if it can be improved by no player $i \in I$; in other words, any sequence of improvements either can be extended or terminates in an NE.
Given a player $i \in I$ and situation $x = (x_i \mid i \in I)$, a strategy $x_i^* \in X_i$ is called a best response (BR) of $i$ in $x$ if $f_{v_0}(i, x^*) \geq f_{v_0}(i, x')$ for any $x'$, where $x^*$ and $x'$ are both obtained from $x$ by replacement of its $i$th component $x_i$ by $x_i^*$ and $x_i'$, respectively. A BR $x_i^*$ is not necessarily unique but the corresponding best achievable value $f_{v_0}(i, x^*)$ is, of course, unique. Moreover, somewhat surprisingly, such best values can be achieved by a BR $x_i^*$ simultaneously for all initial positions $v_0 \in V \setminus V_T$. (See, e.g., [1, 4–6]; of course, this result is well known in a much more general probabilistic setting; see, e.g., textbooks [24–26].)

Rows are strategies of player 1, columns are strategies of player 2; each entry lists the outcomes for the initial positions $v_1, \ldots, v_6$:
      ttt           ttp           tpt           tpp           ptt           ptp           ppt           ppp
ttt   a1a2a3a4a5a6  a1a2a3a4a5a1  a1a2a3a5a5a6  a1a2a3a5a5a1  a1a3a3a4a5a6  a1a3a3a4a5a1  a1a3a3a5a5a6  a1a3a3a5a5a1
ttp   a1a2a3a4a6a6  a1a2a3a4a1a1  a1a2a3a6a6a6  a1a2a3a1a1a1  a1a3a3a4a6a6  a1a3a3a4a1a1  a1a3a3a6a6a6  a1a3a3a1a1a1
tpt   a1a2a4a4a5a6  a1a2a4a4a5a1  a1a2a5a5a5a6  a1a2a5a5a5a1  a1a4a4a4a5a6  a1a4a4a4a5a1  a1a5a5a5a5a6  a1a5a5a5a5a1
tpp   a1a2a4a4a6a6  a1a2a4a4a1a1  a1a2a6a6a6a6  a1a2a1a1a1a1  a1a4a4a4a6a6  a1a4a4a4a1a1  a1a6a6a6a6a6  a1a1a1a1a1a1
ptt   a2a2a3a4a5a6  a2a2a3a4a5a2  a2a2a3a5a5a6  a2a2a3a5a5a2  a3a3a3a4a5a6  a3a3a3a4a5a3  a3a3a3a5a5a6  a3a3a3a5a5a3
ptp   a2a2a3a4a6a6  a2a2a3a4a2a2  a2a2a3a6a6a6  a2a2a3a2a2a2  a3a3a3a4a6a6  a3a3a3a4a3a3  a3a3a3a6a6a6  a3a3a3a3a3a3
ppt   a2a2a4a4a5a6  a2a2a4a4a5a2  a2a2a5a5a5a6  a2a2a5a5a5a2  a4a4a4a4a5a6  a4a4a4a4a5a4  a5a5a5a5a5a6  a5a5a5a5a5a5
ppp   a2a2a4a4a6a6  a2a2a4a4a2a2  a2a2a6a6a6a6  a2a2a2a2a2a2  a4a4a4a4a6a6  a4a4a4a4a4a4  a6a6a6a6a6a6  cccccc
Player 1: $a_6 > a_5 > a_2 > a_1 > a_3 > a_4 > c$; player 2: $a_3 > a_2 > a_6 > a_4 > a_5 > c$; $a_6 > a_1 > c$.
(The white discs and black squares marking best responses in the original figure are not reproduced.)
Figure 3: The normal form $g_2$ of the positional game structure $\mathcal{G}_2$ from Figure 1. There are two players controlling 3 positions each. Again, in every position there are only two options: to terminate ($t$) or proceed ($p$). Hence, in $\mathcal{G}_2$, each player has 8 strategies, which are naturally coded by the 3-letter words in the alphabet $\{t, p\}$. Respectively, $g_2$ is represented by the 8 × 8 table, each entry of which contains 6 terminals corresponding to the 6 (nonterminal) potential initial positions $v_1, \ldots, v_6$ of $\mathcal{G}_2$. Again, players 1 and 2 control the rows and columns, respectively. The corresponding game $(g_2, u)$ has no uniform NE whenever a utility function $u : I \times A \to \mathbb{R}$ satisfies the constraints $U_2$ specified under the table. Indeed, a (unique) uniformly best response of the player 1 (resp. 2) to each strategy of 2 (resp. 1) is shown by the white discs (resp. black squares). Since the obtained two sets are disjoint, no uniform NE exists in $(g_2, u)$.

Theorem 5. Let $f : I \times X \times (V \setminus V_T) \to \mathbb{R}$ be the normal form of a (noninitialized) Chess-like game $(G, D, u)$. Given a player $i \in I$ and a situation $x \in X$, there is a (pure positional) strategy $x_i^* \in X_i$ which is a BR of $i$ in $x$ for all initial positions $v_0 \in V \setminus V_T$ simultaneously.
We will call such a strategy $x_i^*$ a uniformly BR of the player $i$ in the situation $x$. Obviously, the nonstrict inequality $f_v(i, x) \leq f_v(i, x^*)$ holds for each position $v \in V$. We will say that $x_i^*$ improves $x$ if this inequality is strict, $f_{v_0}(i, x) < f_{v_0}(i, x^*)$, for at least one $v_0 \in V$. This statement will serve as the definition of a uniform improvement for the noninitialized case. Let us remark that, by this definition, a situation $x \in X$ is a uniform NE if and only if $x$ can be uniformly improved by no player $i \in I$; in other words, any sequence of uniform improvements either can be extended or terminates in a uniform NE.
For completeness, let us repeat here the simple proof of Theorem 5 suggested in [1].
Given a noninitialized Chess-like game $\mathcal{G} = (G, D, u)$, a player $i \in I$, and a strategy profile $x \in X$, in every position $v \in V \setminus (V_i \cup V_T)$ let us fix a move $(v, v')$ in accordance with $x$ and delete all other moves. Then, let us order $A$ according to the preference $u_i = u(i, \ast)$. Let $a^1 \in A$ be a best outcome. (Note that there might be several such outcomes and also that $a^1 = c$ might hold.) Let $V^1$ denote the set of positions from which player $i$ can reach $a^1$ (in particular, $a^1 \in V^1$). Let us fix corresponding moves in $V^1 \cap V_i$. Obviously, there is no move to $V^1$ from $V \setminus V^1$. Moreover, if $a^1 = c$, then player $i$ cannot reach a dicycle beginning from $V \setminus V^1$; in particular, the induced digraph $G_1 = G[V \setminus V^1]$ contains no dicycle.
Then, let us consider an outcome $a^2$ that is the best for $i$ in $A$, except maybe $a^1$, and repeat the same arguments as above for $G_1$ and $a^2$, and so forth. This procedure will result in a uniformly BR $x_i^*$ of $i$ in $x$ since the chosen moves of $i$ are optimal independently of $v_0$.
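The statement of Theorem 5 can be checked by brute force on a small instance. The sketch below does this for the two-person six-cycle of Figure 1 and does not implement the layered procedure of the proof; it simply enumerates all 8 words of a player. The preference orders in `RANK` are hypothetical strict orders chosen only for illustration, and all identifiers (`outcome`, `profile`, `uniform_best_responses`, and so on) are assumptions of this example.

```python
from itertools import product

def outcome(profile, start):
    """Terminal index (1..m) reached from 0-based position `start`, or 'c'."""
    m = len(profile)
    for step in range(m):
        k = (start + step) % m
        if profile[k] == 't':
            return k + 1
    return 'c'

# Hypothetical strict preference orders (best first), for illustration only.
RANK = {1: [6, 5, 2, 1, 3, 4, 'c'], 2: [3, 2, 6, 1, 4, 5, 'c']}
U = {i: {a: len(r) - k for k, a in enumerate(r)} for i, r in RANK.items()}
CONTROLS = {1: (0, 2, 4), 2: (1, 3, 5)}   # 0-based positions of players 1 and 2
WORDS = list(product('tp', repeat=3))     # the 8 pure strategies of a player

def profile(word1, word2):
    """Interleave the 3-letter words of both players into a 6-letter profile."""
    x = [None] * 6
    for i, word in ((1, word1), (2, word2)):
        for pos, move in zip(CONTROLS[i], word):
            x[pos] = move
    return tuple(x)

def values(i, word1, word2):
    """Payoffs of player i for all six initial positions."""
    x = profile(word1, word2)
    return [U[i][outcome(x, v)] for v in range(6)]

def uniform_best_responses(i, opponent_word):
    """Words of player i that are optimal from every initial position at once."""
    def vals(w):
        return values(i, w, opponent_word) if i == 1 else values(i, opponent_word, w)
    best = [max(col) for col in zip(*(vals(w) for w in WORDS))]
    return [w for w in WORDS if vals(w) == best]

print(uniform_best_responses(2, ('t', 't', 't')))   # [('p', 't', 't')]
```

With strict preferences, every opponent word admits exactly one such word in this example, which gives 8 + 8 = 16 pairs in total.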
1.8. Two Open Problems Related to Nash-Solvability of Initialized Chess-Like Game Structures. Given an initialized game structure $\mathcal{G} = (G, D, v_0)$, it is an open question whether an NE (in pure positional strategies) exists for every utility function $u$. In [4], the problem was raised and solved in the affirmative for two special cases: $|I| \leq 2$ or $|A| \leq 3$. The last result was strengthened to $|A| \leq 4$ in [7]. More details can be found in [1] and in the last section of [6].
In general the above problem is still open even if we assume that $c$ is the worst outcome for all players.
Yet, if we additionally assume that $\mathcal{G}$ is play-once (i.e., $|V_i| = 1$ for each $i \in I$), then the answer is positive [4]. However, in the next subsection we will show that it becomes negative if we ask for the existence of a uniform NE rather than an initialized one.
1.9. Chess-Like Games with a Unique Dicycle and without Uniform Nash Equilibria in Pure Positional Strategies. Let us consider two noninitialized Chess-like positional game structures $\mathcal{G}_1$ and $\mathcal{G}_2$ given in Figure 1. For $j = 1, 2$, the corresponding digraph $G_j = (V_j, E_j)$ consists of a unique dicycle $C_j$ of length $3j$ and a matching connecting each vertex $v^j_\ell$ of $C_j$ to a terminal $a^j_\ell$, where $\ell = 1, \ldots, 3j$ and $j = 1, 2$. The digraph $G_2$ is bipartite; respectively, $\mathcal{G}_2$ is a two-person game structure in which two players take turns; in other words, players 1 and 2 control positions $v_1, v_3, v_5$ and $v_2, v_4, v_6$, respectively. In contrast, $\mathcal{G}_1$ is a play-once three-person game structure, that is, each player controls a unique position. In every nonterminal position $v^j_\ell$ there are only two moves: one of them ($t$) immediately terminates in $a^j_\ell$, while the other one ($p$) proceeds to $v^j_{\ell+1}$; by convention, we assume $3j + 1 = 1$.
Remark 6. In Figure 1, the symbols $a^j_\ell$ for the terminal positions are shown but $v^j_\ell$ for the corresponding positions of the dicycle are skipped; moreover, in Figures 1–3, we omit the superscript $j$ in $a^j_\ell$, for simplicity and to save space.
Thus, in $\mathcal{G}_1$ each player has two strategies coded by the letters $t$ and $p$, while in $\mathcal{G}_2$ each player has 8 strategies coded by the 3-letter words in the alphabet $\{t, p\}$. For example, the strategy ($tpt$) of player 2 in $\mathcal{G}_2$ requires to proceed to $v^2_5$ from $v^2_4$ and to terminate in $a^2_2$ from $v^2_2$ and in $a^2_6$ from $v^2_6$.
The corresponding normal game forms $g_1$ and $g_2$ of size 2 × 2 × 2 and 8 × 8 are shown in Figures 2 and 3, respectively. Since both game structures are noninitialized, each situation is a set of 3 and 6 terminals, respectively. These terminals correspond to the nonterminal positions of $\mathcal{G}_1$ and $\mathcal{G}_2$, each of which can serve as an initial position.
A uniform NE free example for $\mathcal{G}_1$ was suggested in [4]; see also [1, 8]. Let us consider a family $U_1$ of the utility functions defined by the following constraints:
\[
\begin{aligned}
u(1, a_2) &> u(1, a_1) > u(1, a_3) > u(1, c), \\
u(2, a_3) &> u(2, a_2) > u(2, a_1) > u(2, c), \\
u(3, a_1) &> u(3, a_3) > u(3, a_2) > u(3, c).
\end{aligned}
\tag{2}
\]
In other words, for each player $i \in I = \{1, 2, 3\}$ to terminate is an average outcome; it is better (worse) when the next (previous) player terminates; finally, if nobody does, then the dicycle $c$ appears, which is the worst outcome for all. The considered game has an improvement cycle of length 6, which is shown in Figure 2. Indeed, let player 1 terminate at $a_1$, while 2 and 3 proceed. The corresponding situation $(a_1, a_1, a_1)$ can be improved by 2 to $(a_1, a_2, a_1)$, which in its turn can be improved by 1 to $(a_2, a_2, a_2)$. Repeating the similar procedure two more times we obtain the improvement cycle shown in Figure 2.
There are two more situations, which result in $(a_1, a_2, a_3)$ and $(c, c, c)$. They appear when all three players terminate or proceed simultaneously. Yet, none of these two situations is an NE either. Moreover, each of them can be improved by every player $i \in I = \{1, 2, 3\}$.
Thus, the following negative result holds, which we recall without proof from [4]; see also [1].
Theorem 7. Game $(\mathcal{G}_1, u)$ has no uniform NE in pure strategies whenever $u \in U_1$.
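Since the three-person game has only 8 pure situations, the claim of Theorem 7 can be checked exhaustively for a concrete utility function. The Python sketch below does so for one hypothetical member of the family defined by (2); the numeric payoffs 3, 2, 1, 0 and the names `outcome`, `U`, and `is_uniform_ne` are assumptions of this illustration.

```python
from itertools import product

def outcome(profile, start):
    """Terminal index (1..m) reached from 0-based position `start`, or 'c'."""
    m = len(profile)
    for step in range(m):
        k = (start + step) % m
        if profile[k] == 't':
            return k + 1
    return 'c'

# One hypothetical utility function satisfying the constraints (2):
# own terminal is average, the next player's terminal is best,
# the previous player's terminal is worse, and the cycle c is worst.
U = {1: {2: 3, 1: 2, 3: 1, 'c': 0},
     2: {3: 3, 2: 2, 1: 1, 'c': 0},
     3: {1: 3, 3: 2, 2: 1, 'c': 0}}

def is_uniform_ne(x):
    """True if no single player gains from any initial position by switching."""
    m = len(x)
    for i in range(m):                      # player i + 1 controls position i
        flipped = 'p' if x[i] == 't' else 't'
        y = x[:i] + (flipped,) + x[i + 1:]
        for v in range(m):
            if U[i + 1][outcome(y, v)] > U[i + 1][outcome(x, v)]:
                return False
    return True

uniform_ne = [x for x in product('tp', repeat=3) if is_uniform_ne(x)]
print(uniform_ne)   # []
```

Only the order of the payoffs matters for this check, so any other numbers respecting (2) give the same empty list.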
We note that each player has positive payoffs. This is without loss of generality as we can shift the payoffs by a positive constant without changing the game.
A similar two-person uniform NE-free example was suggested in [1], for $\mathcal{G}_2$. Let us consider a family $U_2$ of the utility functions defined by the following constraints:
\[
\begin{aligned}
u(1, a_6) &> u(1, a_5) > u(1, a_2) > u(1, a_1) > u(1, a_3) > u(1, a_4) > u(1, c), \\
u(2, a_3) &> u(2, a_2) > u(2, a_6) > u(2, a_4) > u(2, a_5) > u(2, c), \\
u(2, a_6) &> u(2, a_1) > u(2, c).
\end{aligned}
\tag{3}
\]
We claim that the Chess-like game $(\mathcal{G}_2, u)$ has no uniform NE whenever $u \in U_2$.
Let us remark that $|U_2| = 3$ and that $c$ is the worst outcome for both players for all $u \in U_2$. To verify this, let us consider the normal form $g_2$ in Figure 3. By Theorem 5, there is a uniformly BR of player 2 to each strategy of player 1 and vice versa. It is not difficult to check that the obtained two sets of the BRs (which are denoted by the white discs and black squares in Figure 3) are disjoint. Hence, there is no uniform NE. Furthermore, it is not difficult to verify that the obtained 16 situations induce an improvement cycle of length 10 and two improvement paths of lengths 2 and 4 that end in this cycle.
Theorem 8 (see [1]). Game $(\mathcal{G}_2, u)$ has no uniform NE in pure strategies whenever $u \in U_2$.
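The two-person example has 64 pure situations, so the definition of a uniform NE can again be tested directly. The following Python sketch uses one hypothetical linear order for player 2 that is compatible with (3) (the position of $a_1$ between $a_6$ and $a_4$ is an arbitrary choice of this example), and all identifiers are assumptions made for illustration.

```python
from itertools import product

def outcome(profile, start):
    """Terminal index (1..m) reached from 0-based position `start`, or 'c'."""
    m = len(profile)
    for step in range(m):
        k = (start + step) % m
        if profile[k] == 't':
            return k + 1
    return 'c'

# Preference orders, best first. Player 1 follows (3) exactly; for player 2
# the terminal a_1 is placed just below a_6, one of the orders allowed by (3).
RANK = {1: [6, 5, 2, 1, 3, 4, 'c'], 2: [3, 2, 6, 1, 4, 5, 'c']}
U = {i: {a: len(r) - k for k, a in enumerate(r)} for i, r in RANK.items()}
CONTROLS = {1: (0, 2, 4), 2: (1, 3, 5)}   # 0-based positions of each player

def deviations(x, player):
    """All profiles obtained when `player` replaces its 3-letter word."""
    for word in product('tp', repeat=3):
        y = list(x)
        for pos, move in zip(CONTROLS[player], word):
            y[pos] = move
        yield tuple(y)

def is_uniform_ne(x):
    """True if no player gains from any initial position by changing its word."""
    return not any(U[i][outcome(y, v)] > U[i][outcome(x, v)]
                   for i in (1, 2)
                   for y in deviations(x, i)
                   for v in range(6))

uniform_ne = [x for x in product('tp', repeat=6) if is_uniform_ne(x)]
print(uniform_ne)   # []
```

The check covers a single utility function; the theorem itself is stated for the whole family.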
The goal of the present paper is to demonstrate that the above two game structures may have no uniform NE not only in pure but also in mixed strategies. Let us note that by Nash's theorem [21, 22] NE in mixed strategies exist in any initialized game structure. Yet, this result cannot be extended to the noninitialized game structure and uniform NE. In this research we are motivated by the results of [8, 11].
1.10. Mixed and Independently Mixed Strategies. Standardly, a mixed strategy $y_i$ of a player $i \in I$ is defined as a probabilistic distribution over the set $X_i$ of his pure strategies. Furthermore, $y_i$ is called an independently mixed strategy if $i$ randomizes in his positions $v \in V_i$ independently. We will denote by $Y_i$ and by $Z_i \subseteq Y_i$ the sets of mixed and independently mixed strategies of player $i \in I$, respectively.
Remark 9. Let us recall that the players are restricted to their positional strategies and let us also note that the latter concept is closely related to the so-called behavioral strategies introduced by Kuhn [19, 20]. Although Kuhn restricted himself to trees, his construction can be extended to directed graphs, too.
Let us recall that a game structure is called play-once if each player is in control of a unique position. For example, $\mathcal{G}_1$ is play-once. Obviously, the classes of mixed and independently mixed strategies coincide for a play-once game structure. However, for $\mathcal{G}_2$ these two notions differ. Each player $i \in I = \{1, 2\}$ controls 3 positions and has 8 pure strategies. Hence, the set of mixed strategies $Y_i$ is of dimension 7, while the set $Z_i \subseteq Y_i$ of the independently mixed strategies is only 3-dimensional.
2. Markovian and A Priori Realizations
For the independently mixed strategies we will consider two different options.
For every player $i \in I$ let us consider a probability distribution $P^i_v$ for all positions $v \in V_i$, which assigns a probability $p(v, v')$ to each move $(v, v')$ from $v \in V_i$, standardly assuming
\[ 0 \leq p(v, v') \leq 1, \qquad \sum_{v' \in V} p(v, v') = 1, \qquad p(v, v') = 0 \text{ whenever } (v, v') \notin E. \tag{4} \]
Now, the limit distributions of the terminals $A = \{a_1, \ldots, a_m, a_\infty\}$ can be defined in two ways, which will be referred to as the Markovian and a priori realizations.
The first approach is classical; the limit distribution can be found by solving an $m \times m$ system of linear equations; see, for example, [27] and also [26].
For example, let us consider $\mathcal{G}_1$ and let $p_j$ be the probability to proceed in $v_j$ for $j = 1, 2, 3$. If $p_1 = p_2 = p_3 = 1$, then, obviously, the play will cycle with probability 1 resulting in the limit distribution $(0, 0, 0, 1)$ for $(a_1, a_2, a_3, c)$. Otherwise, assuming that $v_1$ is the initial position, we obtain the limit distribution:
\[ \left( \frac{1 - p_1}{1 - p_1 p_2 p_3}, \; \frac{p_1 (1 - p_2)}{1 - p_1 p_2 p_3}, \; \frac{p_1 p_2 (1 - p_3)}{1 - p_1 p_2 p_3}, \; 0 \right). \tag{5} \]
Indeed, positions $v_1, v_2, v_3$ are transient and the probability of cycling forever is 0 whenever $p_1 p_2 p_3 < 1$. Obviously, the sum of the above four probabilities is 1.
The Markovian approach assumes that for $t = 0, 1, \ldots$ the move $e(t) = (v(t), v(t + 1))$ is chosen randomly, in accordance with the distribution $P_{v(t)}$, and independently for all $t$ (furthermore, $v(0) = v_0$ is a fixed initial position). In particular, if the play comes to the same position again, that is, $v = v(t) = v(t')$ for some $t < t'$, then the moves $e(t)$ and $e(t')$ may be distinct although they are chosen (independently) with the same distribution $P_v$.
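The linear-system computation mentioned for the Markovian realization can be made concrete for the three-cycle. In the sketch below, `h[k][a]` denotes the probability of ending in terminal `a` when the play starts in position `k`; these quantities satisfy one linear equation per nonterminal position. The exact Gauss-Jordan solver, the function names, and the sample probabilities are assumptions of this illustration, and the code presumes that the product of the proceed probabilities is less than 1 (otherwise the system is singular).

```python
from fractions import Fraction as Fr

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination over exact rationals."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def markov_limit(p):
    """Row k: distribution over terminals a_1..a_m for initial position k.

    Solves (I - Q) H = R, where Q moves k -> k+1 with probability p[k]
    and R sends k to its own terminal with probability 1 - p[k].
    """
    m = len(p)
    A = [[Fr(int(r == c)) - (p[r] if c == (r + 1) % m else 0)
          for c in range(m)] for r in range(m)]
    R = [[(1 - p[r]) if c == r else Fr(0) for c in range(m)] for r in range(m)]
    return solve(A, R)

p = (Fr(1, 2), Fr(1, 3), Fr(1, 4))
H = markov_limit(p)
print(H[0])   # [Fraction(12, 23), Fraction(8, 23), Fraction(3, 23)]
```

The first row agrees with formula (5) for these values, since $1 - p_1 p_2 p_3 = 23/24$.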
The concept of a priori realization is based on the following alternative assumptions. A move $(v, v')$ is chosen according to $P_v$, independently for all $v \in V \setminus V_T$, but only once, before the game starts. Being chosen, the move $(v, v')$ is applied whenever the play comes to $v$. By these assumptions, each infinite play $\ell$ is a lasso; that is, it consists of an initial part (that might be empty) and an infinitely repeated dicycle $c_\ell$. Alternatively, $\ell$ may be finite; that is, it terminates in $V_T$. In both cases, $\ell$ begins in $v_0$ and the probability of $\ell$ is the product of the probabilities of all its moves, $P_\ell = \prod_{e \in \ell} p(e)$. In this way, we obtain a probability distribution on the set of lassos of the digraph. In particular, the effective payoff is defined as the expected payoff for the corresponding lassos.
Let us also note that (in contrast to the Markovian case) the computation of the limit distribution is not computationally efficient, since the set of plays may grow exponentially in the size of the digraph. No polynomial algorithm computing the limit distribution is known for a priori realizations. Returning to our example $\mathcal{G}_1$, we obtain the following limit distribution:
\[ \bigl( 1 - p_1, \; p_1 (1 - p_2), \; p_1 p_2 (1 - p_3), \; p_1 p_2 p_3 \bigr) \text{ for the outcomes } (a_1, a_2, a_3, c), \tag{6} \]
with initial position $v_1$. The probability of outcome $c$ is $p_1 p_2 p_3$; it is strictly positive whenever $p_i > 0$ for all $i \in I$. Indeed, in contrast to the Markovian realization, the cycle will be repeated infinitely whenever it appears once under a priori realization.
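For the three-cycle, the a priori limit distribution can be obtained by listing every way of fixing the moves in advance. The sketch below enumerates all pure realizations, weights each by the product of the probabilities of its moves, and accumulates the weight on the resulting outcome; summing over complete realizations rather than over lassos is a choice of this example (moves never visited simply sum out). The function names and sample probabilities are assumptions, and the enumeration is exponential in the number of positions, in line with the remark on efficiency above.

```python
from fractions import Fraction as Fr
from itertools import product

def outcome(profile, start):
    """Terminal index (1..m) reached from 0-based position `start`, or 'c'."""
    m = len(profile)
    for step in range(m):
        k = (start + step) % m
        if profile[k] == 't':
            return k + 1
    return 'c'

def a_priori_limit(p, start=0):
    """Distribution over outcomes when each position fixes its move once."""
    m = len(p)
    dist = {a: Fr(0) for a in list(range(1, m + 1)) + ['c']}
    for x in product('tp', repeat=m):
        weight = Fr(1)
        for k, move in enumerate(x):
            weight *= p[k] if move == 'p' else 1 - p[k]
        dist[outcome(x, start)] += weight
    return dist

p = (Fr(1, 2), Fr(1, 3), Fr(1, 4))
dist = a_priori_limit(p)
print(dist['c'])   # 1/24
```

For these values the result is $(1/2, 1/3, 1/8, 1/24)$, which matches (6); in particular the cycle keeps the positive probability $p_1 p_2 p_3$.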
Remark 10. Thus, solving the Chess-like games in the independently mixed strategies looks more natural under a priori (rather than Markovian) realizations. Unfortunately, it seems not that easy to suggest more applications of a priori realizations and we have to acknowledge that the concept of the Markovian realization is much more fruitful. Let us also note that playing in pure strategies can be viewed as a special case of both Markovian and a priori realizations with degenerate probability distributions.
As we already mentioned, the mixed and independently mixed strategies coincide for $\mathcal{G}_1$ since it is play-once. Yet, these two classes of strategies differ in $\mathcal{G}_2$.
3. Chess-Like Games with No Uniform NE
In the present paper, we will strengthen Theorems 7 and 8, showing that games $(\mathcal{G}_1, u)$ and $(\mathcal{G}_2, u)$ may fail to have an NE (not only in pure, but even) in mixed strategies, as well as in the independently mixed strategies, under both Markovian and a priori realizations.
For convenience, let $J = \{1, \ldots, m\}$ denote the set of indices of nonterminal positions. We will refer to positions giving only these indices.
Let us recall the definition of payoff function $f_{v_0}(i, x)$ of player $i$ for the initial position $v_0$ and the strategy profile $x$; see Theorem 5. Let us extend this definition introducing the payoff function for the mixed and independently mixed strategies. In both cases, we define it as the expected payoff, under one of the above realizations, and denote by $F_{v_0}(i, p)$, where $p$ is an $m$-vector whose $j$th coordinate $p_j$ is the probability of proceeding (not terminating) at position $j \in J$.
Remark 11. Let us observe that, in both $\mathcal{G}_1$ and $\mathcal{G}_2$, the payoff functions $F_{v_0}(i, p)$, $i \in I$, are continuously differentiable functions of $p_j$ when $0 < p_j < 1$ for all $j \in J$, for all players $i \in I$. Hence, if $p$ is a uniform NE such that $0 < p_j < 1$ for all $j \in J$ (under either a priori or Markovian realization), then
\[ \frac{\partial F_{v_0}(i, p)}{\partial p_j} = 0, \quad \forall i \in I, \; j \in J, \; v_0 \in V. \tag{7} \]
In the next two sections, we will construct games that have no uniform NE under both a priori and Markovian realizations. Assuming that a uniform mixed NE exists, we will obtain a contradiction with (7) whenever $0 < p_j < 1$ for all $j \in J$.
3.1. $(\mathcal{G}_1, u)$ Examples. The next lemma will be instrumental in the proofs of the following two theorems.
Lemma 12. The probabilities to proceed satisfy $0 < p_j < 1$ for all $j \in J = \{1, 2, 3\}$ in any independently mixed uniform NE in game $(\mathcal{G}_1, u)$, where $u \in U_1$, and under both a priori and Markovian realizations.
Proof. Let us assume indirectly that there is an (independently) mixed uniform NE under a priori realization with $p_j = 0$ for some $j \in J$. This would imply the existence of an acyclic game with uniform NE, in contradiction with Theorem 7. Now let us consider the case $p_j = 1$. Due to the circular symmetry of $(\mathcal{G}_1, u)$, we can choose any player, say, $j = 1$. The preference list of player 3 is $u(3, a_1) > u(3, a_3) > u(3, a_2) > u(3, c)$. His most favorable outcome, $a_1$, is not achievable since $p_1 = 1$. Hence, $p_3 = 0$ because his second best outcome is $a_3$. Thus, the game is reduced to an acyclic one, in contradiction with Theorem 7, again.
Theorem 13. Game $(\mathcal{G}_1, u)$ has no uniform NE in independently mixed strategies under a priori realization whenever $u \in U_1$.
Proof. To simplify our notation we denote by $j+$ and $j-$ the following and preceding positions along the 3-cycle of $\mathcal{G}_1$, respectively. Assume indirectly that $(p_1, p_2, p_3)$ forms a uniform NE and consider the effective payoff of player 1:
\[ F_j(1, p) = (1 - p_j)\, u(1, a_j) + p_j (1 - p_{j+})\, u(1, a_{j+}) + p_j p_{j+} (1 - p_{j-})\, u(1, a_{j-}) + p_j p_{j+} p_{j-}\, u(1, c), \tag{8} \]
where $j$ is the initial position.
By Lemma 12, we must have $0 < p_j < 1$ for $j \in J = \{1, 2, 3\}$. Therefore (7) must hold. Hence, $\partial F_j(1, p) / \partial p_{j-} = p_j p_{j+} (u(1, c) - u(1, a_{j-})) = 0$ and $p_j p_{j+} = 0$ follows since $u(1, a_{j-}) > u(1, c)$. Thus, $p_1 p_2 p_3 = 0$, in contradiction to our assumption.
Let us recall that for $\mathcal{G}_1$, independently mixed strategies and mixed strategies are the same.
Now, let us consider the Markovian realization. Game $(\mathcal{G}_1, u)$ may have no NE in mixed strategies under Markovian realization either, yet, only for some special payoffs $u \in U_1$.
Theorem 14. Game (G1, π’), with π’ β π
1, has no uniform NE
in independentlymixed strategies underMarkovian realizationif and only if π
1π2π3β₯ 1, where
π1=
π’ (1, π2) β π’ (1, π
1)
π’ (1, π1) β π’ (1, π
3)
,
π2=
π’ (2, π3) β π’ (2, π
2)
π’ (2, π2) β π’ (2, π
1)
,
π3=
π’ (3, π1) β π’ (3, π
3)
π’ (3, π3) β π’ (3, π
2)
.
(9)
It is easy to verify that $\mu_i > 0$ for $i = 1, 2, 3$ whenever $u \in U_1$. Let us also note that in the symmetric case $\mu_1 = \mu_2 = \mu_3 = \mu$ the above condition $\mu_1 \mu_2 \mu_3 \ge 1$ turns into $\mu \ge 1$.
Proof. Let $p = (p_1, p_2, p_3)$ be a uniform NE in the game $(\mathcal{G}_1, u)$ under Markovian realization. Then, by Lemma 12, $0 < p_i < 1$ for $i \in I = \{1, 2, 3\}$. The payoff function of a player, with respect to the initial position that this player controls, is given by one of the next three formulas:
$$
\begin{aligned}
F_1(1, p) &= \frac{(1 - p_1)\,u(1, a_1) + p_1 (1 - p_2)\,u(1, a_2) + p_1 p_2 (1 - p_3)\,u(1, a_3)}{1 - p_1 p_2 p_3},\\
F_2(2, p) &= \frac{(1 - p_2)\,u(2, a_2) + p_2 (1 - p_3)\,u(2, a_3) + p_2 p_3 (1 - p_1)\,u(2, a_1)}{1 - p_1 p_2 p_3},\\
F_3(3, p) &= \frac{(1 - p_3)\,u(3, a_3) + p_3 (1 - p_1)\,u(3, a_1) + p_3 p_1 (1 - p_2)\,u(3, a_2)}{1 - p_1 p_2 p_3}.
\end{aligned}
\tag{10}
$$
By Lemma 12, (7) holds for any uniform NE. Therefore we have
$$
\begin{aligned}
(1 - p_1 p_2 p_3)^2\,\frac{\partial F_1(1, p)}{\partial p_1} &= p_2 (1 - p_3)\,u(1, a_3) + (p_2 p_3 - 1)\,u(1, a_1) + (1 - p_2)\,u(1, a_2) = 0,\\
(1 - p_1 p_2 p_3)^2\,\frac{\partial F_2(2, p)}{\partial p_2} &= p_3 (1 - p_1)\,u(2, a_1) + (p_1 p_3 - 1)\,u(2, a_2) + (1 - p_3)\,u(2, a_3) = 0,\\
(1 - p_1 p_2 p_3)^2\,\frac{\partial F_3(3, p)}{\partial p_3} &= p_1 (1 - p_2)\,u(3, a_2) + (p_1 p_2 - 1)\,u(3, a_3) + (1 - p_1)\,u(3, a_1) = 0.
\end{aligned}
\tag{11}
$$
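The first line of (11) can be checked numerically against the payoff formula (10). The sketch below is only an illustration under assumed inputs: the function names, the sample point `p`, and the payoff values are hypothetical. It compares the closed-form expression with a central finite difference of $F_1(1, p)$ scaled by $(1 - p_1 p_2 p_3)^2$.

```python
def markov_payoff_1(p, u):
    """F_1(1, p) from (10); u[k] stands for u(1, a_k)."""
    p1, p2, p3 = p
    numerator = (1 - p1) * u[1] + p1 * (1 - p2) * u[2] + p1 * p2 * (1 - p3) * u[3]
    return numerator / (1 - p1 * p2 * p3)


def scaled_derivative_closed_form(p, u):
    """Right-hand side of the first line of (11)."""
    _, p2, p3 = p
    return p2 * (1 - p3) * u[3] + (p2 * p3 - 1) * u[1] + (1 - p2) * u[2]


def scaled_derivative_numeric(p, u, h=1e-6):
    """(1 - p1 p2 p3)^2 times a central-difference estimate of dF_1/dp_1."""
    p1, p2, p3 = p
    slope = (
        markov_payoff_1((p1 + h, p2, p3), u) - markov_payoff_1((p1 - h, p2, p3), u)
    ) / (2 * h)
    return (1 - p1 * p2 * p3) ** 2 * slope


# Hypothetical sample point and payoffs for player 1 (a2 > a1 > a3).
p = (0.3, 0.5, 0.7)
u = {1: 2.0, 2: 3.0, 3: 1.0}
closed = scaled_derivative_closed_form(p, u)
numeric = scaled_derivative_numeric(p, u)
print(closed, numeric)
```

The two values agree up to finite-difference error, which is consistent with the quotient-rule computation behind (11). The remaining two lines follow by the cyclic relabelling of players and positions.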
Setting $\lambda_i = \mu_i + 1$ for $i = 1, 2, 3$, we can transform the above equations to the following form:
$$
\begin{aligned}
\lambda_1 (1 - p_2) &= 1 - p_2 p_3,\\
\lambda_2 (1 - p_3) &= 1 - p_1 p_3,\\
\lambda_3 (1 - p_1) &= 1 - p_1 p_2.
\end{aligned}
\tag{12}
$$
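Assuming $0 < p_i < 1$, $i \in J$, and using successive elimination, we uniquely express $p$ via $\lambda$ as follows:
$$
\begin{aligned}
0 < p_1 &= \frac{\lambda_2 + \lambda_3 - \lambda_1 \lambda_2 - \lambda_2 \lambda_3 + \lambda_1 \lambda_2 \lambda_3 - 1}{\lambda_1 \lambda_3 - \lambda_1 + 1} < 1,\\
0 < p_2 &= \frac{\lambda_1 + \lambda_3 - \lambda_1 \lambda_3 - \lambda_2 \lambda_3 + \lambda_1 \lambda_2 \lambda_3 - 1}{\lambda_1 \lambda_2 - \lambda_2 + 1} < 1,\\
0 < p_3 &= \frac{\lambda_1 + \lambda_2 - \lambda_1 \lambda_2 - \lambda_1 \lambda_3 + \lambda_1 \lambda_2 \lambda_3 - 1}{\lambda_2 \lambda_3 - \lambda_3 + 1} < 1.
\end{aligned}
\tag{13}
$$
Interestingly, all three inequalities $p_i < 1$ are equivalent to the condition $(\lambda_1 - 1)(\lambda_2 - 1)(\lambda_3 - 1) < 1$, that is, $\mu_1 \mu_2 \mu_3 < 1$, which completes the proof.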
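The closed form (13) can be sanity-checked by substituting it back into (12). The sketch below uses exact rational arithmetic; the function names and the sample values of $\mu$ are hypothetical and serve only as an illustration of the elimination step.

```python
from fractions import Fraction


def proceed_probabilities(lam):
    """Evaluate (13) for lam = (lambda_1, lambda_2, lambda_3)."""
    l1, l2, l3 = lam
    t = l1 * l2 * l3 - 1  # shared tail of the three numerators
    p1 = (l2 + l3 - l1 * l2 - l2 * l3 + t) / (l1 * l3 - l1 + 1)
    p2 = (l1 + l3 - l1 * l3 - l2 * l3 + t) / (l1 * l2 - l2 + 1)
    p3 = (l1 + l2 - l1 * l2 - l1 * l3 + t) / (l2 * l3 - l3 + 1)
    return p1, p2, p3


def residuals_of_12(lam, p):
    """Left side minus right side for each equation in (12)."""
    l1, l2, l3 = lam
    p1, p2, p3 = p
    return (
        l1 * (1 - p2) - (1 - p2 * p3),
        l2 * (1 - p3) - (1 - p1 * p3),
        l3 * (1 - p1) - (1 - p1 * p2),
    )


# Hypothetical values with mu_1 * mu_2 * mu_3 < 1.
mu = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 4))
lam = tuple(m + 1 for m in mu)
p = proceed_probabilities(lam)
print(p, residuals_of_12(lam, p))

# Hypothetical symmetric values with mu_1 * mu_2 * mu_3 >= 1.
p_large = proceed_probabilities((Fraction(3), Fraction(3), Fraction(3)))
print(p_large)
```

For the first sample the residuals vanish and all three probabilities lie strictly between 0 and 1; for the symmetric sample with $\mu = 2$ the formula gives $p_i = 2$, which is not a probability, in line with the condition $\mu_1 \mu_2 \mu_3 < 1$.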
3.2. $(\mathcal{G}_2, u)$ Examples. Here we will show that $(\mathcal{G}_2, u)$ may have no uniform NE for both Markovian and a priori realizations, in independently mixed strategies, whenever $u \in U_2$. As for the mixed (unlike the independently mixed) strategies, we obtain NE-free examples only for some (not for all) $u \in U_2$.
We begin with extending Lemma 12 to game $(\mathcal{G}_2, u)$ and $u \in U_2$ as follows.
Lemma 15. The probabilities to proceed satisfy $0 < p_i < 1$ for all $i \in J = \{1, 2, 3, 4, 5, 6\}$ in any independently mixed uniform NE in game $(\mathcal{G}_2, u)$, where $u \in U_2$, and under both a priori and Markovian realizations.
Proof. To prove that $p_i < 1$ for all $i \in J$ let us consider the following six cases:
(i) If $p_1 = 1$, then player 2 will proceed at position 6, as $a_2 > a_6$ in the preference list of player 2, implying $p_6 = 1$.
(ii) If $p_2 = 1$, then either $p_1 = 0$ or $p_3 = 1$, as player 1 prefers $a_1$ to $a_3$.
(iii) If $p_3 = 1$, then $p_2 = 0$, as player 2 cannot achieve his best outcome of $a_3$, while $a_2$ is his second best one.
(iv) If $p_4 = 1$, then $p_3 = 1$, as player 1's worst outcome is $a_3$ in the current situation.
(v) If $p_5 = 1$, then $p_4 = 1$, as player 2 prefers $a_6$ to $a_4$.
(vi) If $p_6 = 1$, then $p_5 = 0$, as player 1's best outcome is $a_5$ now.
It is easy to verify that, by the above implications, in all six cases at least one of the proceeding probabilities should be 0, in contradiction to Theorem 8.
Let us show that the game $(\mathcal{G}_2, u)$ might have no NE in independently mixed strategies under both Markovian and a priori realizations. Let us consider the Markovian one first.
Theorem 16. Game $(\mathcal{G}_2, u)$ has no uniform NE in independently mixed strategies under Markovian realization for all $u \in U_2$.
Proof. Let us consider the uniform NE conditions for player 2. Lemma 15 implies that (7) must be satisfied. Applying it to the partial derivatives with respect to $p_4$ and $p_6$ we obtain
$$
\begin{aligned}
\frac{(1 - p_1 p_2 p_3 p_4 p_5 p_6)^2}{p_1 p_2 p_3 p_4 p_5}\,\frac{\partial F_1(2, p)}{\partial p_6}
&= (1 - p_1)\,u(2, a_1) + p_1 (1 - p_2)\,u(2, a_2) + p_1 p_2 (1 - p_3)\,u(2, a_3)\\
&\quad + p_1 p_2 p_3 (1 - p_4)\,u(2, a_4) + p_1 p_2 p_3 p_4 (1 - p_5)\,u(2, a_5)\\
&\quad - (1 - p_1 p_2 p_3 p_4 p_5)\,u(2, a_6) = 0,\\
\frac{(1 - p_1 p_2 p_3 p_4 p_5 p_6)^2}{p_1 p_2 p_3 p_5 p_6}\,\frac{\partial F_5(2, p)}{\partial p_4}
&= (1 - p_5)\,u(2, a_5) + p_5 (1 - p_6)\,u(2, a_6) + p_5 p_6 (1 - p_1)\,u(2, a_1)\\
&\quad + p_5 p_6 p_1 (1 - p_2)\,u(2, a_2) + p_5 p_6 p_1 p_2 (1 - p_3)\,u(2, a_3)\\
&\quad - (1 - p_1 p_2 p_3 p_5 p_6)\,u(2, a_4) = 0.
\end{aligned}
\tag{14}
$$
Let us multiply the first equation by $p_5 p_6$ and subtract it from the second one, yielding
$$
(1 - p_1 p_2 p_3 p_4 p_5 p_6)\,[-u(2, a_4) + (1 - p_5)\,u(2, a_5) + p_5\,u(2, a_6)] = 0, \tag{15}
$$
or equivalently, $u(2, a_4) - (1 - p_5)\,u(2, a_5) - p_5\,u(2, a_6) = 0$. From this equation, we find
$$
p_5 = \frac{u(2, a_4) - u(2, a_5)}{u(2, a_6) - u(2, a_5)}. \tag{16}
$$
Furthermore, the condition $0 < p_5 < 1$ implies that either $u(2, a_5) < u(2, a_4) < u(2, a_6)$ or $u(2, a_5) > u(2, a_4) > u(2, a_6)$. Both orders contradict the preference list of player 2, thus completing the proof.
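The elimination step from (14) to (15) is purely algebraic, so it can be spot-checked with exact rational arithmetic. The sketch below is an illustration only: the function names and the dictionary layout (`p[k]` for $p_k$, `u[k]` for $u(2, a_k)$) are assumptions, and the sample values are drawn at random rather than taken from the paper.

```python
from fractions import Fraction
import random


def bracket_start1(p, u):
    """Right-hand side of the first equation in (14) (initial position 1)."""
    total, reach = Fraction(0), Fraction(1)
    for k in (1, 2, 3, 4, 5):
        total += reach * (1 - p[k]) * u[k]  # play stops at a_k
        reach *= p[k]                       # play proceeds past position k
    return total - (1 - reach) * u[6]


def bracket_start5(p, u):
    """Right-hand side of the second equation in (14) (initial position 5)."""
    total, reach = Fraction(0), Fraction(1)
    for k in (5, 6, 1, 2, 3):
        total += reach * (1 - p[k]) * u[k]
        reach *= p[k]
    return total - (1 - reach) * u[4]


def identity_15_gap(p, u):
    """(second - p5*p6*first) minus the left-hand side of (15)."""
    prod_all = Fraction(1)
    for k in range(1, 7):
        prod_all *= p[k]
    lhs = bracket_start5(p, u) - p[5] * p[6] * bracket_start1(p, u)
    rhs = (1 - prod_all) * (-u[4] + (1 - p[5]) * u[5] + p[5] * u[6])
    return lhs - rhs


rng = random.Random(0)
gaps = []
for _ in range(100):
    p = {k: Fraction(rng.randint(1, 99), 100) for k in range(1, 7)}
    u = {k: Fraction(rng.randint(1, 100)) for k in range(1, 7)}
    gaps.append(identity_15_gap(p, u))
print(all(g == 0 for g in gaps))
```

The gap is exactly zero for every sample, which confirms that the terms in $u(2, a_1)$, $u(2, a_2)$, and $u(2, a_3)$ cancel and that the common factor $1 - p_1 p_2 p_3 p_4 p_5 p_6$ can be pulled out.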
Now let us consider the case of a priori realization.
Theorem 17. Game $(\mathcal{G}_2, u)$ has no uniform NE in independently mixed strategies under a priori realization for all $u \in U_2$.
Proof. Let us assume indirectly that $p = (p_1, p_2, p_3, p_4, p_5, p_6)$ forms a uniform NE. Let us consider the effective payoff of player 1 with respect to the initial position 2:
$$
\begin{aligned}
F_2(1, p) &= (1 - p_2)\,u(1, a_2) + p_2 (1 - p_3)\,u(1, a_3) + p_2 p_3 (1 - p_4)\,u(1, a_4)\\
&\quad + p_2 p_3 p_4 (1 - p_5)\,u(1, a_5) + p_2 p_3 p_4 p_5 (1 - p_6)\,u(1, a_6)\\
&\quad + p_2 p_3 p_4 p_5 p_6 (1 - p_1)\,u(1, a_1).
\end{aligned}
\tag{17}
$$
By Lemma 15, we have $0 < p_i < 1$ for $i \in J = \{1, 2, 3, 4, 5, 6\}$. Hence, (7) must hold; in particular, $\partial F_2(1, p)/\partial p_1 = 0$ and, since $u \in U_2$ is positive, we obtain $p_2 p_3 p_4 p_5 p_6 = 0$, which is a contradiction.
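The derivative used in this last step can be written out explicitly, since $p_1$ enters (17) only through its final summand. The display below merely restates that step, reading "positive" as $u(1, a_1) > 0$; it is not an additional result.

```latex
\[
\frac{\partial F_2(1,p)}{\partial p_1}
  = -\,p_2\, p_3\, p_4\, p_5\, p_6\; u(1,a_1) = 0
\quad\Longrightarrow\quad
p_2\, p_3\, p_4\, p_5\, p_6 = 0
\quad\text{because } u(1,a_1) > 0 .
\]
```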
The last result can be extended from the independently mixed to mixed strategies. However, the corresponding example is constructed not for all but only for some $u \in U_2$.
Theorem 18. The game $(\mathcal{G}_2, u)$ has no uniform NE in mixed strategies, at least for some $u \in U_2$.
Proof. Let us recall that there are two players in $\mathcal{G}_2$ controlling three positions each and there are two possible moves in every position. Thus, each player has eight pure strategies. Standardly, the mixed strategies are defined as probability distributions on the set of the pure strategies, that is, $x, y \in \mathbb{S}_8$, where $z = (z_1, \ldots, z_8) \in \mathbb{S}_8$ if and only if $\sum_{i=1}^{8} z_i = 1$ and $z \ge 0$.
Furthermore, let us denote by $a_{\ell r}(v_0)$ the outcome of the game beginning in the initial position $v_0 \in V$ in case when player 1 chooses his pure strategy $\ell$ and player 2 chooses her pure strategy $r$, where $\ell, r \in \{1, \ldots, 8\}$.
Given a utility function $u : I \times A \to \mathbb{R}$, if a pair of mixed strategies $x, y \in \mathbb{S}_8$ forms a uniform NE then
$$
\sum_{\ell=1}^{8} x_\ell\, u(2, a_{\ell r}(v_0))
\begin{cases}
= z_{v_0}, & \text{if } y_r > 0,\\
\le z_{v_0}, & \text{otherwise}
\end{cases}
\tag{18}
$$
must hold for some value $z_{v_0}$ for all initial positions $v_0 \in V$. Indeed, otherwise player 2 would change the probability distribution $y$ to get a better value. Let $S = \{r \mid y_r > 0\}$ denote the set of indices of all positive components of $y \in \mathbb{S}_8$. By (18), there exists a subset $S \subseteq \{1, \ldots, 8\}$ such that the next system is feasible:
$$
\begin{aligned}
\sum_{\ell=1}^{8} x_\ell\, u(2, a_{\ell r}(v_0)) &= z_{v_0}, && \forall r \in S,\\
\sum_{\ell=1}^{8} x_\ell\, u(2, a_{\ell r}(v_0)) &\le z_{v_0}, && \forall r \notin S,\\
\sum_{\ell=1}^{8} x_\ell &= 1,\\
x_\ell &\ge 0, && \forall \ell = 1, \ldots, 8,\\
z_{v_0} &\ \text{unrestricted}, && \forall v_0 \in V.
\end{aligned}
\tag{19}
$$
Then, let us consider, for example, a utility function $u \in U_2$ with the following payoffs of player 2:
$$
\begin{aligned}
&u(2, a_1) = 43, \quad u(2, a_2) = 81, \quad u(2, a_3) = 93,\\
&u(2, a_4) = 50, \quad u(2, a_5) = 15, \quad u(2, a_6) = 80,\\
&u(2, c) = 0.
\end{aligned}
\tag{20}
$$
We verified that (19) is infeasible for all subsets $S \subseteq \{1, \ldots, 8\}$ such that $|S| \ge 2$. Since for any $u \in U_2$ there is no pure strategy NE either, we obtain a contradiction.
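To make the objects in (18) concrete, the following sketch enumerates the eight pure strategies of each player and checks condition (18) for a given pair $(x, y)$. It is only an illustration under stated assumptions: player 1 is taken to control positions 1, 3, 5 and player 2 positions 2, 4, 6 (as suggested by the cases in the proof of Lemma 15), each position offers the two moves "terminate in $a_k$" and "proceed to the next position on the 6-cycle", and the value 80 is read as $u(2, a_6)$. All function names are hypothetical, and the sketch does not reproduce the infeasibility test of (19), which would additionally require a linear programming solver.

```python
from itertools import product

PLAYER1_POSITIONS = (1, 3, 5)  # assumption: player 1 moves at odd positions
PLAYER2_POSITIONS = (2, 4, 6)  # assumption: player 2 moves at even positions


def pure_strategies(positions):
    """All 2^3 = 8 maps position -> True (proceed) / False (terminate)."""
    return [dict(zip(positions, choice))
            for choice in product((False, True), repeat=len(positions))]


def outcome(v0, s1, s2):
    """Index k of the terminal a_k reached from v0, or "c" if play cycles."""
    proceed = {**s1, **s2}
    v = v0
    for _ in range(6):
        if not proceed[v]:
            return v
        v = v % 6 + 1  # next position on the 6-cycle
    return "c"


def column_values(x, u2, v0):
    """sum_l x_l * u(2, a_lr(v0)) for every pure strategy r of player 2."""
    rows = pure_strategies(PLAYER1_POSITIONS)
    cols = pure_strategies(PLAYER2_POSITIONS)
    return [sum(x_l * u2[outcome(v0, s1, s2)] for x_l, s1 in zip(x, rows))
            for s2 in cols]


def satisfies_18(x, y, u2, tol=1e-9):
    """Check (18) for player 2 at every initial position v0."""
    for v0 in range(1, 7):
        values = column_values(x, u2, v0)
        z = max(values)  # (18) forces z_{v0} to be the best attainable value
        if any(y_r > 0 and values[r] < z - tol for r, y_r in enumerate(y)):
            return False
    return True


# Payoffs of player 2 as in (20), with 80 read as u(2, a_6) (assumed reading).
U2_EXAMPLE = {1: 43, 2: 81, 3: 93, 4: 50, 5: 15, 6: 80, "c": 0}

x_stop = [1, 0, 0, 0, 0, 0, 0, 0]   # player 1 terminates everywhere
y_reply = [0, 0, 0, 0, 1, 0, 0, 0]  # player 2 proceeds at 2, stops at 4 and 6
y_stop = [1, 0, 0, 0, 0, 0, 0, 0]   # player 2 terminates everywhere

print(satisfies_18(x_stop, y_reply, U2_EXAMPLE))
print(satisfies_18(x_stop, y_stop, U2_EXAMPLE))
```

Condition (18) concerns only player 2's side of the equilibrium, so passing this check does not by itself make $(x, y)$ a uniform NE; the symmetric condition for player 1 would have to hold as well.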
3.3. Concluding Remarks
Remark 19. In the last two theorems, in contrast with Theorem 14, uniform NE exist for no $u \in U_2$.
Remark 20. Let us note that Nash's results [21, 22], guaranteeing the existence of an NE in mixed strategies for any normal form game, are applicable in case of a fixed initial position. Yet, our results show that Nash's theorem, in general, does not extend to the case of uniform NE, except for the $n$-person acyclic case [12, 19, 20] and the two-person zero-sum case.
Remark 21. It seems that the same holds for all $u \in U_2$. We tested (19) for many randomly chosen $u \in U_2$ and encountered infeasibility for all $S \subseteq \{1, \ldots, 8\}$ such that $|S| \ge 2$. Yet, we have no proof and it still remains open whether for any $u \in U_2$ there is no NE in mixed strategies.
Remark 22. Finally, let us note that for an arbitrary Chess-like game structure (not only for $\mathcal{G}_1$ and $\mathcal{G}_2$) in independently mixed strategies under both the Markovian and a priori realizations for any $i \in I$ and $j, k \in J$, the ratio $(\partial F_j(i, p)/\partial p_i)/(\partial F_k(i, p)/\partial p_i) = P(i, j, k)$ is a positive constant.
Acknowledgments
The first and third authors acknowledge the partial support by the NSF Grants IIS-1161476 and also CMMI-0856663. The second author is thankful to János Flesch for helpful discussions. All authors are also thankful to an anonymous reviewer for many helpful remarks and suggestions.
References
[1] E. Boros, K. Elbassioni, V. Gurvich, and K. Makino, “On Nash equilibria and improvement cycles in pure positional strategies for Chess-like and Backgammon-like n-person games,” Discrete Mathematics, vol. 312, no. 4, pp. 772–788, 2012.
[2] D. Andersson, V. Gurvich, and T. D. Hansen, “On acyclicity of games with cycles,” Discrete Applied Mathematics, vol. 158, no. 10, pp. 1049–1063, 2010.
[3] D. Andersson, V. Gurvich, and T. D. Hansen, “On acyclicity of games with cycles,” in Algorithmic Aspects in Information and Management, vol. 5564, pp. 15–28, 2009.
[4] E. Boros and V. Gurvich, “On Nash-solvability in pure stationary strategies of finite games with perfect information which may have cycles,” Mathematical Social Sciences, vol. 46, no. 2, pp. 207–241, 2003.
[5] E. Boros and V. Gurvich, “Why chess and backgammon can be solved in pure positional uniformly optimal strategies,” RUTCOR Research Report 21-2009, Rutgers University.
[6] E. Boros, V. Gurvich, K. Makino, and W. Shao, “Nash-solvable two-person symmetric cycle game forms,” Discrete Applied Mathematics, vol. 159, no. 15, pp. 1461–1487, 2011.
[7] E. Boros and R. Rand, “Terminal games with three terminals have proper Nash equilibria,” RUTCOR Research Report RRR-22-2009, Rutgers University.
[8] J. Flesch, J. Kuipers, G. Schoenmakers, and O. J. Vrieze, “Subgame-perfect equilibria in free transition games,” Research Memorandum RM/08/027, University of Maastricht, Maastricht, The Netherlands, 2008.
[9] J. Flesch, J. Kuipers, G. Schoenmakers, and O. J. Vrieze, “Subgame-perfection equilibria in stochastic games with perfect information and recursive payoffs,” Research Memorandum RM/08/041, University of Maastricht, Maastricht, The Netherlands, 2008.
[10] J. Kuipers, J. Flesch, G. Schoenmakers, and K. Vrieze, “Pure subgame-perfect equilibria in free transition games,” European Journal of Operational Research, vol. 199, no. 2, pp. 442–447, 2009.
[11] J. Flesch, J. Kuipers, G. Schoenmakers, and K. Vrieze, “Subgame perfection in positive recursive games with perfect information,” Mathematics of Operations Research, vol. 35, no. 1, pp. 193–207, 2010.
[12] D. Gale, “A theory of N-person games with perfect information,” Proceedings of the National Academy of Sciences, vol. 39, no. 6, pp. 496–501, 1953.
[13] V. A. Gurvich, “On theory of multistep games,” USSR Computational Mathematics and Mathematical Physics, vol. 13, no. 6, pp. 143–161, 1973.
[14] V. A. Gurvich, “The solvability of positional games in pure strategies,” USSR Computational Mathematics and Mathematical Physics, vol. 15, no. 2, pp. 74–87, 1975.
[15] V. Gurvich, “Equilibrium in pure strategies,” Soviet Mathematics, vol. 38, no. 3, pp. 597–602, 1989.
[16] V. Gurvich, “A stochastic game with complete information and without equilibrium situations in pure stationary strategies,” Russian Mathematical Surveys, vol. 43, no. 2, pp. 171–172, 1988.
[17] V. Gurvich, “A theorem on the existence of equilibrium situations in pure stationary strategies for ergodic extensions of (2 × k) bimatrix games,” Russian Mathematical Surveys, vol. 45, no. 4, pp. 170–172, 1990.
[18] V. Gurvich, “Saddle point in pure strategies,” Russian Academy of Science Doklady Mathematics, vol. 42, no. 2, pp. 497–501, 1990.
[19] H. Kuhn, “Extensive games,” Proceedings of the National Academy of Sciences, vol. 36, pp. 286–295, 1950.
[20] H. Kuhn, “Extensive games and the problems of information,” Annals of Mathematics Studies, vol. 28, pp. 193–216, 1953.
[21] J. Nash, “Equilibrium points in n-person games,” Proceedings of the National Academy of Sciences, vol. 36, no. 1, pp. 48–49, 1950.
[22] J. Nash, “Non-cooperative games,” Annals of Mathematics, vol. 54, no. 2, pp. 286–295, 1951.
[23] A. Condon, “An algorithm for simple stochastic games,” in Advances in Computational Complexity Theory, vol. 13 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 1993.
[24] I. V. Romanovsky, “On the solvability of Bellman's functional equation for a Markovian decision process,” Journal of Mathematical Analysis and Applications, vol. 42, no. 2, pp. 485–498, 1973.
[25] R. A. Howard, Dynamic Programming and Markov Processes, The M.I.T. Press, 1960.
[26] H. Mine and S. Osaki, Markovian Decision Process, American Elsevier, New York, NY, USA, 1970.
[27] J. G. Kemeny and J. L. Snell, Finite Markov Chains, Springer, 1960.