[ieee 2011 wireless advanced (wiad) (formerly known as spwc) - london, united kingdom...

Evolutionary Coalitional Games in Network Selection

Manzoor Ahmed Khan*, Hamidou Tembine†
*DAI-Labor, Technical University Berlin, Germany
†SUPELEC, France

Abstract: The widespread use of heterogeneous wireless technologies, their integration, the advent of multi-mode terminals, and the envisioned user (service) centricity enable users to associate with the best available networks according to user preferences and application-specific requirements. When it comes to user-centric network selection, operators seeking to increase their user pool may offer different incentives to motivate users to form coalitions. In this paper, we present a novel approach to coalition network selection. We use evolutionary game theory to model this problem. We also examine fully distributed algorithms for global optima in network selection games. Then, the problems of dynamic network formation and evolutionary coalitional games in network selection are investigated. This article also points out open research issues, new interests, and developments in this interdisciplinary field.

I. INTRODUCTION

The business models of telecommunication operators have traditionally been based on the concept of the so-called closed garden: they operate strictly in closed infrastructures and base their revenue-generating models on their capacity to retain a set of customers and effectively establish technological and economic barriers that prevent or discourage users from utilizing services and resources offered by other operators. The advent and evolution of wireless communication have significantly influenced the telecommunication business model. One example is the 4th Generation (4G), which is envisioned as a path to solving the still unaddressed issues of the previous generations and as a convergence platform for a wide variety of new services, from high-quality voice to high-definition video, through high-data-rate wireless channels. At this stage we may claim that we are now beyond the angle of Technology X vs. Technology Y, and recognize a world with a burgeoning variety of uses of wireless broadband capacity, which will require all kinds of spectrum bands, technologies, and devices. The giant operators and the new entrant wireless operators are now interested in more flexible architectures that will enable them to keep their options open, or to run parallel systems according to their spectrum and applications. For operators, the most straightforward way to cope with the changing telecommunication paradigm is to integrate the already deployed (or new) technologies of different characteristics and strive to increase their user pool for different extended services. Clearly, while this requires strong economic reasoning and models on the one hand, the technical feasibility of such integration is also a critical issue on the other. The research literature addressing such problems can broadly be classified into two categories: work presenting numerical or analytical solutions without going much into the architectural details, e.g., [9], and work concentrating purely on the architectural solution, e.g., [5]. We also believe that a user-centric vision will be a mandatory evolution trend in future all-IP 4G networks, as it represents the most efficient way to ensure an Always Best Connectivity (ABC) service.

A common understanding of the term “best” in ABC takes different definitions dictated by user preferences. Network selection may take into account various static and dynamic parameters, e.g., delay, bandwidth, packet loss ratio, mobile user velocity, service pricing, user preferences, application type, etc. Intuitively, the consequences of the ABC vision are:

• QoS handover - Such handovers are executed mainly because of quality degradation at the current point of attachment or a change in user preferences over the application QoS.

• Area handover - Such handovers are executed as a consequence of user mobility out of the coverage of the current point of attachment.

In the user-centric paradigm, where users are enabled to dynamically select the network following the ABC vision, the user terminal triggers the vertical decision mechanism for executing the handover decision. The handover decision is driven by a user satisfaction function. It is important to note that the user decision on the association to a point of attachment is differentiated with respect to hierarchical levels, i.e., the technology level (the lower level) and the operator level (the higher level). For the sake of simplicity, we focus on the operator level in this work. The former level is self-explanatory, whereas the latter level gives the operator the opportunity to make use of all the available underlying network technologies. In the latter case, the default assumption is that the user is interested only in attaining the required QoE and has no preferences over specific network technologies. This further allows operators to implement different load balancing and resource utilization mechanisms, e.g., load sharing among their different access technologies (see [8], [7]).

II. RELATED WORK

Most of the earlier research literature on user-centric network selection focuses on the network selection decision of a single user, e.g., [6], [10], [20], whereas the research literature on cooperation at the operator level (e.g., [11], [4], [19]) is mostly network-centric and does not consider the user-centric paradigm. It is envisioned that in the future communication paradigm, users will be able to form coalitions, thus opening a new front, which we term coalitional network selection.

For over two decades, classical game theory has been intensively applied in wireless networking and communications (see [1] and the references therein).

However, as wireless nodes become mobile, more autonomous, self-organizing, and self-configuring, and as access networks become more decentralized, the associated mathematical tools need to be adapted. Since one-shot game models (cooperative or not) allow neither strategy updates nor error corrections, game theorists have proposed dynamic game theory. This includes stochastic games, evolutionary games, differential games, and their ramifications. Dynamic game theory is a more appropriate framework and better captures the behavior of the envisioned future wireless networks, where randomness, time delays, and uncertainty are present. One dynamic game-theoretic modeling approach is evolutionary game theory (EGT). It goes back at least to the works of Fisher (1930), Hamilton (1964), and Maynard Smith (1972). For more details on evolutionary game dynamics, we refer to [13]. Following our argument on the need for evolutionary game theory in future wireless networks, we focus on the application of dynamic games to cooperative user behavior in the future user-centric scenario. This dictates the use of coalitional evolutionary game theory for modeling the user-centric coalition formation and coalition network selection problems. Many researchers are currently developing evolutionary game theory based schemes for large-scale wireless networks that describe the evolution of cooperation, evolutionary network formation, dynamic network security, the evolution of protocols, network neutrality, and random graph-based topologies and architectures.

In terms of applications, we are confident in claiming that current research is restricted to applying standard coalitional game models and techniques to study very limited aspects of cooperation in wireless networks. At the same time, the implementation of coalitional games in large-scale wireless networks encounters several challenges, such as appropriate modeling, efficiency, stability, signalling reduction, complexity, fairness, incomplete information, and mobility management.

To the best of the authors' knowledge, this work is among the first contributions that focus on user-centric coalition network selection and apply evolutionary game-theoretic concepts to the user-centric network selection approach. We also briefly discuss coalition formation at the operator level.

We make use of dynamic coalitional games for the proposed coalition network selection approach. The dynamic coalitional game is a very promising tool for designing fair, robust, practical, and efficient adaptive coalitional strategies in wireless networks where the network conditions evolve dynamically (e.g., backoff state, channel state, arrival of users, departure of users, etc.). Evolutionary coalitional game modeling is one of the dynamic coalitional game frameworks. It is a very powerful framework inspired by biology, genetics, and evolutionary ecology.

As in standard coalitional games, the central concept of an evolutionary coalitional game is that of a coalition, i.e., a subset of players that join forces and decide to act together (jointly): the players form coalitions to obtain optimal payoffs, and players from the same group can negotiate collectively. With each subset of players one associates a value, which defines a function called the characteristic function. In wireless networks, one needs to take into consideration the cost of communication and the cost of bargaining for a joint decision, so that the value is a function of both the benefit and the cost.

A. Contribution

Our contribution can be summarized as follows. First, we present the evolutionary ingredients that constitute the fundamentals of evolutionary coalitional games, as well as their potential applications in wireless networking and communications in general, and in coalition network selection in particular.

Second, we provide a better understanding of the current research issues in this emerging direction.

Finally, we investigate the pertinent design constraints and outline the use of evolutionary dynamics tools to meet certain design objectives.

B. Organization

The remainder of the article is organized as follows. In the next section, we present the basic ingredients of evolutionary game theory. Then, we give an overview of EGT in wireless networking. After that, we provide the ingredients of evolutionary coalitional games and illustrate the evolution of coalitions and network formation in spectrum access.

III. BASIC INGREDIENTS OF EGT

The basic ingredients of evolutionary (coalitional or not) game theory are the solution concepts (equilibrium and its refinements, such as evolutionary stability) and evolutionary game dynamics. Below we describe these two key elements.

A. Equilibrium and refinement

1) Equilibrium concepts: Evolutionary networking games in large systems provide a simple framework for describing strategic interactions among a large number of players (mobile terminals, base stations, access points, users, etc.). Traditionally, predictions of behavior and outcome in game theory are based on some notion of equilibrium, typically Cournot equilibrium (Cournot, 1838), Bertrand equilibrium (Bertrand, 1883), conjectural variation (Bowley, 1924), Stackelberg solution (Stackelberg, 1934), Nash equilibrium (Nash, 1951), Wardrop equilibrium (Wardrop, 1952), mean field equilibrium, or some refinement and/or extension thereof. Most of these notions require the assumption of equilibrium knowledge, which assumes that each player correctly anticipates how the other players will react. The equilibrium knowledge assumption is too strong and is difficult to justify, in particular in contexts with a large number of users in dense networks.


2) Evolutionary stability: An evolutionarily stable state or strategy (ESS) is a population profile which, if adopted by a population of players, cannot be invaded by any alternative population profile of small size. An ESS is a refinement of the Nash equilibrium: it is a Nash equilibrium that is evolutionarily stable, meaning that once it is fixed in a population, it is resilient to invasion by a small fraction of the population. Other refinements have been explored in evolutionary games: unbeatable state, neutrally stable state, non-invadable state, risk-dominant or payoff-dominant state, correlated evolutionarily stable state, evolutionarily stable set, stochastically stable state, continuously stable state, and global evolutionarily stable state (GESS, the analogue of ESS for multiple populations).

B. Evolutionary game dynamics

As an alternative to the equilibrium approach, the evolutionary game approach proposes an explicitly dynamic model of choice, in which players myopically update their behavior in response to their current strategic environment. This dynamic procedure does not assume the automatic coordination of players' actions and beliefs, and it can accommodate many players' actions and transition rates. These procedures are specified formally by defining a revision of pure strategies called a revision protocol. A revision protocol takes the current costs (expected performance) and the system state as arguments; its outputs are conditional switch rates, which describe how frequently players in some class who play a given strategy and are considering switching strategies switch to another strategy, given the current expected cost vector and subpopulation state. This revision of pure strategies is flexible enough to incorporate a wide variety of paradigms, including ones based on learning, imitation, adaptation, optimization, etc. The revision of pure strategies describes the procedures players follow in adapting their behavior to a dynamically evolving environment such as evolving networks (Internet traffic, flow control, etc.). Simple evolutionary game dynamics include replicator dynamics, Brown-von Neumann-Nash dynamics, fictitious play, adaptive dynamics, imitate-the-better dynamics, best response dynamics, better-reply dynamics, logit (or Boltzmann-Gibbs or log-linear) dynamics, Smith dynamics, projection dynamics, gradient methods, generating (G-)function based dynamics, evolutionary game dynamics with diffusion, evolutionary game dynamics with migration, and spatial evolutionary game dynamics with migration and time delays.
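To make the notion of a revision protocol concrete, the following minimal sketch maps conditional switch rates to the resulting mean dynamics (inflow minus outflow of population mass). It is an illustration under stated assumptions, not the authors' implementation: it works with payoffs rather than costs, the function names and the numerical payoffs are hypothetical, and the chosen protocol is pairwise proportional imitation, whose mean dynamics are known to coincide with the replicator dynamics.

```python
import numpy as np

def mean_dynamics(x, payoffs, protocol):
    """Inflow minus outflow of population mass for each strategy.

    x        : population shares over strategies (sums to 1)
    payoffs  : payoff of each strategy at state x
    protocol : function (x, payoffs) -> matrix rho, where rho[i, j] is the
               conditional switch rate from strategy i to strategy j
    """
    rho = protocol(x, payoffs)
    inflow = x @ rho                 # sum_j x[j] * rho[j, i]
    outflow = x * rho.sum(axis=1)    # x[i] * sum_j rho[i, j]
    return inflow - outflow

def imitation_protocol(x, payoffs):
    """Pairwise proportional imitation (assumed protocol): meet a j-player
    with probability x[j] and switch from i to j at a rate proportional to
    the payoff gain, if that gain is positive."""
    gain = payoffs[None, :] - payoffs[:, None]   # gain[i, j] = payoffs[j] - payoffs[i]
    return x[None, :] * np.maximum(gain, 0.0)

def replicator(x, payoffs):
    """Replicator dynamics: growth proportional to payoff above the average."""
    return x * (payoffs - x @ payoffs)

x = np.array([0.2, 0.5, 0.3])
payoffs = np.array([1.0, 0.4, 2.0])   # hypothetical payoffs at state x
drift = mean_dynamics(x, payoffs, imitation_protocol)
```

Swapping `imitation_protocol` for another rule (for example, one that compares payoffs without the `x[j]` meeting probability) yields a different member of the family of dynamics listed above, while `mean_dynamics` stays unchanged.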

IV. EGT IN WIRELESS NETWORKING

A. Delayed evolutionary game dynamics

Evolutionary games in wireless networks have been studied in [18], with particular emphasis on the application of delayed evolutionary game dynamics. The idea of time-delayed payoffs in dynamic games is the following: it is usually assumed that the users (players) receive their payoffs instantaneously. However, in many realistic scenarios, the observations are delayed by some time units. In the context of wireless networks, this can be due to feedback delays, noise, propagation delays, etc. To capture this phenomenon, time delays have been introduced into evolutionary game dynamics. This means that an action taken today will have its effect only after some time delay; therefore, the payoffs are delayed. The idea has been applied to access control and power control in both IEEE 802.16 OFDMA-based wireless networks (WiMAX) and Code Division Multiple Access (CDMA) based networks. See [16] for more details.
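As a rough illustration of delayed payoffs, the sketch below integrates a replicator equation in which the payoff term is evaluated at a delayed state. Everything specific here is an assumption made for illustration: the game is a hypothetical symmetric two-channel access game in which a user succeeds only when its opponent picks the other channel (so the payoff advantage of the first channel is 1 - 2x), the delay value is arbitrary, and a simple forward-Euler scheme with a constant initial history is used.

```python
def delayed_replicator(x0, tau, horizon=40.0, dt=0.01):
    """Euler integration of x'(t) = x(t) (1 - x(t)) (1 - 2 x(t - tau)).

    x is the share of users choosing the first channel; the payoff term
    (1 - 2x) is evaluated at the delayed state x(t - tau). The history on
    [-tau, 0] is held constant at x0 (an assumption of this sketch).
    """
    lag = int(round(tau / dt))            # delay expressed in time steps
    path = [x0] * (lag + 1)               # constant initial history
    for _ in range(int(round(horizon / dt))):
        x_now, x_delayed = path[-1], path[-1 - lag]
        path.append(x_now + dt * x_now * (1.0 - x_now) * (1.0 - 2.0 * x_delayed))
    return path

trajectory = delayed_replicator(x0=0.3, tau=0.5)
final_share = trajectory[-1]
```

For a small delay such as the one used here, the trajectory settles at the interior point 1/2; trying larger values of `tau` is a simple way to explore how the delay affects the behavior of the dynamics.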

B. Convergence issue

Evolutionary game dynamics provide a powerful tool for predicting the outcome of a game. Convergence to equilibria has been established in many classes of games. These include generic two-users-two-actions games, common interest games, potential games, sub/super-modular games, many classes of aggregative games, games with a unique evolutionarily stable strategy, stable games, games with monotone payoffs, etc.; see [13]. As a consequence, the convergence of several networking problems, including parallel routing, routing with M/M/1 cost (Poisson arrival process, exponentially distributed service time, and single-server queue), network congestion games, network selection games, power allocation games, rate control, spectrum access games, resource sharing games, and many others, can be investigated through evolutionary game dynamics for both linear and non-linear payoff functions.

C. Selection issue

A fundamental question we address now is the selection problem (equilibrium, Pareto optimal solutions, global optimum, etc.) in a fully distributed way (minimal signalling to the users, no message exchange, no recommendation, etc.).

1) How to select an efficient outcome?: Selecting a global optimum in a fully decentralized way is a very challenging problem. To the best of the authors' knowledge, very little is known in this direction. In the next section we provide examples of games where fully distributed reinforcement learning algorithms lead to evolutionary game dynamics that converge to global optima. In general, however, convergence of evolutionary game dynamics is examined for equilibria.

2) How to select a stable outcome?: Another important question is the stability or instability of the system. How can one design fast algorithms such that the network behaves well after some iterations? This question translates into the requirement that, when the time horizon is large, the system behaves like a stationary point, and that after any small perturbation around this point the system quickly comes back to it owing to losses. It is important to mention that the two properties, stability and efficiency, may not be compatible in some situations.

D. Fully distributed reinforcement learning

From most evolutionary game dynamics one can construct fully distributed learning algorithms in which each user adapts its strategy iteratively without knowing the mathematical expression of its payoff function: only a numerical measurement is observed by the user, as is usual in machine learning. Each player then needs to learn its own payoff function and its optimal strategy in parallel. This type of learning scheme is called combined fully distributed payoff and strategy reinforcement learning (CODIPAS-RL, [16], [17]). Using standard stochastic approximation tools, the asymptotic pseudo-trajectories of these schemes can be related to evolutionary game dynamics, which may be time-dependent, delayed, and noisy. The basic CODIPAS-RL updating scheme has the following form:

$$\text{New strategy} \longleftarrow \text{Old strategy} + \text{Step size} \times (\text{Learning rule} - \text{Old strategy}) \tag{1}$$

$$\text{New estimate} \longleftarrow \text{Old estimate} + \text{Step size} \times (\text{Target} - \text{Old estimate}) \tag{2}$$

where the target and the “learning rule” play the role of the current strategy and the current measurement/observation. The expression [Target - Old estimate] is an error in the estimation. It is reduced by taking a step toward the target. The target is presumed to indicate a desirable direction in which to move. Well-known examples of strategy learning are based on the Bush & Mosteller (1955, [3]) learning schemes and their variations. These schemes have been extended to stochastic games using Bellman's dynamic programming (Bellman, 1952, [2]) and the Shapley principle (Shapley, 1953, [14]). More recent applications to engineering can be found in [17]. Most often, the limiting behavior of these stochastic iterative schemes is related to well-known evolutionary game dynamics.

TABLE I
STRATEGIC FORM REPRESENTATION OF 2 NODES - 2 CHOICES

1 \ 2    f1                  f2
f1       (0,0): collision    (1,1): success
f2       (1,1): success      (0,0): collision
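The update pattern in (1) and (2) can be sketched for a single player as follows. This is a minimal illustration, not the CODIPAS-RL algorithm of [16], [17] itself: the Boltzmann-Gibbs learning rule, the temperature, the step sizes, and the noisy two-action payoff model are all assumptions chosen for the example, and the function name `codipas_step` is hypothetical.

```python
import math
import random

def codipas_step(strategy, estimates, action, observed_payoff,
                 strategy_step, estimate_step, temperature=0.1):
    """One update in the shape of (1) and (2) for a single player."""
    # (2): move the payoff estimate of the played action toward the observation.
    new_estimates = list(estimates)
    new_estimates[action] += estimate_step * (observed_payoff - estimates[action])

    # Learning rule (an assumed choice): Boltzmann-Gibbs weights of the estimates.
    weights = [math.exp(e / temperature) for e in new_estimates]
    total = sum(weights)
    rule = [w / total for w in weights]

    # (1): move the mixed strategy toward the learning rule.
    new_strategy = [p + strategy_step * (r - p) for p, r in zip(strategy, rule)]
    return new_strategy, new_estimates

rng = random.Random(0)
strategy, estimates = [0.5, 0.5], [0.0, 0.0]
mean_payoff = [0.2, 0.8]                      # hypothetical, unknown to the player
for t in range(1, 2001):
    action = 0 if rng.random() < strategy[0] else 1
    payoff = mean_payoff[action] + rng.uniform(-0.1, 0.1)   # noisy measurement
    strategy, estimates = codipas_step(strategy, estimates, action, payoff,
                                       strategy_step=1.0 / (5 + t),
                                       estimate_step=0.1)
```

The player only ever uses its own action and its own numerical measurement, which is the sense in which the scheme is fully distributed. Because each strategy update is a convex combination of two probability vectors, the strategy remains a valid mixed strategy throughout.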

E. Application of evolutionary games to the network selection problem

In this section, we detail the application of evolutionary game theory to our network selection problem. We start with the relatively simple case of selecting among Access Points (APs) operating at different frequencies, and then model the novel coalition network selection at two levels: i) the operator level, the configuration in which operators form coalition(s), and ii) the user level, the configuration in which users form coalitions. Both configurations are detailed later in this section.

1) Simplified AP selection problem: Let us consider a scenario in which a user is in the coverage area of two APs operating at different frequencies. We assume that the access technologies are interference limited, i.e., a user associating with an AP affects the quality experienced by the users already associated with this AP. To convey the crux of the proposed approach, we simplify the problem to a two-player game; note, however, that the model and solution still scale to $n$ users. We assume that each user can select one AP at a time. When both users select the same AP at the same time instant, we assume, without loss of generality, that both users end up with zero utility. The zero utility can be interpreted as the consequence of collision and data loss. Table I gives the matrix form of this scenario.

As can be observed, the game has two pure equilibria, $(f_1, f_2)$ and $(f_2, f_1)$, and one fully mixed equilibrium $(\frac{1}{2}, \frac{1}{2})$, which is also an evolutionarily stable strategy in the sense that it is resilient to deviations by a small change of investment. The fully mixed equilibrium is less efficient in terms of social welfare. The pure equilibria are strong equilibria (robust to deviations by coalitions of any size). The two pure equilibria are maxmin solutions in the sense that the minimum payoff of the users is maximized. It is easy to see that the two pure equilibria are also global optima. Now, a natural question is,
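These claims can be checked by brute force on the game of Table I. The sketch below enumerates the pure Nash equilibria and compares the social welfare (sum of payoffs) of a pure equilibrium with that of the fully mixed one; the helper names are hypothetical, and the payoffs are exactly those of Table I (1 for each user when the choices differ, 0 on collision).

```python
from itertools import product

# Payoff of each user in Table I: 1 when the two choices differ, 0 on collision.
CHOICES = ("f1", "f2")

def payoff(own, other):
    return 1.0 if own != other else 0.0

def is_pure_nash(a1, a2):
    """No user gains by unilaterally switching channel."""
    best1 = max(payoff(d, a2) for d in CHOICES)
    best2 = max(payoff(d, a1) for d in CHOICES)
    return payoff(a1, a2) >= best1 and payoff(a2, a1) >= best2

pure_equilibria = [p for p in product(CHOICES, repeat=2) if is_pure_nash(*p)]

def expected_payoff(x, y):
    """Expected payoff of a user when users 1 and 2 pick f1 with
    probabilities x and y (success exactly when the choices differ)."""
    return x * (1 - y) + (1 - x) * y

welfare_pure = 2 * payoff("f1", "f2")          # both users succeed
welfare_mixed = 2 * expected_payoff(0.5, 0.5)  # each succeeds half of the time
```

At the mixed profile each user is indifferent between the two APs (`expected_payoff(1, 0.5)` equals `expected_payoff(0, 0.5)`), which is what makes it an equilibrium, but half of the slots are lost to collisions, so its welfare is half that of a pure equilibrium.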

Is there a fully distributed learning scheme that converges to global optima?

The answer to this question is positive for most initial conditions in the two-users-two-choices case. The set of initial conditions under which convergence to global optima is observed is of measure 1.

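Let $x_t$ denote the probability for user 1 to choose $f_1$ at time $t$, and $y_t$ the probability for user 2 to choose $f_1$ at time $t$. Let $a_{j,t}$ and $u_{j,t}$ denote the action chosen and the payoff measured by user $j$ at time $t$. Using the iterative scheme

$$x_{t+1} = x_t + \lambda_t u_{1,t}\left(\mathbf{1}_{\{a_{1,t}=f_1\}} - x_t\right) \tag{3}$$

$$y_{t+1} = y_t + \lambda_t u_{2,t}\left(\mathbf{1}_{\{a_{2,t}=f_1\}} - y_t\right) \tag{4}$$

one can find the asymptotic pseudo-trajectory for constant $\lambda_t = \lambda$ (convergence in law) or for time-varying $\lambda_t$ satisfying $\lambda_t > 0$, $\sum_{t'} \lambda_{t'} = +\infty$, $\sum_{t'} \lambda_{t'}^2 < +\infty$. The term $\mathbf{1}_{\{a_{1,t}=f_1\}}$ is the indicator function: it is equal to 1 if user 1 has chosen $f_1$ at time $t$, i.e., $a_{1,t} = f_1$, and 0 otherwise.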
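The iterative scheme (3)-(4) is simple enough to simulate directly. The sketch below is a hedged illustration: the payoffs follow Table I, the step size $\lambda_t = 1/(5+t)$ is one admissible choice satisfying the stated conditions, and the function name, seed, starting point, and number of steps are arbitrary assumptions.

```python
import random

def learn(x0, y0, steps=5000, seed=0):
    """Simulate updates (3)-(4) for the two-user, two-AP game of Table I."""
    rng = random.Random(seed)
    x, y = x0, y0
    for t in range(steps):
        lam = 1.0 / (5 + t)                       # decreasing step size
        a1_is_f1 = rng.random() < x               # user 1 draws its action
        a2_is_f1 = rng.random() < y               # user 2 draws its action
        u = 1.0 if a1_is_f1 != a2_is_f1 else 0.0  # success iff different APs
        # Each user only uses its own action and its own measured payoff.
        x += lam * u * ((1.0 if a1_is_f1 else 0.0) - x)
        y += lam * u * ((1.0 if a2_is_f1 else 0.0) - y)
    return x, y

x_final, y_final = learn(0.6, 0.4)
```

Note that after a collision (`u == 0`) neither probability moves, and after a success each user reinforces the AP it just used. Since the step size never exceeds 1, both probabilities stay in $[0, 1]$.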

Theorem 1: The algorithm given by the system (3) and (4) can be tracked asymptotically by a solution of the differential equations:

$$\dot{x} = x(1-x)(1-2y), \tag{5}$$

$$\dot{y} = y(1-y)(1-2x). \tag{6}$$

Proof: By standard stochastic approximation arguments, one can show that the rescaled process built from $(x_t, y_t)$ is asymptotically close to a solution of some differential equation. Here we identify the exact differential equation.

To obtain it, we compute the expected change in one time slot, also called the drift:

$$\mathbb{E}\left(\frac{x_{t+1} - x_t}{\lambda_t} \,\Big|\, x_t = x,\ y_t = y\right) = x(1-x)(1-2y).$$

We do the same for $y_t$. Since we work in the unit square, the gap between the expected term and the random variable is a martingale difference. Moreover, the norm of this martingale is bounded by the norm of $(x, y)$. We deduce the following result: the asymptotic pseudo-trajectories give the replicator dynamics.
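The drift can be worked out by conditioning on the two outcomes in which user 1 receives a nonzero payoff. The derivation below is a sketch that assumes the two users draw their actions independently with probabilities $x$ and $y$, and that payoffs are those of Table I (1 when the choices differ, 0 otherwise).

```latex
\begin{aligned}
\mathbb{E}\!\left[\frac{x_{t+1}-x_t}{\lambda_t}\,\Big|\,x_t=x,\ y_t=y\right]
 &= \underbrace{x(1-y)}_{a_{1,t}=f_1,\ a_{2,t}=f_2}\,(1-x)
  \;+\; \underbrace{(1-x)\,y}_{a_{1,t}=f_2,\ a_{2,t}=f_1}\,(0-x) \\
 &= x(1-x)\bigl[(1-y)-y\bigr] \\
 &= x(1-x)(1-2y).
\end{aligned}
```

The two collision outcomes contribute nothing because the payoff $u_{1,t}$ multiplying the increment is zero. Exchanging the roles of $x$ and $y$ gives the drift of $y_t$.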

If $x$ denotes the probability for user 1 to choose $f_1$ and $y$ the probability for user 2 to choose $f_1$, then the ordinary differential equations satisfied by $x$ and $y$ are:

$$\dot{x} = x(1-x)(1-2y), \qquad \dot{y} = y(1-y)(1-2x). \tag{7}$$

Define the rest points (or stationary points) of the system as the zeros of the right-hand side, i.e., the points where $\dot{x} = 0$ and $\dot{y} = 0$.

Theorem 2: The set of rest points of the dynamics contains both the set of equilibria and the set of global optima.


Proof: The rest points of the system are obtained by finding the zeros of the right-hand side of the system. The zeros are $(0, 0)$, $(1, 0)$, $(0, 1)$, $(1, 1)$, and $(\frac{1}{2}, \frac{1}{2})$. Thus, the set of equilibria of the game, $\{(1, 0), (0, 1), (\frac{1}{2}, \frac{1}{2})\}$, is contained in the set of rest points. The set of global optima, $\{(1, 0), (0, 1)\}$, is also contained in the set of rest points.

Theorem 3: Starting from any point in the unit square $[0, 1]^2$ outside the segment $y = x$, the system converges to the set of global optima.

This result gives global convergence to an efficient point (global optimum) for almost all initial conditions. We say almost all initial points because the diagonal and the anti-diagonal segments are of Lebesgue measure zero (in two dimensions) compared to the measure of the square $[0, 1]^2$.

Proof: By computing the Jacobian at each of the 5 rest points, we check that $(1, 0)$ and $(0, 1)$ are stable, and that the other 3 rest points are unstable (the Jacobian has a positive eigenvalue). Then, we build the vector field of our dynamical system. Starting from any point in the unit square $[0, 1]^2$ outside the segment $y = x$, the system converges to the corner $(1, 0)$ or to the corner $(0, 1)$, depending on which side of the diagonal the starting point lies. We conclude that the system converges to one of the global optima $\{(1, 0), (0, 1)\}$.
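The stability check in the proof can be reproduced numerically. The sketch below hard-codes the Jacobian of the vector field $(x(1-x)(1-2y),\ y(1-y)(1-2x))$, obtained by differentiating by hand, and classifies each rest point by the sign of its eigenvalues; the function names are hypothetical.

```python
import numpy as np

def jacobian(x, y):
    """Jacobian of (x(1-x)(1-2y), y(1-y)(1-2x)) with respect to (x, y)."""
    return np.array([
        [(1 - 2 * x) * (1 - 2 * y), -2 * x * (1 - x)],
        [-2 * y * (1 - y), (1 - 2 * y) * (1 - 2 * x)],
    ])

REST_POINTS = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]

def classify(point):
    """A rest point is labelled stable when all eigenvalues are negative."""
    eigenvalues = np.linalg.eigvals(jacobian(*point)).real
    return "stable" if np.all(eigenvalues < 0) else "unstable"

labels = {point: classify(point) for point in REST_POINTS}
```

At $(1, 0)$ and $(0, 1)$ the Jacobian is minus the identity, at $(0, 0)$ and $(1, 1)$ it is the identity, and at $(\frac{1}{2}, \frac{1}{2})$ it has eigenvalues $\pm\frac{1}{2}$, so the mixed equilibrium is a saddle of the two-dimensional dynamics.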

As a corollary, we deduce that with well-chosen learning parameters, say $\lambda_t = \frac{1}{5+t}$, the fully distributed learning algorithm converges almost surely to global optima, which is a very interesting property.
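The limiting dynamics (5)-(6) tracked by the learning scheme can also be integrated numerically to visualize which corner is reached. The following is a minimal forward-Euler sketch; the horizon, time step, and starting points are arbitrary assumptions.

```python
def integrate(x0, y0, horizon=30.0, dt=0.01):
    """Forward-Euler integration of the limiting dynamics (5)-(6)."""
    x, y = x0, y0
    for _ in range(int(round(horizon / dt))):
        dx = x * (1 - x) * (1 - 2 * y)
        dy = y * (1 - y) * (1 - 2 * x)
        x, y = x + dt * dx, y + dt * dy
    return x, y

below = integrate(0.6, 0.4)     # starts with x0 > y0
above = integrate(0.2, 0.7)     # starts with x0 < y0
diagonal = integrate(0.3, 0.3)  # starts on the excluded segment y = x
```

A short calculation explains the pattern: $\dot{x} - \dot{y} = (x - y)\,[(1-x)(1-y) + xy]$, and the bracket is positive everywhere in the square except at the corners $(1, 0)$ and $(0, 1)$. The gap $x - y$ therefore keeps its sign and grows in magnitude, so starting points with $x_0 > y_0$ head to $(1, 0)$, those with $x_0 < y_0$ head to $(0, 1)$, and points on the diagonal remain on it.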

Now, what happens if the starting points are in the diagonal segment?

These cases correspond to symmetric configurations, and the system reduces to a single dynamical equation:

$$\dot{x} = x(1-x)(1-2x).$$

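We say that $x^* = (x^*_{f_1}, x^*_{f_2})$ is an evolutionarily stable strategy if for any $x \neq x^*$ there exists an $\epsilon_x > 0$ such that

$$\sum_{f \in \{f_1, f_2\}} (x^*_f - x_f)\, u_f\big(\epsilon x + (1-\epsilon) x^*\big) > 0, \quad \forall \epsilon \in (0, \epsilon_x).$$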

The following theorem analyzes the symmetric case.

Theorem 4: Consider the symmetric configuration.

• The symmetric game has a unique evolutionarily stable strategy, which is given by $(\frac{1}{2}, \frac{1}{2})$.

• The system goes to the unique evolutionarily stable strategy starting from any interior point $x_0 \in (0, 1)$.

Proof: In symmetric configurations, the evolutionarily stable strategies should be symmetric equilibria. Thus, we only have to check the set of symmetric equilibria, which reduces to $(\frac{1}{2}, \frac{1}{2})$. We verify that $(\frac{1}{2}, \frac{1}{2})$ satisfies

$$\left(\frac{1}{2} - x,\ \frac{1}{2} - (1-x)\right) \begin{pmatrix} 1-x \\ x \end{pmatrix} = 2\left(\frac{1}{2} - x\right)^2,$$

which is strictly greater than 0 for any $x \neq \frac{1}{2}$. We conclude that $(\frac{1}{2}, \frac{1}{2})$ is an evolutionarily stable strategy (ESS). Since $\frac{1}{2}$ is a global attractor in the interior, the dynamical system converges globally to $\frac{1}{2}$ starting from any point $x_0 \in (0, 1)$. This completes the proof.
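For completeness, the same computation can be carried out with the perturbed population that appears in the ESS definition. The derivation below is a sketch under the assumption, consistent with Table I, that the payoffs in the symmetric game are $u_{f_1}(z) = 1 - z_{f_1}$ and $u_{f_2}(z) = z_{f_1}$ when a fraction $z_{f_1}$ of the population chooses $f_1$; the mutant strategy is written $(x, 1-x)$.

```latex
\begin{aligned}
z_\epsilon &:= \epsilon x + (1-\epsilon)\tfrac{1}{2}
  \qquad \text{(share of } f_1 \text{ in the perturbed population)} \\
\sum_{f \in \{f_1, f_2\}} (x^*_f - x_f)\, u_f
 &= \left(\tfrac{1}{2} - x\right)(1 - z_\epsilon)
  + \left(x - \tfrac{1}{2}\right) z_\epsilon \\
 &= \left(\tfrac{1}{2} - x\right)(1 - 2 z_\epsilon) \\
 &= \left(\tfrac{1}{2} - x\right)\,\epsilon\,(1 - 2x) \\
 &= 2\,\epsilon \left(\tfrac{1}{2} - x\right)^2 .
\end{aligned}
```

This is strictly positive for every $x \neq \frac{1}{2}$ and every $\epsilon \in (0, 1]$, so under these assumptions any threshold $\epsilon_x \leq 1$ works in the definition; setting $\epsilon = 1$ recovers the expression verified in the proof.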

[Figure 1: evolution of strategies (vertical axis, 0.1 to 1.0) versus time (horizontal axis, 0 to 10). Legend: user 1 probability of choosing f1; user 2 probability of choosing f1.]

Fig. 1. Convergence to global optimum using imitation dynamics.

[Figure 2: vector field over the unit square; axis label xf1(t), ticks from 0 to 1.]

Fig. 2. Vector field of imitation dynamics: description of all the possible trajectories starting from the unit square.

[Figure 3: vector field; both axes range from -0.2 to 1.2.]

Fig. 3. Vector field of replicator dynamics.

The results for the network selection problem are presented in Figures 1 and 2. Fig. 1 illustrates the convergence to a global optimum of the network selection problem, and Fig. 2 shows all the possible trajectories (vector field). Except on the diagonal and anti-diagonal segments, all the trajectories lead to a global optimum. The solution is illustrated under imitation dynamics in Fig. 2 and under replicator dynamics in Fig. 3.

V. EVOLUTIONARY COALITIONAL GAMES IN WIRELESS NETWORKS

In this section, we present the analytical framework based on coalitional game theory. It is envisioned that stakeholders in the future user-centric network selection paradigm may decide to pool their resources (bandwidth, infrastructure resources, etc.) in the case of operators, or to generate a combined coalition service request in the case of users. The motivation for forming coalitions at both levels is the maximization of the objective functions of both stakeholders. Coalitional game theory can be used for modeling any form of confederation, alliance, or wireless network community formation.


We will devise techniques by which entities autonomously decide whether it is profitable or costly to coalesce. This requires exploring how the resulting aggregate cost or payoff of a coalition is allocated among the participating entities (using the dynamical Shapley value or any other imputation procedure that is time-varying and time-consistent), and characterizing the space and properties of allocations such that immunity to coalition formation is guaranteed. Thus, we will understand rules that encourage or discourage coalition formation, depending on whether the coalition is to be encouraged (such as when multiple entities collaborate in an overlay network) or defended against (e.g., an instance of a collusive malicious action). This will enable a precise prediction of network behavior and lead to network engineering in which entities enjoy high utility and no one is harmed. Different performance objectives need to be explored at different levels, such as QoE at the user level, which is a function of achievable throughput, service cost, and other QoS parameters including delay, packet loss, and jitter. We contribute a utility-based representation of user QoE. Plugging in the values of the different parameters, one can obtain the estimated QoE for any real-time or non-real-time application. It should be noted that the study of coalition formation games and their evolution is of particular interest, especially in cases where the bit-level intricacies encountered in wireless networks are taken into account, as captured by information theory. Advanced receiver and spatial processing designs, novel means for treating interference, and the possibility of mixing separate traffic streams through network coding will allow for an information theory-modulated coalitional game theory and a characterization of the set of network operational points that are achieved by coalition formation and negotiations. Coalitional game theory in wireless networks has been widely investigated in the literature. We refer the reader to the recent tutorial in [12] and the references therein.

Since cooperative game theory is concerned primarily with coalitions, one of the problems is how to divide the utility among the members of the formed coalition. The basis of this theory was laid by John von Neumann & Oskar Morgenstern (1944) with coalitional games in characteristic function form, also known as transferable utility games (TU-games). The theory has been extended to non-transferable utility games (NTU-games).
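One classical answer to the division problem in TU-games is the (static) Shapley value, which gives each player its average marginal contribution over all orders in which the coalition could be assembled. The sketch below computes it by brute force for a small game; the characteristic function, with a benefit that grows with coalition size minus a per-member coordination cost, is purely hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Average marginal contribution of each player over all arrival orders."""
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            shares[p] += value(joined) - value(coalition)
            coalition = joined
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in shares.items()}

# Hypothetical characteristic function: a coalition's benefit grows with its
# size, minus a per-member coordination cost (all numbers are illustrative).
def value(coalition):
    size = len(coalition)
    benefit = {0: 0.0, 1: 1.0, 2: 3.0, 3: 6.0}[size]
    cost = 0.5 * size
    return benefit - cost

players = ("u1", "u2", "u3")
allocation = shapley_values(players, value)
```

Because this example game is symmetric, every user receives the same share, and the shares add up to the value of the grand coalition. Enumerating all orders is only practical for a handful of players.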

The formation of coalitions, their values, and the long-run evolution of coalitions play an important role in opportunistic networks [15], in reputation systems, etc. Most of the literature in communication networks dealing with cooperative games and coalition formation does not address the issues related to dynamic solution concepts (over time and under randomness). Most conflict situations are not “one-shot” coalitional games but continue over some time horizon (finite or infinite). This paper, however, also deals with the evolution of coalitions. Since network selection problems are dynamic in nature, one needs to adapt the classical coalitional game approach to dynamic coalitional game theory in order to capture the realistic behaviors that describe the networks. The idea of evolutionary coalitional games has been examined recently in [15] for access control problems.

We focus on myopic dynamics of such games, inspired by evolutionary game dynamics. One of the advantages of using an evolutionary framework for coalition formation is that the dynamic process describes both the formation of new coalitions and the strategic interaction between users and coalitions or between coalitions. The class of evolutionary game dynamics that we use here needs less information than the standard repeated game approaches. Under suitable assumptions, evolutionary game dynamics naturally yield algorithms for finding equilibria, as well as stability conditions, limit cycles, and chaotic behaviors.

We describe the formation of coalitions and their evolution by an explicit process based on the revision of the allocation of users and coalitions. Using these evolutionary processes, one can show that the survival of a coalition in the long run and the long-term topology depend on the investment of each member of the coalition, the users' allocations, and the initial coalitional structures.

A. Formation of adaptive coalitions

With the different levels of interaction (between users, between coalitions, between a single user and coalitions, etc.), the analysis of the dynamical system can be quite complicated for a large number of users and coalitions. The dynamics are essentially determined by the rules for selecting or changing the allocation of payoffs to a particular coalition, and from the coalitions to the joint actions, which are evaluated by the members. At each time $t$, the distribution of coalition values and costs to the members is subject to negotiation between them. A single user $j$ has a rule $\beta$ for joining $J'$ from $J$: $\beta^j_{J,J'}(x(t), u(t))$, where $x(t)$ is an allocation vector at time $t$ and $x^j_J(t) \in [0, 1]$ represents the investment of user $j$ in coalition $J$. The evolution of coalitions is given by the difference between the incoming flux and the outgoing flux:

$$\dot{x}^j_J(t) = \text{inflow} - \text{outflow} \qquad (8)$$

where the inflow is $\sum_{J'} \beta^j_{J',J}(x(t), u(t))\, x^j_{J'}(t)$ and the outflow is $x^j_J(t) \sum_{J'} \beta^j_{J,J'}(x(t), u(t))$.

The incoming flux to $J$ corresponds to the arrival investment rate to $J$, and the outgoing flux from $J$ is the departure investment rate from $J$. Note that this is different from the population dynamics formulation, since it is not based on the proportion of users. Here $x$ corresponds to a coalitional investment.

$$\dot{x}^j_J(t) = \sum_{J'} \beta^j_{J',J}(x(t), u(t))\, x^j_{J'}(t) - x^j_J(t) \sum_{J'} \beta^j_{J,J'}(x(t), u(t)) := V^j_J(\beta, x(t), u(t)) \qquad (9)$$
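The right-hand side of (9) can be sketched in a few lines of Python for a single user. This is only an illustration under stated assumptions: the function name `net_flow`, the callable `rate(a, b, x)` standing for the switching rate from coalition `a` to coalition `b`, and the constant-rate rule are hypothetical and not taken from the paper. The terms with $J' = J$ are skipped because they cancel between inflow and outflow.

```python
from typing import Callable, List

Rate = Callable[[int, int, List[float]], float]


def net_flow(x: List[float], rate: Rate) -> List[float]:
    """Inflow minus outflow for each coalition, as in (9), for one user.

    x[J] is the user's investment in coalition J and rate(a, b, x) is the
    switching rate from coalition a to coalition b.
    """
    n = len(x)
    flows = []
    for J in range(n):
        inflow = sum(rate(Jp, J, x) * x[Jp] for Jp in range(n) if Jp != J)
        outflow = x[J] * sum(rate(J, Jp, x) for Jp in range(n) if Jp != J)
        flows.append(inflow - outflow)
    return flows


def constant_rate(a: int, b: int, x: List[float]) -> float:
    """Hypothetical rule used only to exercise net_flow."""
    return 0.5


x0 = [0.2, 0.3, 0.5]
v0 = net_flow(x0, constant_rate)
print(v0, sum(v0))
```

Whatever rule is plugged in, the flows sum to zero, so the total investment of the user is conserved along the dynamics.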

An important class of rules is the class of pairwise comparison of estimates with the current target. At each time, each user chooses an investment based on a function of $u^j_{J'}(t) - u^j_J(t)$, i.e.,

$$\beta^j_{J,J'}(x(t), u(t)) = \xi^j\left(u^j_{J'}(x(t)) - u^j_J(x(t))\right),$$

where $\xi^j : \mathbb{R} \to \mathbb{R}_+$ is positive for positive arguments and $\xi^j(\gamma) = 0$ if $\gamma \le 0$.


Using the work in [16] (Chapter 2), we get the following result: any stationary point $x^j_J$ of the process satisfies

$$x^j_J > 0 \implies u^j_J(x) = \max_{J'} u^j_{J'}(x).$$
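The stationarity property can be checked numerically on a toy instance. The sketch below assumes the rate of moving from $J$ to $J'$ is $\xi(u_{J'} - u_J)$ with $\xi(\gamma) = \max(\gamma, 0)$, uses hypothetical congestion-type payoffs $u_J(x) = b_J - x_J$, and integrates (9) with an explicit Euler scheme. None of these choices come from the paper.

```python
def xi(gamma):
    """Pairwise-comparison weight: zero for gamma <= 0, positive otherwise."""
    return max(gamma, 0.0)


def utilities(x):
    """Hypothetical congestion-type payoffs u_J(x) = b_J - x_J."""
    b = [1.0, 0.8, 0.1]
    return [b[J] - x[J] for J in range(len(x))]


def step(x, dt=0.01):
    """One explicit Euler step of (9) with switching rate xi(u_target - u_source)."""
    u = utilities(x)
    n = len(x)
    new_x = []
    for J in range(n):
        inflow = sum(xi(u[J] - u[Jp]) * x[Jp] for Jp in range(n))
        outflow = x[J] * sum(xi(u[Jp] - u[J]) for Jp in range(n))
        new_x.append(x[J] + dt * (inflow - outflow))
    return new_x


x = [0.2, 0.3, 0.5]
for _ in range(20000):
    x = step(x)

u = utilities(x)
support = [J for J in range(len(x)) if x[J] > 1e-3]
print([round(v, 4) for v in x], [round(v, 4) for v in u], support)
```

In this instance, every coalition that keeps a non-negligible investment ends up with the maximal payoff, and the coalition with a lower payoff is abandoned.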

This means that, in the long run, if the process converges to a stationary point, then the stationary point is locally optimal. Moreover, convergence of the process is guaranteed [16] if the value of the assignment per coalition generates a monotone mapping, i.e.,

$$\sum_{J} \left(x^j_J - y^j_J\right)\left(u^j_J(x) - u^j_J(y)\right) \le 0.$$
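A quick way to get a feeling for the monotonicity condition is to evaluate its left-hand side on randomly sampled pairs of allocations. The sketch below is a sanity check rather than a proof. It assumes allocations are sampled from the simplex, and both payoff functions are hypothetical examples.

```python
import random


def monotonicity_gap(u, x, y):
    """Left-hand side of the monotonicity condition for one pair (x, y)."""
    ux, uy = u(x), u(y)
    return sum((a - b) * (p - q) for a, b, p, q in zip(x, y, ux, uy))


def random_simplex_point(n, rng):
    """Sample nonnegative weights that sum to one."""
    w = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(w)
    return [v / total for v in w]


def looks_monotone(u, n, trials=1000, seed=0):
    """True if no sampled pair violates the condition (a sanity check only)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = random_simplex_point(n, rng)
        y = random_simplex_point(n, rng)
        if monotonicity_gap(u, x, y) > 1e-12:
            return False
    return True


def congestion(x):
    """Hypothetical payoffs that fall with the own investment."""
    return [1.0 - v for v in x]


def crowding_in(x):
    """Hypothetical payoffs that rise with the own investment."""
    return [v for v in x]


print(looks_monotone(congestion, 3), looks_monotone(crowding_in, 3))
```

For the congestion payoffs the left-hand side equals $-\sum_J (x_J - y_J)^2$, so the condition always holds. For the crowding-in payoffs the sign flips and the check fails.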

Now that we have defined the framework for evaluating coalition games, we illustrate the concept with a simplified example. The choice of the example is dictated by the similarity of its behavior to that of our core coalition network selection problem. After modeling the problem as a coalitional evolutionary game and solving it, we generalize the results to the coalition network selection problem.

B. Example: Evolutionary Coalitional Spectrum Access

In cognitive radio networks, to perform dynamic coalitional spectrum sensing and reduce collisions, secondary users can locally form a coalition that may evolve over time due to mobility and the outcomes. If a secondary user who is a member of some coalition $J$ is unhappy with its transmission success over several slots, it can leave this coalition and join another coalition according to its probability of success and sensing efficiency. Its decision will affect the coalitional structure that forms in the network. We address the following questions: i) supposing that a new group is forming, will the other nodes be motivated to form further new coalitions that were not worthwhile for them before? ii) where will all this lead in the long run? iii) what will be the final topology of coalitions? These questions are illustrated in the numerical examples.

1) Coalition formation by replication: If the rule $\beta$ for changing or joining a coalition is proportional to the investment times the instantaneous regret, then the dynamics in (8) become the replicator dynamics. As in standard replicator dynamics, $x^j_J(t)$ increases with the payoff of coalition $J$ for user $j$, compared to the average payoff. Thus the coalitions that better serve the single users' interests will grow more. The main difference from the standard replicator dynamics is that we have individual players rather than populations, and $x^j_J(t)$ represents an allocation (not a fraction of players). Using the evolutionary game dynamics with migration developed in [16] allows us to apply the methodology from this field, including equilibrium and evolutionary stability concepts.
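The reduction to the replicator dynamics can be made explicit with a short calculation. It rests on two assumptions that are our reading of the rule rather than statements from the paper: the rate of moving from $J'$ to $J$ is $\beta^j_{J',J} = x^j_J\,[u^j_J - u^j_{J'}]_+$ with $[a]_+ = \max(a, 0)$, and the investments of user $j$ sum to one. The user index $j$ is dropped for readability.

```latex
\begin{aligned}
\dot{x}_J
  &= \sum_{J'} x_J \,[u_J - u_{J'}]_+ \, x_{J'}
     \;-\; x_J \sum_{J'} x_{J'} \,[u_{J'} - u_J]_+ \\
  &= x_J \sum_{J'} x_{J'} \left( [u_J - u_{J'}]_+ - [u_{J'} - u_J]_+ \right) \\
  &= x_J \sum_{J'} x_{J'} \,(u_J - u_{J'})
     \qquad \text{since } [a]_+ - [-a]_+ = a \\
  &= x_J \left( u_J - \bar{u} \right),
     \qquad \bar{u} = \sum_{J'} x_{J'} u_{J'},
     \quad \sum_{J'} x_{J'} = 1 .
\end{aligned}
```

The last line is the usual replicator form: the investment in $J$ grows when its payoff exceeds the investment-weighted average payoff $\bar{u}$.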

To illustrate this, we consider the spatial distribution of wireless nodes and receivers. We take the initial coalitions as $J_1 = \{1, 2\}$, $J_2 = \{3, 4\}$ and $J_3 = \{5\}$. We distinguish two ingredients in the value of a coalition. The first part includes the spectrum sensing efficiency, which improves the probability of success, and the second part is the cost of energy consumption (for both sensing and transmission). When transmitting, one can express the outcome for the coalitions $J_1$, $J_2$, and $J_3$ as

$$u_{J_1} = -c_S - c_T + (1 - x_{J_2})(1 - x_5),$$
$$u_{J_2} = -c_S - c_T + (1 - x_{J_1})(1 - x_5),$$
$$u_5 = -c_S - c_T + (1 - x_{J_1})(1 - x_{J_2}),$$

where $c_S$ is the energy consumption cost for sensing and $c_T$ represents the transmission cost of a single packet. Below we look at the evolution of coalitions starting from this initial coalitional structure. We fix $c_S = 1/8$, $c_T = 1/2$. The evolution of coalitions is represented in Figures 4 and 5.

Fig. 4. Evolution of the coalition starting from (0.34, 0.34, 0.32). [Plot: curves J1, J2, J3; horizontal axis: Time, 0 to 50; vertical axis: Evolution of Coalition, 0.05 to 0.50.]

Fig. 5. Evolution of the coalition starting from (0.345, 0.346, 0.309). [Plot: curves J1, J2, J3; horizontal axis: Time, 0 to 50; vertical axis: Evolution of Coalition, 0.1 to 1.0.]
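The example can be explored numerically with a minimal sketch. It assumes the replicator form $\dot{x}_J = x_J (u_J - \bar{u})$, reads $x_5$ as the investment in $J_3 = \{5\}$, and uses an explicit Euler scheme with a step size chosen for illustration. The payoffs, the costs, and the two starting points are taken from the text and the figure captions. Whether the output matches the published curves depends on these assumptions.

```python
C_S = 1 / 8  # sensing cost, from the text
C_T = 1 / 2  # transmission cost, from the text


def payoffs(x):
    """Coalition payoffs of the example, with x = (x_J1, x_J2, x_J3)."""
    base = -C_S - C_T
    return [
        base + (1 - x[1]) * (1 - x[2]),
        base + (1 - x[0]) * (1 - x[2]),
        base + (1 - x[0]) * (1 - x[1]),
    ]


def simulate(x0, steps=10000, dt=0.01):
    """Explicit Euler integration of dx_J/dt = x_J * (u_J - average payoff)."""
    x = list(x0)
    for _ in range(steps):
        u = payoffs(x)
        avg = sum(xj * uj for xj, uj in zip(x, u))
        x = [xj + dt * xj * (uj - avg) for xj, uj in zip(x, u)]
    return x


final_a = simulate((0.34, 0.34, 0.32))
final_b = simulate((0.345, 0.346, 0.309))
print([round(v, 3) for v in final_a])
print([round(v, 3) for v in final_b])
```

Under these assumptions the payoff difference between two coalitions has the sign of their investment difference, for instance $u_{J_1} - u_{J_2} = (1 - x_5)(x_{J_1} - x_{J_2})$, so a small initial advantage is amplified over time. This is one way to see why the two nearby starting points can end in different long-run structures.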

We observe in these figures that user 5 is interested in joining one of the coalitions or staying quiet. We examine the cases where user 5 joins coalition $J_1$, joins coalition $J_2$, or acts selfishly. We then have three configurations: $J_1 = \{1, 2, 5\}$, $J_2 = \{3, 4\}$; or $J_1 = \{1, 2\}$, $J_2 = \{3, 4, 5\}$; or the initial configuration. In the new network formation, coalition $J_3$ disappears. Figures 6 and 7 illustrate the new formation of coalitions.

These numerical examples show that small changes in the initial conditions can lead to very different coalitions being formed in the game. The endogenous formation of coalitions and their long-term stability depend on the existing coalitional structure.

VI. COALITIONAL NETWORK SELECTION IN 4G NETWORKS

Having explained the vision of future wireless networks in the introduction, one can expect tighter competition among the future wireless service operators.

A. Users coalition formation

The authors believe that, with a user-centric approach in place, operators aiming to increase their user pool will follow different strategies by offering differentiated services, service price offers, discount packages, etc.

It should be noted that in all the highlighted scenarios, the decision of a user to join a coalition is influenced by the QoE of users, which in turn is driven by the operators' offers. Thus operators in a way hold the controlling lever to increase their user pools, in the form of service and cost offers. Intuitively, an under-loaded operator is motivated to attract different coalition size(s) than an over-loaded operator. Since the user QoE comprises technical and economical components, we carry out extensive simulations to investigate the impact of coalition size(s) on user QoE with respect to technical indices only. The reason for dropping the economical aspect from the simulation analysis is the following: the authors believe that extensive simulation results will capture the environment dynamics (wireless medium characteristics, technology-specific behavior, interference, etc.) and their impact on the user coalition size(s). Once we know the impact of coalition size on user QoE with respect to technical indices, the impact of varying economical parameters (different service prices, etc.) is straightforward. Therefore, in this section, we put more emphasis on the evaluation of technical indices.

Fig. 6. Evolution of coalition depending on the investment of user 5. [Plot: legend J2new; horizontal axis: Time, 0 to 50; vertical axis: Evolution of coalition, 0 to 1.0.]

Fig. 7. Evolution of the coalition when user invests equally: fuzzy coalition. [Plot: legend J2new; horizontal axis: Time, 0 to 100; vertical axis: Evolution of coalitions.]

B. Simulation setup

In this section, we detail our simulation setup, which is based on OPNET Modeler 15.0. We consider the wireless coverage footprint of two operators. Both operators own Long Term Evolution (LTE) access network technology in the coverage area. The operators' technical potential in the considered coverage area is exactly the same, i.e., the access technology, backbone resources, and core network facilities deployed by both operators are similar. We now detail the technical specifications of the access technology.

C. LTE configuration

The considered LTE network operates in a 10 MHz spectrum with a cell coverage radius of 500 m. There is only one cell per eNodeB. The observed cell throughput in the downlink is 13 Mbps (with all FTP users offering a very heavy traffic load). The background offered load is taken to be about 2 Mbps, which is mapped onto Guaranteed Bit Rate (GBR) bearers with QoS Class Identifier (QCI)-1. MAC scheduling is performed using Round Robin. We configure 1 bearer per UE (1 application per UE). We investigate multiple applications, each provisioned with a different QoS. For instance, applications that ensure a minimum guaranteed bit rate use bearers with an associated GBR value, for which dedicated transmission resources are permanently allocated (e.g., by an admission control function in the eNodeB) at bearer establishment or modification. For applications that do not guarantee any particular bit rate, however, no bandwidth resources are permanently allocated to the bearer. In the proposed approach, FTP users subscribe to different data rates with operators, which in turn define the user types; e.g., excellent users are subscribed to the transmission resource range $[r^{exc}_{\min}, r^{exc}_{\max}]$, where $r^{exc}_{\min} \ge r^{good}_{\max}$, etc. We therefore map FTP traffic to QCI-9. Following the user subscription assumption for non-GBR traffic, we make use of the Aggregate Maximum Bit Rate (AMBR) parameter, which refers to the maximum bit rate allowed for all the non-GBR SDFs aggregated for a UE. We enforce this parameter in the downlink; it can also be implemented in the uplink direction, but that is out of the scope of this work, since we concentrate on the QoE analysis for downlink traffic. The AMBR in this simulation setup is set to 1.5 Mbps (per UE, for non-GBR bearers only). We further assume that users are mobile in the coverage area; in the simulation this is realized by selecting the random waypoint model and setting the UE speed (walking speed) to be uniformly distributed between 0 and 1.4 m/s. We also configure the user mobility pause time as uniformly distributed between 0 and 30 seconds. We configure the LTE transport network with all-Ethernet 1 Gbps links and DiffServ enabled on the last-hop router, where VoIP is mapped to EF PHB and FTP is mapped to BE PHB. In this case we do not introduce any IP impairments. We use the most commonly used flavor of TCP, i.e., New Reno, with a buffer size of 65535 bytes and window scaling disabled. In order to avoid congestion at the Uu interface, Call Admission Control (CAC) is performed when the total offered load (FTP + VoIP) reaches 10 Mbps per cell.

The user downloads a TCP-based FTP file of 2 Mbytes. The generation of user requests is modeled using a Poisson distribution, and the inter-request duration is normally distributed with mean 14. There are 14 FTP users and 6 VoIP users in total in the system. It should be noted that the VoIP users generate the background load.

D. Simulation configurations

We investigate the performance of the proposed coalition approach and its impact on user QoE, operator call blocking rate, and resource utilization in different simulation configurations (by configuration we mean the different simulation settings explained below in this section). For all the configurations, we investigate the operator resource utilization, users' QoE, operator call blocking rate, etc.


• Single user - no coalition configuration: In this setting no coalition can be formed, and every user in the system selects the network operator that increases its individual utility function, i.e., user QoE. Users generate requests following a Poisson distribution as explained earlier; the call holding time depends on the configured file size, i.e., 2 Mbytes.

• Partial coalition - (4+2) configuration: In this setting coalitions of two different sizes are allowed, i.e., of either 2 or 4 users.

• Near grand coalition - (5+1) configuration: In this setting a nearly grand coalition is allowed, i.e., 5 users are allowed to form a coalition, leaving one user outside in case of simultaneous user connectivity.

E. Results and analysis

We decompose the analysis into operator-specific analysis (call blocking, resource utilization, accepted calls, etc.) and user-specific analysis (DRT, achievable data rate) for each configuration. The aim of this decomposition is to make the results of all the configurations comparable.

Note: The assumption that TCP shares the available bandwidth evenly is valid in the simulation scenario because all configuration parameters, e.g., TCP parameters and traffic priority at the transport network and at the MAC scheduler, have been configured in the same way. Moreover, all users on average enjoy good channel conditions, which avoids any throughput degradation due to bad channel quality. This effect is achieved by restricting user movements to an area near the eNB.

Due to space limitations, we summarize the simulation results in Table II (we omit the curves). In Table II, we analyze the user-perceived QoE, operator resource utilization, and operator call blocking rate. As can be seen, coalition formation has no noticeable impact on user QoE, which stays between 4.14 and 4.31 in all configurations. However, coalition formation influences the operator both in terms of resource utilization and blocked calls. This leads us to the following conclusion.

We note that when a coalition is formed, the operators face the following situations:

• Operators are fully under-loaded - Operators can accept coalition requests of both types, i.e., 2 users / coalition and 4 users / coalition.

• Operators are serving a coalition request with 2 users / coalition - Operators in this case can accept coalition requests of both sizes, i.e., 2 and 4, as the operators still have bandwidth resources available.

• Operators are serving a coalition request with 4 users / coalition - Operators in this case can accept an additional coalition request of size 2; however, coalition requests of 4 users are blocked due to scarce operator resources.

The above observations indicate that operators' resource utilization is liable to decrease owing to the large blocks of bandwidth requested by coalitions, i.e., a larger request might not fit into the available bandwidth resources of the network. Such a restriction is not experienced in the no-coalition setting.
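The three situations above can be mimicked with a toy all-or-nothing admission rule. This sketch is not the simulator's CAC logic. The class `Operator`, the helper `can_accept`, and the choice of 13 Mbps as usable capacity (the observed downlink cell throughput) are assumptions made for illustration, as is the idea that each coalition member requests its full 1.5 Mbps AMBR on top of a 2 Mbps background load.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    capacity_mbps: float = 13.0   # assumed: observed downlink cell throughput
    background_mbps: float = 2.0  # background (VoIP) load, from the text
    per_user_mbps: float = 1.5    # assumed: every user requests its AMBR
    admitted: List[int] = field(default_factory=list)  # admitted coalition sizes

    def free_mbps(self) -> float:
        """Bandwidth left after background load and admitted coalitions."""
        used = self.background_mbps + self.per_user_mbps * sum(self.admitted)
        return self.capacity_mbps - used

    def request(self, coalition_size: int) -> bool:
        """Admit the whole coalition or block it (all-or-nothing assumption)."""
        if coalition_size * self.per_user_mbps <= self.free_mbps():
            self.admitted.append(coalition_size)
            return True
        return False


def can_accept(existing: List[int], size: int) -> bool:
    """Would an operator already serving `existing` coalitions admit `size`?"""
    return Operator(admitted=list(existing)).request(size)


for existing in ([], [2], [4]):
    print(existing, {size: can_accept(existing, size) for size in (1, 2, 4)})
```

With these numbers, an operator serving one coalition of 4 has 5 Mbps left: a coalition of 2 (3 Mbps) or a single user still fits, while a second coalition of 4 (6 Mbps) is blocked as a whole and the remaining bandwidth stays unused. Single-user requests do not face this block-size restriction.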

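TABLE II
UTILITY CONTROL PARAMETER VALUES FOR VOIP AND FTP APPLICATIONS

| Parameter | Configuration | Operator-1 | Operator-2 |
|---|---|---|---|
| QoE | No coalition | 4.305 | 4.250 |
| QoE | Partial coalition | 4.237 | 4.213 |
| QoE | Near grand coalition | 4.14 | 4.186 |
| Call blocking | No coalition | 24 | 26 |
| Call blocking | Partial coalition | 310 | 300 |
| Call blocking | Near grand coalition | 370 | 360 |
| Resource utilization | No coalition | 3.8 Mbps | 3.8 Mbps |
| Resource utilization | Partial coalition | 2.7 Mbps | 2.7 Mbps |
| Resource utilization | Near grand coalition | 2.2 Mbps | 2.2 Mbps |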

On the basis of the above analysis, we conclude that, given the mentioned service and user types, the coalition size has no noticeable impact on the user-perceived QoE, i.e., the observed DRT and average throughput of users remain unchanged across all the configurations. However, the coalition size has an impact on the operator's resource utilization and call blocking, and in turn on the operators' revenue. Thus the controlling lever for motivating users to form coalitions may be the service cost offered by the operators, e.g., offering discounts when operators are willing to encourage coalition formation. The amount of discount is influenced by the operator's status: e.g., under-loaded operators offer a higher discount on the formation of a coalition, whereas the coalition size(s) is typically decided by the operator's resource capacity or operator policy. On the other hand, an over-loaded operator would discourage coalitions by setting the discount factor to zero.

VII. CHALLENGES AND OPEN ISSUES

The evolutionary coalitional game approach can be extended in different ways:

• Stochastic coalitional population games allow modeling the variability of users' internal states, such as backoff state, battery state, modulation scheme, resource state characteristics, etc.

• In emerging wireless networks, the architecture, the topology structure, and the performance depend on different layers: the performance of the upper layers depends on that of the lower layers and vice versa. An appropriate framework for such a scenario would be coalitional hierarchical population games with different types of users.

• Imperfections and time delays are frequent in wireless networks, where the measurements can be noisy or outdated and need to be approximated. In the presence of randomness, robustness requirements, and incomplete information, the framework needs to be extended to robust coalition games and evolutionary coalitional games with incomplete information.

• In this article, we have presented a single value per coalition. We aim to extend the modeling to multiple objectives, leading to multiple-objective evolutionary coalitional games.

VIII. CONCLUSIONS

We have shown that the use of simple tools of evolutionary game theory can lead to the global optimum of technology selection games, and more generally of anti-coordination games, in a self-organizing and fully distributed manner. We have demonstrated how evolutionary games can be extended to evolutionary coalitional games, and we have illustrated the use of this new tool in emerging wireless networks. We have observed that the survival of a coalition in the long run and the long-term topology depend on the investment of each member of the coalition and the initial coalitional structure of the network. In our ongoing work, we plan to apply evolutionary coalitional games in heterogeneous and hierarchical wireless networks.

REFERENCES

[1] E. Altman, T. Boulogne, R. El-Azouzi, T. Jimenez, and L. Wynter. A survey on networking games in telecommunications. Computers and Operations Research, 2006.

[2] R. Bellman. On the theory of dynamic programming. Proceedings of the National Academy of Sciences of the U.S.A., 38:716–719, 1952.

[3] R. Bush and F. Mosteller. Stochastic models of learning. Wiley & Sons, New York, 1955.

[4] Fodor Gabor, Fruskar Anders, and Lundsjo Johan. On access selection techniques in always best connected networks. In ITC Specialist Seminar on Performance Evaluation of Wireless and Mobile Systems, August 2004.

[5] Oscar Salazar Gaitán, Philippe Martins, Jacques Demerjian, and Samir Tohmé. Enabling roaming in heterogeneous multi-operator wireless networks.

[6] A. Hasswa, N. Nasser, and H. Hassanein. In Communications, 2006. ICC '06. IEEE International Conference on.

[7] Cuong Trong Fikret Sivrikaya Manzoor Ahmed Khan, Ahmet Cihat Toker and S. Albayrak. Cooperative game theoretic approach to integrated bandwidth sharing and allocation. In Proceedings of the GameNets'09, pages 978–1–4244–4176–1, 2009.

[8] Thomas Geithner Fikret Sivrikaya Manzoor Ahmed Khan, Cuong Trong and S. Albayrak. Network level cooperation for resource allocation in future wireless networks. In Proceedings of the IFIP Wireless Days Conference '08, 2008.

[9] A. Mihovska, F. Meucci, N.R. Prasad, F.J. Velez, and O. Cabral. Multi-operator resource sharing scenario in the context of IMT-Advanced systems.

[10] O. Ormond, J. Murphy, and G.-M. Muntean. Utility-based intelligent network selection in beyond 3G systems. In Communications, 2006. ICC '06. IEEE International Conference on, volume 4, pages 1831–1836, June 2006.

[11] J. Perez-Romero, O. Sallent, R. Agusti, P. Karlsson, A. Barbaresi, L. Wang, F. Casadevall, M. Dohler, H. Gonzalez, and F. Cabral-Pinto. Common radio resource management: functional models and implementation requirements. In Personal, Indoor and Mobile Radio Communications, 2005. PIMRC 2005. IEEE 16th International Symposium on, volume 3, pages 2067–2071, 2005.

[12] W. Saad, Z. Han, M. Debbah, A. Hjørungnes, and T. Basar. Coalitional game theory for communication networks: A tutorial. IEEE Signal Processing Magazine, Special Issue on Game Theory, 26(5):77–97, September 2009.

[13] W. H. Sandholm. Population games and evolutionary dynamics. MIT Press, 2010.

[14] L. S. Shapley. Stochastic games. Proc. Nat. Acad. Sciences, 39:1095–1100, 1953.

[15] H. Tembine. Evolutionary network formation games and fuzzy coalitions in heterogeneous networks. In Proc. IFIP Wireless Days, December 2009.

[16] H. Tembine. Population games in large-scale networks: time delays, mean field dynamics and applications. LAP, 2009.

[17] H. Tembine. Distributed strategic learning for wireless engineers. Notes, 250 pages, 2010.

[18] H. Tembine, E. Altman, R. El-Azouzi, and Y. Hayel. Evolutionary games in wireless networks. IEEE Trans. on Systems, Man, and Cybernetics, Part B, Special Issue on Game Theory, December 2009.

[19] A. Tolli, P. Hakalin, and H. Holma. Performance evaluation of common radio resource management (CRRM). In Communications, 2002. ICC 2002. IEEE International Conference on, volume 5, pages 3429–3433, 2002.

[20] H. J. Wang, R. H. Katz, and J. Giese. Policy-enabled handoffs across heterogeneous wireless networks. In WMCSA '99: Proceedings of the Second IEEE Workshop on Mobile Computer Systems and Applications, page 51, 1999.
