Spatial Spectrum Access Game: Nash Equilibria and Distributed Learning

Xu Chen
Department of Information Engineering
The Chinese University of Hong Kong
Shatin, Hong Kong

Jianwei Huang
Department of Information Engineering
The Chinese University of Hong Kong
Shatin, Hong Kong
ABSTRACT
A key feature of wireless communications is spatial reuse. However, the spatial aspect is not yet well understood for the purpose of designing efficient spectrum sharing mechanisms. In this paper, we propose a framework of spatial spectrum access games on directed interference graphs, which can model quite general interference relationships with spatial reuse in wireless networks. We show that a pure strategy equilibrium exists for two classes of games: (1) any spatial spectrum access game on a directed acyclic graph, and (2) any game satisfying the congestion property on directed trees and directed forests. Under mild technical conditions, the spatial spectrum access games with random backoff and Aloha channel contention mechanisms on undirected graphs also have a pure Nash equilibrium. We then propose a distributed learning algorithm, which only utilizes users' local observations to adaptively adjust the spectrum access strategies. We show that the distributed learning algorithm can converge to an approximate mixed-strategy Nash equilibrium for any spatial spectrum access game. Numerical results demonstrate that the distributed learning algorithm achieves up to 100% performance improvement over a random access algorithm.
Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Wireless communication

Keywords
Cognitive Radio, Distributed Spectrum Sharing, Nash Equilibrium, Distributed Learning
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MobiHoc'12, June 11–14, 2012, Hilton Head Island, SC, USA. Copyright 2012 ACM 978-1-4503-1281-3/12/06 ...$10.00.

1. INTRODUCTION
Cognitive radio is envisioned as a promising technique to alleviate the problem of spectrum under-utilization [1]. It enables unlicensed wireless users (secondary users) to opportunistically access the licensed channels owned by legacy spectrum holders (primary users), and thus can significantly improve the spectrum efficiency [1].

A key challenge of the cognitive radio technology is how to resolve the resource competition among selfish secondary users in a decentralized fashion. If multiple secondary users transmit over the same channel simultaneously, severe interference or collisions might occur and the data rates of all users may be reduced. Therefore, it is necessary to design efficient spectrum sharing mechanisms for cognitive radio networks.

The competition among secondary users for common spectrum has often been studied using noncooperative game theory (e.g., [6, 9, 18, 19, 23]). Nie and Comaniciu in [18] designed a self-enforcing distributed spectrum access mechanism based on potential games. Niyato and Hossain in [19] studied a price-based spectrum access mechanism for competitive secondary users. Chen and Huang in [6] investigated stable spectrum sharing mechanism design based on evolutionary game theory. Félegyházi et al. in [9] proposed a two-tier game framework for medium access control (MAC) mechanism design.

When spectrum information such as channel availability is unknown, secondary users need to learn the network environment and adapt their spectrum access decisions accordingly. Han et al. in [11] used no-regret learning to solve this problem, assuming that the users' channel selections are common information. When users' channel selections are not observable, the authors in [2, 13] designed multi-agent multi-armed bandit learning algorithms to minimize the expected performance loss of distributed spectrum access.

A common assumption of the above results is that secondary users are close-by and interfere with each other when they transmit on the same channel simultaneously. However, a unique feature of wireless communication is spatial reuse. If users who transmit simultaneously are located sufficiently far apart, then simultaneous transmissions over the same channel may not cause any performance degradation to any user (see Figure 1 for an illustration). Such a spatial effect on spectrum sharing is less understood than many other aspects in the existing literature [24].

Recently, Tekin et al. in [22] and Southwell et al. in [21]
proposed a novel spatial congestion game framework to take spatial relationships into account. The key idea is to extend the classical congestion game to an undirected graph, by assuming that the interference among the players is symmetric and that a player's throughput depends on the number of players in its neighborhood that choose the same resource. As illustrated in Figure 1, however, the interference relationship among the secondary users can be asymmetric due to the heterogeneous transmission powers and locations of the users.

Figure 1: Illustration of distributed spectrum access with spatial reuse under the protocol interference model. Each user $n$ is represented by a transmitter $Tx_n$ and receiver $Rx_n$ pair. Users 2 and 3 cannot generate interference to user 1, since user 1's receiver $Rx_1$ is far from user 2's and user 3's transmitters. On the other hand, user 1 can generate interference to user 2, since user 2's receiver $Rx_2$ is within the transmission range of user 1's transmitter $Tx_1$. Similarly, user 2 and user 3 can generate interference to each other.

We hence propose a more general framework of spatial spectrum access games on directed interference graphs, which takes users' heterogeneous resource competition capabilities and asymmetric interference relationships into account. The congestion game on directed graphs has also been studied in [4], with the assumption that players have linear and homogeneous payoff functions. The game model in this paper is more generic and allows linear/nonlinear player-specific payoff functions. Moreover, we design a distributed algorithm for achieving the equilibria of the game. The main results and contributions of this paper are as follows:
• General game formulation: We formulate the distributed spectrum access problem as a spatial spectrum access game on directed interference graphs, with user-specific channel data rates and channel contention capabilities.

• Existence of Nash equilibria: We show by counterexamples that a general spatial spectrum access game may not have a pure Nash equilibrium. We then show that a pure strategy equilibrium exists for the following two classes of games: (1) any spatial spectrum access game on a directed acyclic graph, and (2) any game satisfying the congestion property on directed trees and directed forests. We also show that under mild conditions the spatial spectrum access games with random backoff and Aloha channel contention mechanisms on undirected graphs are potential games and have pure Nash equilibria.

• Distributed learning for achieving approximate Nash equilibrium: We propose a distributed learning algorithm that can converge to an approximate mixed Nash equilibrium for any spatial spectrum access game by utilizing users' local observations only. Numerical results demonstrate that the distributed learning algorithm achieves up to 100% performance improvement over the random access algorithm.
The rest of the paper is organized as follows. We introduce the system model and the spatial spectrum access game in Sections 2 and 3, respectively. We investigate the existence of Nash equilibria in Section 4. Then we present the distributed learning algorithm in Section 5. We illustrate the performance of the proposed algorithm through numerical results in Section 6, and finally conclude in Section 7.
2. SYSTEM MODEL
We consider a cognitive radio network with a set $\mathcal{M} = \{1, 2, ..., M\}$ of independent and stochastically heterogeneous primary channels. A set $\mathcal{N} = \{1, 2, ..., N\}$ of secondary users try to access these channels in a distributed manner when the channels are not occupied by primary (licensed) transmissions. Here we assume that each secondary user is a dedicated transmitter-receiver pair.

To take users' spatial relationships into account, we denote $d_n = (d_{Tx_n}, d_{Rx_n})$ as the location vector of secondary user $n$, where $d_{Tx_n}$ and $d_{Rx_n}$ denote the locations of the transmitter and the receiver, respectively. Each secondary user $n$ has a transmission range $\delta_n$. Then, given the location vectors of all secondary users, we can obtain the interference graph $G = \{\mathcal{N}, \mathcal{E}\}$ to describe the interference relationship among the users (see Figure 1 for an example). Here the vertex set $\mathcal{N}$ is the same as the secondary user set. The edge set is defined as $\mathcal{E} = \{(i, j) : \|d_{Tx_i}, d_{Rx_j}\| \le \delta_i, \forall i, j \ne i \in \mathcal{N}\}$, where $\|d_{Tx_i}, d_{Rx_j}\|$ is the distance between the transmitter of user $i$ and the receiver of user $j$. As illustrated in Figure 1, an interference edge can be directed or undirected. If an interference edge is directed from secondary user $i$ to user $j$, then user $j$'s data transmission will be affected by user $i$'s transmission on the same channel, but user $i$ will not be affected by user $j$. If the interference edge is undirected$^1$ between user $i$ and user $j$, then the two users can affect each other. Note that a generic directed interference graph can consist of a mixture of directed and undirected edges. In the sequel, we call an interference graph undirected if and only if all the edges of the graph are undirected. We also denote the set of users that can cause interference to user $n$ as $\mathcal{N}_n = \{i : (i, n) \in \mathcal{E}, i \in \mathcal{N}\}$.

Based on the interference model above, we describe the cognitive radio network with a slotted transmission structure as follows:
• Channel State: the channel state for a channel $m$ during time slot $t$ is

$$S_m(t) = \begin{cases} 0, & \text{if channel } m \text{ is occupied by primary transmissions,} \\ 1, & \text{if channel } m \text{ is idle.} \end{cases}$$

• Channel State Transition: for a channel $m$, the channel state $S_m(t)$ is a random variable with a probability density function $\psi_m$. In the following, we denote the channel idle probability $\theta_m$ as the mean of $S_m(t)$, i.e., $\theta_m = E_{\psi_m}[S_m(t)]$.
$^1$ Here the edge is actually bi-directed. We follow the conventions in [22] and [21] and ignore the directions on the edge.
• User Specific Channel Throughput: for each secondary user $n$, its realized data rate $b_m^n(t)$ on an idle channel $m$ in each time slot evolves according to a random process with a mean $B_m^n$, due to users' heterogeneous transmission technologies and local environmental effects such as fading.

• Time Slot Structure: each secondary user $n$ executes the following stages synchronously during each time slot:

  – Channel Sensing: sense one of the channels based on the channel selection decision made at the end of the previous time slot.
  – Channel Contention: Let $a_n$ be the channel selected by user $n$, and $a = (a_1, ..., a_N)$ be the channel selection profile of all users. The probability that user $n$ can grab the chosen idle channel $a_n$ during a time slot is $g_n(\mathcal{N}_n^{a_n}(a)) \in (0, 1)$, which depends on the subset of user $n$'s interfering users that choose the same channel, $\mathcal{N}_n^{a_n}(a) \triangleq \{i \in \mathcal{N}_n : a_i = a_n\}$. Here are two examples:

  1) Random backoff mechanism: the contention stage of a time slot is divided into $\lambda_{\max}$ mini-slots (see Figure 2). Each contending user $n$ first counts down according to a randomly and uniformly generated integer backoff time counter (number of mini-slots) $\lambda_n$ between 1 and $\lambda_{\max}$. If there is no active transmission until the count-down timer expires, the user monitors the channel and transmits RTS/CTS messages on that channel. If multiple users choose the same backoff counter, a collision will occur and no user can grab the channel successfully. Once it successfully grabs the channel, the user starts to transmit its data packet. In this case, we have

$$\begin{aligned} g_n(\mathcal{N}_n^{a_n}(a)) &= \Pr\Big\{\lambda_n < \min_{i \in \mathcal{N}_n : a_i = a_n} \{\lambda_i\}\Big\} \\ &= \sum_{\lambda=1}^{\lambda_{\max}} \Pr\{\lambda_n = \lambda\} \times \Pr\Big\{\lambda_n < \min_{i \in \mathcal{N}_n : a_i = a_n} \{\lambda_i\} \,\Big|\, \lambda_n = \lambda\Big\} \\ &= \sum_{\lambda=1}^{\lambda_{\max}} \frac{1}{\lambda_{\max}} \left(\frac{\lambda_{\max} - \lambda}{\lambda_{\max}}\right)^{K_{a_n}^n(a)}, \end{aligned} \tag{1}$$
  where $K_{a_n}^n(a) = |\mathcal{N}_n^{a_n}(a)| = \sum_{i \in \mathcal{N}_n} I_{\{a_i = a_n\}}$ denotes the number of user $n$'s interfering users choosing the same channel as user $n$.

  2) Aloha mechanism: user $n$ contends for an idle channel with a probability $p_n \in (0, 1)$ in a time slot. If multiple interfering users contend for the same channel, a collision occurs and no user can grab the channel for data transmission. In this case, we have

$$g_n(\mathcal{N}_n^{a_n}(a)) = p_n \prod_{i \in \mathcal{N}_n^{a_n}(a)} (1 - p_i). \tag{2}$$
  Note that for the random backoff mechanism, the channel grabbing probability $g_n(\mathcal{N}_n^{a_n}(a))$ is user homogeneous since it only depends on the number of contending users $K_{a_n}^n(a)$. For the Aloha mechanism, the channel grabbing probability $g_n(\mathcal{N}_n^{a_n}(a))$ is user heterogeneous since it depends on which users (instead of how many users) contend for the channel.

Figure 2: Time slot structure with random backoff mechanism
  – Data Transmission: transmit data packets if the user successfully grabs the channel.

  – Channel Selection: choose a channel to access during the next time slot according to the distributed learning algorithm in Section 5.
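The two channel grabbing probabilities in (1) and (2) can be evaluated directly. The following is a minimal Python sketch, not part of the original paper; the function names `backoff_grab_prob` and `aloha_grab_prob` are illustrative assumptions.

```python
from math import prod

def backoff_grab_prob(k: int, lam_max: int) -> float:
    """Eq. (1): chance that a user with k same-channel interferers draws a
    strictly smaller backoff counter than all of them (lam_max mini-slots)."""
    return sum((1.0 / lam_max) * ((lam_max - lam) / lam_max) ** k
               for lam in range(1, lam_max + 1))

def aloha_grab_prob(p_n: float, interferer_probs: list[float]) -> float:
    """Eq. (2): user n contends (prob. p_n) while every same-channel
    interferer i stays silent (prob. 1 - p_i each)."""
    return p_n * prod(1.0 - p_i for p_i in interferer_probs)

# Hypothetical numbers: one interferer and four mini-slots.
print(backoff_grab_prob(1, 4))             # 0.375
print(aloha_grab_prob(0.5, [0.5, 0.2]))    # 0.5 * 0.5 * 0.8
```

For a large number of mini-slots, `backoff_grab_prob(k, lam_max)` approaches $1/(1+k)$, which is the asymptotic form used in (9).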
Under a fixed channel selection profile $a$, the long-run expected throughput of a secondary user $n$ choosing channel $a_n$ can be computed as

$$U_n(a) = \theta_{a_n} B_{a_n}^n g_n(\mathcal{N}_n^{a_n}(a)). \tag{3}$$

Since our analysis is from the secondary users' perspective, we will use the terms "secondary user" and "user" interchangeably. Due to the page limit, the detailed proofs are given in our technical report [5].
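As an illustration of the system model, the sketch below builds the edge set $\mathcal{E}$ from transmitter and receiver locations and evaluates the throughput in (3). It is a hedged example only: users are indexed from 0, and the coordinates, ranges, rates, and the helper names `interference_graph`, `interferers`, and `throughput` are all hypothetical choices made to mimic the asymmetric situation described for Figure 1.

```python
from math import dist

def interference_graph(tx, rx, ranges):
    """Edge (i, j) exists when receiver j lies within transmitter i's range."""
    users = range(len(tx))
    return {(i, j) for i in users for j in users
            if i != j and dist(tx[i], rx[j]) <= ranges[i]}

def interferers(edges, n):
    """The set N_n of users whose transmissions can interfere with user n."""
    return {i for (i, j) in edges if j == n}

def throughput(n, a, edges, theta, B, grab_prob):
    """Eq. (3): theta[a_n] * B[n][a_n] * g_n(same-channel interferers)."""
    same_channel = {i for i in interferers(edges, n) if a[i] == a[n]}
    return theta[a[n]] * B[n][a[n]] * grab_prob(n, same_channel)

# Hypothetical layout: user 0 disturbs user 1, users 1 and 2 disturb each other.
tx = [(0, 0), (4, 0), (5, 0)]
rx = [(-1, 0), (3, 0), (4, 1)]
ranges = [3.0, 1.5, 2.1]
edges = interference_graph(tx, rx, ranges)

# One channel, Aloha contention with p = 0.5 for every user (assumed values).
theta, B, a = [0.5], [[10.0], [10.0], [10.0]], (0, 0, 0)
aloha = lambda n, same: 0.5 * 0.5 ** len(same)
print(sorted(edges), [throughput(n, a, edges, theta, B, aloha) for n in range(3)])
```

In this layout user 0 receives no interference at all, so its throughput is the highest of the three even though every user picks the same channel.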
3. SPATIAL SPECTRUM ACCESS GAME
We now consider the problem in which each user tries to maximize its own throughput by choosing a proper channel in a distributed manner. Let $a_{-n} = \{a_1, ..., a_{n-1}, a_{n+1}, ..., a_N\}$ be the channels chosen by all other users except user $n$. Given other users' channel selections $a_{-n}$, the problem faced by a user $n$ is

$$\max_{a_n \in \mathcal{M}} U_n(a_n, a_{-n}), \quad \forall n \in \mathcal{N}. \tag{4}$$

The distributed nature of the channel selection problem naturally leads to a formulation based on game theory, such that users can self-organize into a mutually acceptable channel selection (pure Nash equilibrium) $a^* = (a_1^*, a_2^*, ..., a_N^*)$ with

$$a_n^* = \arg\max_{a_n \in \mathcal{M}} U_n(a_n, a_{-n}^*), \quad \forall n \in \mathcal{N}. \tag{5}$$

We thus formulate the distributed channel selection problem on an interference graph $G$ as a spatial spectrum access game $\Gamma = (\mathcal{N}, \mathcal{M}, G, \{U_n\}_{n \in \mathcal{N}})$, where $\mathcal{N}$ is the set of players, $\mathcal{M}$ is the set of strategies, $G$ describes the interference relationship among the players, and $U_n$ is the payoff function of player $n$.

It is known that not every finite strategic game possesses
a pure Nash equilibrium [17]. We then introduce a more general concept of mixed Nash equilibrium. Let $\sigma_n \triangleq (\sigma_1^n, ..., \sigma_M^n)$ denote the mixed strategy of user $n$, where $0 \le \sigma_m^n \le 1$ is the probability of user $n$ choosing channel $m$, and $\sum_{m=1}^{M} \sigma_m^n = 1$. For simplicity, we use the same payoff notation $U_n(\sigma_1, ..., \sigma_N)$ to denote the expected throughput of user $n$ under the mixed strategy profile $(\sigma_1, ..., \sigma_N)$, and it can be computed as

$$U_n(\sigma_1, ..., \sigma_N) = \sum_{a_1=1}^{M} \sigma_{a_1}^1 \cdots \sum_{a_N=1}^{M} \sigma_{a_N}^N U_n(a_1, ..., a_N). \tag{6}$$

Similarly to the pure Nash equilibrium, the mixed Nash equilibrium is defined as:
Definition 1 (Mixed Nash Equilibrium [17]). The mixed strategy profile $\sigma^* = (\sigma_1^*, ..., \sigma_N^*)$ is a mixed Nash equilibrium, if for every user $n \in \mathcal{N}$, we have

$$U_n(\sigma_n^*, \sigma_{-n}^*) \ge U_n(\sigma_n, \sigma_{-n}^*), \quad \forall \sigma_n \ne \sigma_n^*,$$

where $\sigma_{-n}^*$ denotes the mixed strategy choices of all other users except user $n$.
Note that the pure Nash equilibrium is a special case of the mixed Nash equilibrium, wherein every user chooses a single channel with probability one. One critical issue in game theory is the existence of both mixed and pure Nash equilibria, which motivates the study in the following Section 4.
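The expected throughput in (6) is a weighted average over all pure channel selection profiles. A minimal Python sketch of that computation follows; it assumes channels and users are indexed from 0, and the two-user payoff in the example is made up purely for illustration.

```python
from itertools import product
from math import prod

def expected_payoff(n, sigma, payoff, num_channels):
    """Eq. (6): average U_n over all pure profiles a, each weighted by the
    product of every user's probability of choosing its channel in a."""
    num_users = len(sigma)
    total = 0.0
    for a in product(range(num_channels), repeat=num_users):
        weight = prod(sigma[i][a[i]] for i in range(num_users))
        total += weight * payoff(n, a)
    return total

# Hypothetical payoff: full rate alone on a channel, half rate when sharing.
share = lambda n, a: 1.0 if a[0] != a[1] else 0.5

uniform = [[0.5, 0.5], [0.5, 0.5]]
pure = [[1.0, 0.0], [0.0, 1.0]]   # a pure profile as a degenerate mixed one
print(expected_payoff(0, uniform, share, 2), expected_payoff(0, pure, share, 2))
```

The enumeration grows as $M^N$, so this direct form is only practical for small examples.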
4. EXISTENCE OF NASH EQUILIBRIA
In this part, we study the existence of Nash equilibria in a spatial spectrum access game. Since a spatial spectrum access game is a finite strategic game (i.e., with a finite number of players and a finite number of channels), we know that it always admits a mixed Nash equilibrium according to [17].

On the other hand, not every finite strategic game possesses a pure Nash equilibrium [17]. A pure Nash equilibrium is preferable to a general mixed strategy Nash equilibrium, since in a pure strategy equilibrium users can achieve mutually acceptable channel selections without randomly picking and switching channels all the time. This motivates us to further investigate the existence of pure Nash equilibria of the spatial spectrum access games.
4.1 Existence of Pure Nash Equilibria on Directed Interference Graphs
We first study the existence of pure Nash equilibria on directed interference graphs.

First of all, we can construct a game which does not have a pure Nash equilibrium.
Theorem 1. There exists a spatial spectrum access game on a directed interference graph not admitting any pure Nash equilibrium.
Figure 3 shows such an example. It is easy to verify that for all 8 possible channel selection profiles, there always exists one user (out of these three users) having an incentive to change its channel selection unilaterally to improve its throughput.

We then focus on identifying the conditions under which the game admits a pure Nash equilibrium. To proceed, we first introduce the following lemma.
Lemma 1. Consider any spatial spectrum access game on a given directed interference graph $G$ that has a pure Nash equilibrium. Then we can construct a new spatial spectrum access game by adding a new player, who cannot generate interference to any player in the original game and may receive interference from one or multiple players in the original game. The new game also has a pure Nash equilibrium.
We know that any directed acyclic graph (i.e., a directed graph that contains no directed cycles) can be given a topological sort (i.e., an ordering of the nodes), such that if node $i < j$ in the ordering then there are no edges directed from node $j$ to node $i$ [3]. From Lemma 1, we know that

Corollary 1. Any spatial spectrum access game on a directed acyclic graph has a pure Nash equilibrium.

Figure 3: An example of a spatial spectrum access game without pure Nash equilibria. There are two channels available and the throughput of a user $n$ is given as $U_n(a) = p \prod_{i \in \mathcal{N}_n^{a_n}(a)} (1 - p)$. If all three players (nodes) choose channel 1, then each player has the incentive of choosing channel 2 to improve its throughput, assuming that the other two players do not change their channel choices. We can show that such a deviation will happen for all 8 possible strategy profiles $a = (a_1, a_2, a_3)$, where $a_i \in \{1, 2\}$ for $i \in \{1, 2, 3\}$.
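The claim behind Theorem 1 can be checked by brute force over all channel selection profiles. The sketch below is illustrative only: the drawing of Figure 3 is not available here, so the directed 3-cycle is an assumed stand-in that matches the caption's payoff $U_n(a) = p \prod (1-p)$, and the helper names and the value $p = 0.5$ are hypothetical. Users and channels are indexed from 0.

```python
from itertools import product

def pure_nash_equilibria(num_users, num_channels, payoff):
    """Return every profile a from which no user gains by switching alone."""
    equilibria = []
    for a in product(range(num_channels), repeat=num_users):
        stable = all(
            payoff(n, a) >= payoff(n, a[:n] + (m,) + a[n + 1:])
            for n in range(num_users) for m in range(num_channels))
        if stable:
            equilibria.append(a)
    return equilibria

def aloha_payoff(edges, p=0.5):
    """U_n(a) = p * (1 - p)^k, with k same-channel interferers of user n."""
    def payoff(n, a):
        k = sum(1 for (i, j) in edges if j == n and a[i] == a[n])
        return p * (1 - p) ** k
    return payoff

cycle = {(0, 1), (1, 2), (2, 0)}   # assumed directed 3-cycle (not from the paper)
chain = {(0, 1), (1, 2)}           # a directed acyclic graph for contrast

print(pure_nash_equilibria(3, 2, aloha_payoff(cycle)))   # []
print(pure_nash_equilibria(3, 2, aloha_payoff(chain)))   # [(0, 1, 0), (1, 0, 1)]
```

On the cycle every user wants to avoid the channel of the one user pointing at it, which is impossible for three users and two channels; on the acyclic chain two pure equilibria exist, in line with Corollary 1.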
To obtain more insightful results, we next impose the following property on the spatial spectrum access games:

Definition 2 (Congestion Property). User $n$'s channel grabbing probability $g_n(\mathcal{N}_n^{a_n}(a))$ satisfies the congestion property if for any $\tilde{\mathcal{N}}_n^{a_n}(a) \subseteq \mathcal{N}_n^{a_n}(a)$, we have

$$g_n(\tilde{\mathcal{N}}_n^{a_n}(a)) \ge g_n(\mathcal{N}_n^{a_n}(a)). \tag{7}$$

Furthermore, a spatial spectrum access game satisfies the congestion property if (7) holds for all users $n \in \mathcal{N}$.

The congestion property means that the more contending users there are, the smaller the chance that a user can grab the channel. Such a property is natural for practical wireless systems such as the random backoff and Aloha systems. We can show that
Lemma 2. Consider any spatial spectrum access game satisfying the congestion property on a given directed interference graph $G$ that has a pure Nash equilibrium. Then we can construct a new spatial spectrum access game by adding a new player, whose channel grabbing probability satisfies the congestion property and who can have an interference relationship with at most one player $n \in \mathcal{N}$ in the original game. The new game also has a pure Nash equilibrium.

Definition 3 (Directed Tree [3]). A directed graph is called a directed tree if the corresponding undirected graph obtained by ignoring the directions on the edges of the original directed graph is a tree.

Note that an (undirected) tree is a special case of directed trees. Since any spatial spectrum access game over a single node always has a pure Nash equilibrium, we can then construct the directed tree recursively by introducing a new node and adding a (directed or undirected) edge between this node and one existing node. From Lemma 2, we obtain that

Corollary 2. Any spatial spectrum access game satisfying the congestion property on a directed tree has a pure Nash equilibrium.

Definition 4 (Directed Forest [3]). A directed graph is called a directed forest if it consists of a disjoint union of directed trees.
Similarly, we can obtain from Lemma 2 that

Corollary 3. Any spatial spectrum access game satisfying the congestion property on a directed forest has a pure Nash equilibrium.

As illustrated in Figure 4, according to Lemmas 1 and 2, we can construct more complicated directed interference graphs over which a spatial spectrum access game satisfying the congestion property has a pure Nash equilibrium.

Figure 4: An interference graph that consists of directed acyclic graphs and directed trees
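The inductive argument of Lemma 1 suggests a constructive procedure on directed acyclic graphs: fix channels in topological order so that each user best-responds to interferers whose choices are already final. The following Python sketch is one possible reading of that argument, not the authors' algorithm; it assumes users are labeled 0 to N-1, that a user's payoff depends only on its own channel and on its interferers' channels, and the rates in the example are made up.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def dag_equilibrium(num_channels, preds, payoff):
    """preds[n] is the set N_n of users that can interfere with user n.
    Raises graphlib.CycleError if the graph contains a directed cycle."""
    a = [0] * len(preds)                       # placeholder channels
    for n in TopologicalSorter(preds).static_order():   # interferers come first
        def value(m, n=n):
            return payoff(n, tuple(a[:n] + [m] + a[n + 1:]))
        a[n] = max(range(num_channels), key=value)       # best response
    return tuple(a)

# Hypothetical 3-user DAG with edges 0->1, 0->2, 1->2 and made-up rates.
preds = {0: set(), 1: {0}, 2: {0, 1}}
rate = [[1.0, 2.0], [3.0, 1.0], [1.0, 1.0]]

def payoff(n, a, p=0.5):
    k = sum(1 for i in preds[n] if a[i] == a[n])   # same-channel interferers
    return rate[n][a[n]] * p * (1 - p) ** k

eq = dag_equilibrium(2, preds, payoff)
print(eq)   # (1, 0, 0)
```

Because a later user never interferes with an earlier one, no earlier choice is invalidated by later choices, so the resulting profile is a pure Nash equilibrium.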
4.2 Existence of Pure Nash Equilibria on Undirected Interference Graphs
We now study the case in which the interference graph is undirected. This is a good approximation of reality if the transmitter of each user is close to its receiver, and all users' transmit powers are roughly the same.

When an undirected interference graph is a tree, according to Corollary 2, any spatial spectrum access game satisfying the congestion property has a pure Nash equilibrium. However, for non-tree undirected graphs without a topological sort, the existence of a pure Nash equilibrium cannot be proved following the results in Section 4.1. This motivates us to further study the existence of pure Nash equilibria on generic undirected interference graphs.

First of all, [15] showed that a 3-player and 3-resource
congestion game with user-specific congestion weights may not have a pure Nash equilibrium. Such a congestion game can be considered as a spatial spectrum access game on a complete undirected interference graph (by regarding the resources as channels). When all users have homogeneous channel contention capabilities and all channels have the same mean data rates, [22] showed that the spatial spectrum access game on any undirected interference graph has a pure Nash equilibrium. Clearly, the applicability of such a channel-homogeneous model is quite limited, since the channel throughputs in practical wireless networks are often heterogeneous. We hence next focus on exploring the random backoff and Aloha systems with user-specific data rates, which provide useful insights for the user-homogeneous and user-heterogeneous channel contention mechanisms, respectively.

Here we resort to a useful tool, the potential game$^2$, which is defined as

Definition 5 (Potential Game [16]). A game is called a potential game if it admits a potential function $\Phi(a)$ such that for every $n \in \mathcal{N}$ and $a_{-n} \in \mathcal{M}^{N-1}$,

$$\mathrm{sgn}\big(\Phi(a_n', a_{-n}) - \Phi(a_n, a_{-n})\big) = \mathrm{sgn}\big(U_n(a_n', a_{-n}) - U_n(a_n, a_{-n})\big),$$

where $\mathrm{sgn}(\cdot)$ is the sign function.

$^2$ Note that it is much more difficult to find a proper potential function to take into account users' asymmetric relationships (i.e., directions of edges on the graph) when the interference graph is directed. Hence in this study we only apply the tool of potential game in the undirected case.
Definition 6 (Better Response Update [16]). The event where a player $n$ changes to an action $a_n'$ from the action $a_n$ is a better response update if and only if $U_n(a_n', a_{-n}) > U_n(a_n, a_{-n})$.
An appealing property of the potential game is that it always admits a pure Nash equilibrium and the finite improvement property, which is defined as

Definition 7 (Finite Improvement Property [16]). A game has the finite improvement property if any asynchronous better response update process (i.e., no more than one player updates the strategy at any given time) terminates at a pure Nash equilibrium within a finite number of updates.
Based on potential game theory, we first study the random backoff mechanism. We show in Theorem 2 that when the undirected interference graph is complete, there indeed exists a pure Nash equilibrium.
Theorem 2. Any spatial spectrum access game on a complete undirected interference graph with the random backoff mechanism is a potential game with the potential function

$$\Phi(a) = \prod_{n=1}^{N} \theta_{a_n} B_{a_n}^n \prod_{m=1}^{M} \prod_{c=1}^{K_m(a)} g_n(c), \tag{8}$$

where $K_m(a)$ is the number of users choosing channel $m$ under the strategy profile $a$, and hence has a pure Nash equilibrium.
We then consider the random backoff mechanism in the asymptotic case that $\lambda_{\max}$ goes to infinity. This can be a good approximation of reality when the number of backoff mini-slots is much greater than the number of interfering users, and collisions rarely occur. In this case, we have

$$g_n(\mathcal{N}_n^{a_n}(a)) = \lim_{\lambda_{\max} \to \infty} \sum_{\lambda=1}^{\lambda_{\max}} \frac{1}{\lambda_{\max}} \left(\frac{\lambda_{\max} - \lambda}{\lambda_{\max}}\right)^{K_{a_n}^n(a)} = \int_0^1 x^{K_{a_n}^n(a)} \, dx = \frac{1}{1 + K_{a_n}^n(a)}, \tag{9}$$

where $K_{a_n}^n(a)$ denotes the number of users that choose channel $a_n$ and can interfere with user $n$. Equation (9) implies that the channel opportunity is equally shared among the $1 + K_{a_n}^n(a)$ contending users (including user $n$). We consider the user-specific throughput as

$$U_n(a_n, a_{-n}) = h_n \theta_{a_n} B_{a_n} \frac{1}{1 + K_{a_n}^n(a)}, \tag{10}$$

where $h_n$ is regarded as a user-specific transmission gain. For example, a user $n$ can possess a higher transmission gain if the distance between its transmitter and receiver is shorter than that of other users, given that all the users transmit with the same power level. We show that
Theorem 3. Any spatial spectrum access game on any undirected interference graph with user-specific transmission gains and the random backoff mechanism in the asymptotic case is a potential game with the potential function

$$\Phi(a) = -\sum_{n=1}^{N} \frac{1 + \frac{1}{2} K_{a_n}^n(a)}{\theta_{a_n} B_{a_n}}, \tag{11}$$

and hence has a pure Nash equilibrium.
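The sign condition of Definition 5 can be verified numerically for the potential function (11) on a small instance. The sketch below is a brute-force check under stated assumptions, not a proof: `rate[m]` stands for the product $\theta_m B_m$, and the 4-user undirected graph, the gains `h`, and the rates are hypothetical values. Users and channels are indexed from 0.

```python
from itertools import product

def same_channel_neighbors(n, a, neighbors):
    return sum(1 for i in neighbors[n] if a[i] == a[n])

def utility(n, a, neighbors, h, rate):
    """Eq. (10), with rate[m] standing for theta_m * B_m."""
    return h[n] * rate[a[n]] / (1 + same_channel_neighbors(n, a, neighbors))

def potential(a, neighbors, rate):
    """Eq. (11)."""
    return -sum((1 + 0.5 * same_channel_neighbors(n, a, neighbors)) / rate[a[n]]
                for n in range(len(a)))

def is_potential_game(neighbors, h, rate, tol=1e-12):
    """Check that every unilateral switch improves U_n iff it improves Phi."""
    users, channels = range(len(neighbors)), range(len(rate))
    for a in product(channels, repeat=len(neighbors)):
        for n in users:
            for m in channels:
                b = a[:n] + (m,) + a[n + 1:]
                du = utility(n, b, neighbors, h, rate) - utility(n, a, neighbors, h, rate)
                dphi = potential(b, neighbors, rate) - potential(a, neighbors, rate)
                if (du > tol) != (dphi > tol):
                    return False
    return True

# Hypothetical undirected graph (symmetric neighbor lists), gains and rates.
neighbors = [[1, 3], [0, 2, 3], [1, 3], [0, 1, 2]]
h = [1.0, 2.0, 0.5, 1.3]
rate = [1.0, 1.7]
print(is_potential_game(neighbors, h, rate))   # True
```

The factor $\frac{1}{2}$ in (11) accounts for each same-channel pair being counted once from each endpoint, which is why the check relies on the neighbor lists being symmetric.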
We now consider the Aloha mechanism. According to (2), we have the throughput function as

$$U_n(a) = \theta_{a_n} B_{a_n}^n p_n \prod_{i \in \mathcal{N}_n^{a_n}(a)} (1 - p_i). \tag{12}$$

We can show that

Theorem 4. Any spatial spectrum access game on any undirected interference graph with the Aloha mechanism is a potential game with the potential function

$$\Phi(a) = \sum_{i=1}^{N} -\log(1 - p_i) \times \left( \frac{1}{2} \sum_{j \in \mathcal{N}_i^{a_i}(a)} \log(1 - p_j) + \log\big(\theta_{a_i} B_{a_i}^i p_i\big) \right), \tag{13}$$

and hence has a pure Nash equilibrium.
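One way to see why (13) satisfies Definition 5 is the short calculation below. It is a sketch that follows from (12) and (13) rather than the proof in the technical report [5]; the shorthand $c_i$ is introduced here only for convenience.

```latex
% Shorthand (an assumption of this sketch): c_i = -\log(1 - p_i) > 0.
% Taking logarithms in (12):
\log U_n(a) = \log\big(\theta_{a_n} B^n_{a_n} p_n\big) - \sum_{j \in \mathcal{N}_n^{a_n}(a)} c_j .
% On an undirected graph every same-channel pair {i, j} appears twice in (13),
% once from each endpoint, so the factor 1/2 leaves one term per pair:
\Phi(a) = \sum_{i=1}^{N} c_i \log\big(\theta_{a_i} B^i_{a_i} p_i\big)
          - \sum_{\{i,j\}:\, j \in \mathcal{N}_i^{a_i}(a)} c_i c_j .
% If only user n switches from a_n to a_n', only the terms involving n change:
\Phi(a_n', a_{-n}) - \Phi(a_n, a_{-n})
  = c_n \Big[ \log U_n(a_n', a_{-n}) - \log U_n(a_n, a_{-n}) \Big].
% Since c_n > 0 and \log is increasing, both differences have the same sign.
```

The symmetry of the interference relation is used in the second step, which is consistent with the restriction of Theorem 4 to undirected graphs.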
When a spatial spectrum access game is a potential game, we can design a distributed algorithm such that each user asynchronously updates its channel selection myopically to increase its throughput. According to the finite improvement property of the potential game, such an algorithm can achieve a pure Nash equilibrium within a finite number of iterations. However, asynchronous better response updates require each user to have complete information about other users' channel selections. This can only be achieved with extensive information exchange among the users, which may not always be feasible. It is therefore desirable to design a distributed algorithm that achieves the equilibrium without information exchange.
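The asynchronous better response process of Definitions 6 and 7 can be sketched as follows. This is an illustrative simulation under assumptions, not the learning algorithm of Section 5: it gives every user full knowledge of the profile (exactly the information requirement criticized above), picks the updating user at random, and uses a hypothetical 5-user undirected ring with made-up Aloha probabilities and rates.

```python
import random

def better_response_dynamics(a, num_channels, payoff, rng, max_updates=10_000):
    """One user at a time moves to a strictly better channel (Definition 6)
    until no user can improve, i.e., a pure Nash equilibrium is reached."""
    a = list(a)
    for _ in range(max_updates):
        improving = [(n, m) for n in range(len(a)) for m in range(num_channels)
                     if payoff(n, tuple(a[:n] + [m] + a[n + 1:])) > payoff(n, tuple(a))]
        if not improving:
            return tuple(a)
        n, m = rng.choice(improving)   # asynchronous: a single update per step
        a[n] = m
    raise RuntimeError("no equilibrium reached within max_updates")

# Hypothetical undirected ring, contention probabilities and rates theta*B.
neighbors = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
p = [0.3, 0.5, 0.4, 0.6, 0.2]
rate = [[1.0, 1.2], [0.8, 1.1], [1.5, 0.9], [1.0, 1.0], [0.7, 1.3]]

def aloha_payoff(n, a):
    """Eq. (12) with rate[n][m] standing for theta_m * B_m^n."""
    u = rate[n][a[n]] * p[n]
    for i in neighbors[n]:
        if a[i] == a[n]:
            u *= 1.0 - p[i]
    return u

result = better_response_dynamics((0, 0, 0, 0, 0), 2, aloha_payoff, random.Random(0))
print(result)
```

Since the ring is undirected, Theorem 4 applies and the process is expected to stop after finitely many updates; the `max_updates` guard is only a safety net for inputs that are not potential games.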
5. DISTRIBUTED LEARNING FOR SPATIAL SPECTRUM ACCESS
In this part, we discuss how to achieve an equilibrium for the spatial spectrum access games. As shown in Section 4, a generic spatial spectrum access game does not necessarily have a pure Nash equilibrium, and thus it is impossible to design a mechanism achieving pure Nash equilibria in general. We hence aim to approach the mixed Nash equilibria. Govindan and Wilson in [10] proposed a global Newton method to compute the mixed Nash equilibria for any finite strategic game. This method hence can be applied to find the mixed Nash equilibria for the spatial spectrum access games. However, such an approach is a centralized optimization, which requires that each user has complete information about the other users and computes the solution accordingly. This is often infeasible in a cognitive radio network, since acquiring complete information requires heavy information exchange among the users, and setting up and maintaining a common control channel for message broadcasting demands high system overheads [1]. Moreover, this approach is not incentive compatible, since some users may not be willing to share their local information due to the energy consumption of information broadcasting. We thus propose a distributed learning algorithm for any spatial spectrum access game, and the algorithm does not require any information exchange among users. Each user only estimates its expected throughput locally, and learns to adjust its channel selection strategy adaptively. We show that the distributed learning algorithm can converge to a mixed Nash equilibrium approximately.

Figure 5: Time structure of a decision period
5.1 Expected Throughput Estimation
We first introduce the estimation of a user's expected throughput based on local observations. To achieve an accurate estimation, a user needs to gather a large number of local observation samples. This motivates us to divide the spectrum access time into a sequence of decision periods indexed by $T (= 1, 2, ...)$, where each decision period consists of $t_{\max}$ time slots (see Figure 5 for an illustration). During a single decision period, a user accesses the same channel in all $t_{\max}$ time slots. Thus the total number of users accessing each channel does not change within a decision period, which allows users to better learn the environment.

Suppose user $n$ chooses channel $m$ to access at decision period $T$. According to (3), a user's expected throughput during period $T$ depends on the probability of grabbing the channel $g_n(\mathcal{N}_n^{a_n}(a(T)))$ in that period, the channel idle probability $\theta_m$, and the mean data rate $B_m^n$. Similarly to our work in [7] on the expected throughput estimation for imitation-based spectrum sharing mechanism design, we will apply the maximum likelihood estimation (MLE) to get accurate estimations of these parameters for the distributed learning mechanism design, due to MLE's efficiency and ease of implementation.
5.1.1 Maximum Likelihood Estimation
At the beginning of each time slot $t (= 1, ..., t_{\max})$ of a decision period $T$, a user $n$ will sense the same channel $a_n(T) = m$. If the channel is idle, the user will compete to grab the channel according to a specified channel contention mechanism. At the end of each time slot $t$, a user observes $S_m^n(T, t)$, $I_m^n(T, t)$, and $b_m^n(T, t)$. Here $S_m^n(T, t)$ denotes the state of the chosen channel $m$ (i.e., whether it is occupied by the primary traffic), and $I_m^n(T, t)$ indicates whether the user has successfully grabbed the channel, i.e.,

$$I_m^n(T, t) = \begin{cases} 1, & \text{if user } n \text{ successfully grabs the channel } m, \\ 0, & \text{otherwise,} \end{cases}$$

and $b_m^n(T, t)$ is the received data rate on the chosen channel $m$ by user $n$ at time slot $t$. At the end of each decision period $T$, each user $n$ can collect a set of local observations $\Omega_n(T) = \{S_m^n(T, t), I_m^n(T, t), b_m^n(T, t)\}_{t=1}^{t_{\max}}$. Note that if $S_m^n(T, t) = 0$ (i.e., the channel is occupied by the primary traffic), we also set $I_m^n(T, t)$ and $b_m^n(T, t)$ to be 0.

When the channel $m$ is idle (i.e., no primary traffic), user
n will grab the channel with probability gn(N ann (a(T ))).
Since there are a total of∑tmaxt=1 Snm(T, t) rounds of channel
contentions in the period T and each round is independentand identically distributed (i.i.d.), the total number of suc-cessful channel captures
∑tmaxt=1 Inm(T, t) follows the Binomial
distribution. A user n can then compute the likelihood ofgn(N an
n (a(T ))), i.e., the probability of the realized observa-tions Ωn(T ) given the parameter gn(N an
n (a(T ))) as
L[Ωn(T )|gn(N ann (a(T )))]
=
( ∑tmaxt=1 Snm(T, t)∑tmaxt=1 Inm(T, t)
)gn(N an
n (a(T )))∑tmax
t=1 Inm(T,t)
× (1− gn(N ann (a(T ))))
∑tmaxt=1 Sn
m(T,t)−∑tmaxt=1 Inm(T,t).
Then MLE of gn(N ann (a(T ))) can be computed by maximiz-
ing the log-likelihood function lnL[Ωn(T )|gn(N ann (a(T )))],
i.e., maxg(k(T )) lnL[Ωn(T )|gn(N ann (a(T )))]. By the first or-
der condition, we obtain the optimal solution as gn(N ann (a(T ))) =
∑tmaxt=1 Inm(T,t)
∑tmaxt=1 Sn
m(T,t), which is the sample averaging estimation.
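The first-order condition can be spelled out in a few lines. As a shorthand that is not part of the paper's notation, write $g$ for $g_n(\mathcal{N}_n^{a_n}(a(T)))$, $K = \sum_{t} S^n_m(T,t)$ for the number of contention rounds, and $k = \sum_{t} I^n_m(T,t)$ for the number of successful captures, and assume $K > 0$.

```latex
\begin{align*}
\ln L &= \ln\binom{K}{k} + k \ln g + (K-k)\ln(1-g),\\
\frac{\partial \ln L}{\partial g} &= \frac{k}{g} - \frac{K-k}{1-g} = 0
  \;\Longrightarrow\; k(1-g) = (K-k)\,g
  \;\Longrightarrow\; \hat{g} = \frac{k}{K},\\
\frac{\partial^2 \ln L}{\partial g^2} &= -\frac{k}{g^2} - \frac{K-k}{(1-g)^2} < 0
  \qquad (0 < g < 1).
\end{align*}
```

The negative second derivative indicates that the stationary point is a maximum of the log-likelihood on $(0, 1)$.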
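When the length of the decision period $t_{\max}$ is large, by the central limit theorem, we know that

$$\hat{g}_n(\mathcal{N}_n^{a_n}(a(T))) \sim \mathcal{N}\left( g_n(\mathcal{N}_n^{a_n}(a(T))), \; \frac{g_n(\mathcal{N}_n^{a_n}(a(T))) \left(1 - g_n(\mathcal{N}_n^{a_n}(a(T)))\right)}{\sum_{t=1}^{t_{\max}} S^n_m(T,t)} \right),$$

where $\mathcal{N}(\cdot)$ denotes the normal distribution.

Similarly, we can apply the MLE to estimate the channel idle probability $\theta_m$ and the mean channel data rate $B^n_m$. More specifically, when the channel state $S_m(t)$ and the realized data rate $b^n_m(t)$ are i.i.d. random variables, we can easily obtain the closed-form estimates as

$$\hat{\theta}_m = \frac{\sum_{t=1}^{t_{\max}} S^n_m(T,t)}{t_{\max}} \quad \text{and} \quad \hat{B}^n_m = \frac{\sum_{t=1}^{t_{\max}} b^n_m(T,t)}{\sum_{t=1}^{t_{\max}} I^n_m(T,t)},$$

respectively (see footnote 3). By the MLE, we obtain the estimates of $g_n(\mathcal{N}_n^{a_n}(a(T)))$, $\theta_m$, and $B^n_m$ as $\hat{g}_n(\mathcal{N}_n^{a_n}(a(T)))$, $\hat{\theta}_m$, and $\hat{B}^n_m$, respectively, and then estimate the true expected throughput $U_n(a(T))$ as $\hat{U}_n(a(T)) = \hat{\theta}_m \hat{B}^n_m \hat{g}_n(\mathcal{N}_n^{a_n}(a(T)))$. Since, according to the central limit theorem, $\hat{g}_n$, $\hat{\theta}_m$, and $\hat{B}^n_m$ follow independent normal distributions with the means $g_n(\mathcal{N}_n^{a_n}(a(T)))$, $\theta_m$, and $B^n_m$, respectively, we have

$$E[\hat{U}_n(a(T))] = E[\hat{\theta}_m \hat{B}^n_m \hat{g}_n(\mathcal{N}_n^{a_n}(a(T)))] = \theta_m B^n_m g_n(\mathcal{N}_n^{a_n}(a(T))) = U_n(a(T)),$$

i.e., the estimate of the expected throughput $U_n(a(T))$ is unbiased.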
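The three closed-form estimates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_throughput`, the list-based inputs, and the fallback to 0 when a ratio is undefined (no idle slot or no capture in the period) are choices made here, not part of the paper.

```python
from typing import List, Tuple

def estimate_throughput(S: List[int], I: List[int], b: List[float]) -> Tuple[float, float, float, float]:
    """Return (g_hat, theta_hat, B_hat, U_hat) from one decision period.

    S[t] = 1 if the channel was idle in slot t, else 0.
    I[t] = 1 if the user grabbed the channel in slot t, else 0.
    b[t] = data rate received in slot t (0 when I[t] == 0).
    """
    t_max = len(S)
    idle_slots = sum(S)
    captures = sum(I)
    theta_hat = idle_slots / t_max
    # Assumption: use 0 when a ratio is undefined (no idle slot / no capture).
    g_hat = captures / idle_slots if idle_slots > 0 else 0.0
    B_hat = sum(b) / captures if captures > 0 else 0.0
    U_hat = theta_hat * B_hat * g_hat
    return g_hat, theta_hat, B_hat, U_hat

# Hypothetical observations for a period of 8 slots.
S = [1, 1, 0, 1, 0, 1, 1, 0]
I = [1, 0, 0, 1, 0, 0, 1, 0]
b = [4.0, 0.0, 0.0, 6.0, 0.0, 0.0, 5.0, 0.0]
g_hat, theta_hat, B_hat, U_hat = estimate_throughput(S, I, b)
```

Whenever the user captures the channel at least once, the product of the three estimates telescopes to the total received rate divided by `t_max`, so the throughput estimate coincides with the per-slot sample average of the received data rate.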
5.2 Distributed Learning Algorithm
Based on the expected throughput estimation, we now propose the distributed learning algorithm for spatial spectrum access games. The idea is to extend the principle of single-agent reinforcement learning to a multi-agent setting. Such multi-agent reinforcement learning algorithms have also been applied to the classical congestion games on complete graphs [14,20]. Here we apply the learning algorithm to the generalized spatial congestion games on generic graphs and derive the convergence conditions accordingly. The algorithm works as follows.
Footnote 3: When $S_m(t)$ and $b^n_m(t)$ are non-i.i.d. random variables, the MLE can also be derived from the specific probability distribution functions by following a procedure similar to the one introduced in Section 5.1.1.
At the beginning of each period $T$, a user $n \in \mathcal{N}$ chooses a channel $a_n(T) \in \mathcal{M}$ to access according to its mixed strategy $\sigma_n(T) = (\sigma^n_m(T), \forall m \in \mathcal{M})$, where $\sigma^n_m(T)$ is the probability of choosing channel $m$. The mixed strategy is generated according to $P_n(T) = (P^n_m(T), \forall m \in \mathcal{M})$, which represents the user's perceptions of the payoff of choosing different channels based on local estimates. Perceptions are based on past local observations and may not accurately reflect the expected payoff. For example, if a user $n$ has not accessed a channel $m$ for many decision periods, then the perception $P^n_m(T)$ can be out of date. The key challenge for the learning algorithm is to update the perceptions with proper parameters such that the perceptions equal the expected payoffs at the equilibrium.

Similarly to single-agent learning, we choose the Boltzmann distribution as the mapping from perceptions to mixed strategies, i.e.,

$$\sigma^n_m(T) = \frac{e^{\gamma P^n_m(T)}}{\sum_{i=1}^{M} e^{\gamma P^n_i(T)}}, \quad \forall m \in \mathcal{M}, \tag{14}$$

where $\gamma$ is the temperature that controls the randomness of channel selections. When $\gamma \to 0$, each user chooses channels uniformly at random. When $\gamma \to \infty$, user $n$ always chooses the channel with the largest perception value $P^n_m(T)$ among all channels $m \in \mathcal{M}$. We will show later that the choice of $\gamma$ trades off the convergence and the performance of the learning algorithm.

At the end of a decision period $T$, a user $n$ computes its estimated expected payoff $\hat{U}_n(a(T))$ as in Section 5.1 (i.e., by using the MLE method based on the set of local observations $\Omega_n(T)$ during the period), and adjusts its perceptions as

$$P^n_m(T+1) = \begin{cases} (1 - \mu_T) P^n_m(T) + \mu_T \hat{U}_n(a(T)), & \text{if } a_n(T) = m, \\ P^n_m(T), & \text{otherwise,} \end{cases} \tag{15}$$

where $(\mu_T \in (0,1), \forall T)$ are the smoothing factors. A user only changes the perception of the channel just accessed in the current decision period, and keeps the perceptions of the other channels unchanged.

Algorithm 1 summarizes the distributed learning algorithm. Next we study the convergence of the learning algorithm based on the theory of stochastic approximation [12].
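The two building blocks of the algorithm, the Boltzmann mapping (14) and the perception update (15), can be sketched as follows. The function names and the max-subtraction trick for numerical stability are choices made for this illustration and are not taken from the paper; the trick leaves the probabilities in (14) unchanged.

```python
import math
import random
from typing import List

def boltzmann_strategy(perceptions: List[float], gamma: float) -> List[float]:
    """Mixed strategy of (14): sigma_m proportional to exp(gamma * P_m)."""
    # Subtracting the maximum leaves the ratios unchanged and avoids overflow.
    peak = max(perceptions)
    weights = [math.exp(gamma * (p - peak)) for p in perceptions]
    total = sum(weights)
    return [w / total for w in weights]

def update_perceptions(perceptions: List[float], chosen: int,
                       payoff_estimate: float, mu: float) -> List[float]:
    """Update of (15): smooth only the entry of the channel just accessed."""
    updated = list(perceptions)
    updated[chosen] = (1.0 - mu) * perceptions[chosen] + mu * payoff_estimate
    return updated

# Hypothetical example with three channels and equal initial perceptions.
P = [0.2, 0.2, 0.2]
sigma = boltzmann_strategy(P, gamma=5.0)
channel = random.choices(range(len(P)), weights=sigma)[0]  # draw a channel from sigma
P_next = update_perceptions(P, chosen=1, payoff_estimate=1.0, mu=0.5)
```

With equal perceptions the strategy is uniform regardless of `gamma`; once one perception dominates, a larger `gamma` concentrates the probability mass on that channel, matching the two limits described for (14).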
5.3 Convergence of Distributed Learning Algorithm
We now study the convergence of the proposed distributed learning algorithm.

First, the perception value update in (15) can be written in the following equivalent form,

$$P^n_m(T+1) - P^n_m(T) = \mu_T \left[ Z^n_m(T) - P^n_m(T) \right], \quad \forall n \in \mathcal{N}, m \in \mathcal{M}, \tag{16}$$

where $Z^n_m(T)$ is the update value defined as

$$Z^n_m(T) = \begin{cases} \hat{U}_n(a(T)), & \text{if } a_n(T) = m, \\ P^n_m(T), & \text{otherwise.} \end{cases} \tag{17}$$

For the sake of brevity, we denote the perception values, update values, and mixed strategies of all the users as $P(T) \triangleq (P^n_m(T), \forall m \in \mathcal{M}, n \in \mathcal{N})$, $Z(T) \triangleq (Z^n_m(T), \forall m \in \mathcal{M}, n \in \mathcal{N})$, and $\sigma(T) \triangleq (\sigma^n_m(T), \forall m \in \mathcal{M}, n \in \mathcal{N})$, respectively.
Algorithm 1 Distributed Learning Algorithm for Spatial Spectrum Access Game
1: initialization:
2:   set the temperature $\gamma$.
3:   set the initial perception values $P^n_m(0) = \frac{1}{M}$ for each user $n \in \mathcal{N}$.
4:   set the period index $T = 0$.
5: end initialization
6: loop for each decision period $T$ and each user $n \in \mathcal{N}$ in parallel:
7:   select a channel $m \in \mathcal{M}$ according to (14).
8:   for each time slot $t$ in the period $T$ do
9:     sense and contend to access the channel $m$.
10:    record the observations $S^n_m(T,t)$, $I^n_m(T,t)$, and $b^n_m(T,t)$.
11:  end for
12:  estimate $g_n(\mathcal{N}_n^{a_n}(a(T)))$, $\theta_m$, and $B^n_m$ by the maximum likelihood estimation.
13:  compute the estimated expected payoff $\hat{U}_n(a(T))$.
14:  update the perception values $P_n(T)$ according to (15).
15:  set the period index $T = T + 1$.
16: end loop
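An end-to-end run of Algorithm 1 can be simulated with a short script. The sketch below makes several assumptions that are not stated in this section: contention follows a simple Aloha model in which user `n` transmits on an idle channel with probability `p[n]` and succeeds only if no interfering neighbor on the same channel transmits in that slot; the data rate is the constant `B[n][m]` rather than a fading rate; the smoothing factor is the one used later in Section 6. The name `run_learning` and its arguments are hypothetical.

```python
import math
import random

def run_learning(neighbors, theta, B, p, gamma, periods, t_max, seed=0):
    """Simulate Algorithm 1; returns the perceptions P[n][m] after `periods` periods."""
    rng = random.Random(seed)
    N, M = len(neighbors), len(theta)
    P = [[1.0 / M] * M for _ in range(N)]             # line 3: P^n_m(0) = 1/M
    for T in range(periods):
        mu = 1.0 / (200 + T)                          # smoothing factor from Section 6
        # line 7: each user draws a channel from its Boltzmann strategy (14)
        choice = []
        for n in range(N):
            peak = max(P[n])
            w = [math.exp(gamma * (x - peak)) for x in P[n]]
            choice.append(rng.choices(range(M), weights=w)[0])
        idle_slots = [0] * N
        captures = [0] * N
        rate_sum = [0.0] * N
        for _ in range(t_max):                        # lines 8-11: sense, contend, record
            idle = [rng.random() < theta[m] for m in range(M)]
            sends = [idle[choice[n]] and rng.random() < p[n] for n in range(N)]
            for n in range(N):
                m = choice[n]
                if not idle[m]:
                    continue
                idle_slots[n] += 1
                blocked = any(sends[i] and choice[i] == m for i in neighbors[n])
                if sends[n] and not blocked:
                    captures[n] += 1
                    rate_sum[n] += B[n][m]
        for n in range(N):                            # lines 12-14: estimate and update
            m = choice[n]
            theta_hat = idle_slots[n] / t_max
            g_hat = captures[n] / idle_slots[n] if idle_slots[n] else 0.0
            B_hat = rate_sum[n] / captures[n] if captures[n] else 0.0
            U_hat = theta_hat * B_hat * g_hat
            P[n][m] = (1.0 - mu) * P[n][m] + mu * U_hat
    return P

# Hypothetical instance: three users on a path graph, two channels.
neighbors = [[1], [0, 2], [1]]      # neighbors[n] = users that interfere with n
theta = [0.5, 0.5]
B = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
p = [0.5, 0.5, 0.5]
P = run_learning(neighbors, theta, B, p, gamma=1.0, periods=50, t_max=100)
```

For a directed interference graph, `neighbors[n]` would list only the users whose transmissions affect user `n`, so the same loop applies unchanged.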
Let $\Pr\{\mathcal{N}_n^m(a(T)) \mid P(T), a_n(T) = m\}$ denote the conditional probability that, given that the users' perceptions are $P(T)$ and user $n$ chooses channel $m$, the set of users that choose the same channel $m$ in user $n$'s neighborhood $\mathcal{N}_n$ is $\mathcal{N}_n^m(a(T)) \subseteq \mathcal{N}_n$. Since each user independently chooses a channel according to its mixed strategy $\sigma_n(T)$, the random set $\mathcal{N}_n^m(a(T))$ follows the binomial distribution of $|\mathcal{N}_n|$ independent non-homogeneous Bernoulli trials with the probability mass function

$$\Pr\{\mathcal{N}_n^m(a(T)) \mid P(T), a_n(T) = m\} = \prod_{i \in \mathcal{N}_n^m(a(T))} \sigma^i_m(T) \prod_{i \in \mathcal{N}_n \setminus \mathcal{N}_n^m(a(T))} \left(1 - \sigma^i_m(T)\right) = \prod_{i \in \mathcal{N}_n} \left(\sigma^i_m(T)\right)^{I_{\{a_i(T) = m\}}} \left(1 - \sigma^i_m(T)\right)^{1 - I_{\{a_i(T) = m\}}}, \tag{18}$$

where $I_{\{a_i(T) = m\}} = 1$ if user $i$ chooses channel $m$, and $I_{\{a_i(T) = m\}} = 0$ otherwise.

Since the update value $Z^n_m(T)$ depends on user $n$'s estimated payoff $\hat{U}_n(a(T))$ (which in turn depends on $\mathcal{N}_n^m(a(T))$), $Z^n_m(T)$ is also a random variable. The equations in (16) are hence stochastic difference equations, which are difficult to analyze directly. We thus focus on the analysis of their mean dynamics [12]. To proceed, we define the mapping from the perceptions $P(T)$ to the expected payoff of user $n$ choosing channel $m$ as $Q^n_m(P(T)) \triangleq E[U_n(a(T)) \mid P(T), a_n(T) = m]$. Here the expectation $E[\cdot]$ is taken with respect to the mixed strategies $\sigma(T)$ of all users (i.e., the perceptions $P(T)$ of all users due to (14)). We show that
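Lemma 3. For the distributed learning algorithm, if the temperature satisfies

$$\gamma < \frac{1}{2 \max_{m \in \mathcal{M}, n \in \mathcal{N}} \{\theta_m B^n_m\} \, \max_{n \in \mathcal{N}} \{|\mathcal{N}_n|\}}, \tag{19}$$

the mapping from the perceptions to the expected payoff $Q(P(T)) \triangleq (Q^n_m(P(T)), m \in \mathcal{M}, n \in \mathcal{N})$ forms a maximum-norm contraction.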
Note that the condition (19) is a sufficient condition to form a contraction mapping, which in turn is a sufficient condition for convergence. Simulation results show that a slightly larger γ may also lead to the convergence of the mapping. Based on the property of contraction mapping, there exists a fixed point P∗ such that Q(P∗) = P∗. By the theory of stochastic approximations [12], we show that the distributed learning algorithm also converges to the same limit point P∗.
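The right-hand side of (19) is straightforward to evaluate for a given instance. The helper below is a minimal sketch; the function name, the input layout, and the numbers are hypothetical, and it assumes at least one user has a neighbor so that the denominator is nonzero.

```python
def temperature_bound(theta, B, neighbor_counts):
    """Right-hand side of (19): 1 / (2 * max theta_m B^n_m * max |N_n|)."""
    max_rate = max(theta[m] * B[n][m]
                   for n in range(len(B)) for m in range(len(theta)))
    max_degree = max(neighbor_counts)   # assumed to be at least 1
    return 1.0 / (2.0 * max_rate * max_degree)

# Hypothetical instance: two channels, two users that interfere with each other.
theta = [0.5, 0.5]
B = [[1.0, 2.0], [0.5, 1.5]]
bound = temperature_bound(theta, B, neighbor_counts=[1, 1])
```

Because γ multiplies the perceptions in (14), the bound scales inversely with the units in which payoffs are measured: expressing the same rates in units that are ten times smaller makes the admissible γ ten times smaller.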
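Theorem 5. For the distributed learning algorithm, if the temperature $\gamma$ satisfies (19), $\sum_T \mu_T = \infty$ and $\sum_T \mu_T^2 < \infty$, then the sequence $\{P(T), \forall T \ge 0\}$ converges to the unique limit point $P^* \triangleq (P^{n*}_m, \forall m \in \mathcal{M}, n \in \mathcal{N})$ of the differential equations ($\forall m \in \mathcal{M}, n \in \mathcal{N}$)

$$\frac{dP^n_m(T)}{dT} = \sigma^n_m(T) \left( Q^n_m(P(T)) - P^n_m(T) \right), \tag{20}$$

with probability one. Further, the limit point $P^*$ satisfies

$$Q^n_m(P^*) = P^{n*}_m, \quad \forall m \in \mathcal{M}, n \in \mathcal{N}. \tag{21}$$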
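As a quick check of the two step-size conditions in Theorem 5, consider the smoothing factor $\mu_T = \frac{1}{200+T}$ that is used in the simulations of Section 6, with the period index starting at $T = 0$:

```latex
\begin{align*}
\sum_{T=0}^{\infty} \frac{1}{200+T} &= \sum_{k=200}^{\infty} \frac{1}{k} = \infty
  && \text{(tail of the harmonic series)},\\
\sum_{T=0}^{\infty} \frac{1}{(200+T)^2} &= \sum_{k=200}^{\infty} \frac{1}{k^2}
  \le \sum_{k=200}^{\infty} \frac{1}{k(k-1)}
  = \sum_{k=200}^{\infty} \left( \frac{1}{k-1} - \frac{1}{k} \right)
  = \frac{1}{199} < \infty .
\end{align*}
```

The first condition keeps the total step length unbounded, while the second makes the accumulated noise of the updates summable.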
We next explore the property of the equilibrium $P^*$ of the distributed learning algorithm. From Theorem 5, we see that

$$Q^n_m(P^*) = E[U_n(a(T)) \mid P^*, a_n(T) = m] = P^{n*}_m. \tag{22}$$

It means that the perception value $P^{n*}_m$ is an accurate estimate of the expected payoff in the equilibrium. Moreover, we show that the mixed strategy $\sigma^*$ is an approximate Nash equilibrium.
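Definition 8 (Approximate Nash Equilibrium [8]). A mixed strategy profile $\sigma = (\sigma_1, \ldots, \sigma_N)$ is a $\xi$-approximate Nash equilibrium if

$$U_n(\sigma_n, \sigma_{-n}) \ge \max_{\sigma_n'} U_n(\sigma_n', \sigma_{-n}) - \xi, \quad \forall n \in \mathcal{N},$$

where $U_n(\sigma_n, \sigma_{-n})$ denotes the expected payoff of player $n$ under mixed strategy profile $\sigma$, the maximum is taken over all mixed strategies $\sigma_n'$ of player $n$, and $\sigma_{-n}$ denotes the mixed strategy profile of the users other than player $n$.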
Here $\xi \ge 0$ is the gap from a (precise) mixed Nash equilibrium. For the distributed learning algorithm, we show that

Theorem 6. For the distributed learning algorithm, the mixed strategy $\sigma^*$ in the equilibrium $P^*$ is a $\xi$-approximate Nash equilibrium, with $\xi = \max_{n \in \mathcal{N}} \left\{ -\frac{1}{\gamma} \sum_{m=1}^{M} \sigma^{n*}_m \ln \sigma^{n*}_m \right\}$.
The gap ξ can be interpreted as the weighted entropy, which describes the randomness of the learning exploration. A larger ξ means worse learning performance. When each user adopts uniformly random access, the gap ξ reaches its maximum value and results in the worst learning performance. Theorems 5 and 6 together illustrate the trade-off between the convergence and the performance through the choice of γ. A small enough γ is required to explore the environment (so that users do not get stuck in channels with the current best payoffs) and to guarantee the convergence of distributed learning. If γ is too small, however, then the performance gap ξ is large due to over-exploration.
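The gap of Theorem 6 is simple to evaluate for a given strategy profile. The sketch below (the function name `nash_gap` and the two example profiles are illustrative assumptions) computes the entropy of each user's strategy, takes the largest one, and divides by γ.

```python
import math

def nash_gap(strategies, gamma):
    """xi of Theorem 6: largest per-user entropy of sigma_n, divided by gamma."""
    def entropy(sigma):
        # Terms with sigma_m = 0 contribute 0 (limit of x ln x as x -> 0).
        return -sum(s * math.log(s) for s in sigma if s > 0.0)
    return max(entropy(sigma) for sigma in strategies) / gamma

uniform = [[0.2] * 5]                      # one user, uniform over M = 5 channels
pure = [[1.0, 0.0, 0.0, 0.0, 0.0]]         # one user, always the first channel
gap_uniform = nash_gap(uniform, gamma=5.0)   # equals ln(5) / 5
gap_pure = nash_gap(pure, gamma=5.0)         # equals 0
```

For M channels the entropy is at most ln M, attained by the uniform strategy, so the gap never exceeds (ln M)/γ; this makes explicit why a larger γ tightens the approximation.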
Figure 6: Interference Graphs
6. NUMERICAL RESULTS
We now evaluate the proposed distributed learning algorithm by simulations. We consider a Rayleigh fading channel environment. The data rate of user $n$ on an idle channel $m$ is given by the Shannon capacity, i.e.,

$$b^n_m(t) = V_m \log_2 \left( 1 + \frac{\zeta_n g^n_m(t)}{N_0} \right),$$

where $V_m$ is the bandwidth of channel $m$, $\zeta_n$ is the power adopted by user $n$, $N_0$ is the noise power, and $g^n_m(t)$ is the channel gain, which is a random variable that follows the exponential distribution with the mean $\bar{g}^n_m$. In the following simulations, we set $V_m = 10$ MHz, $N_0 = -100$ dBm, and $\zeta_n = 100$ mW. By choosing different mean channel gains $\bar{g}^n_m$, we have different mean data rates $B^n_m = E[b^n_m(t)]$ for different channels and users. For simplicity, we set the channel state $S_m(t)$ as an i.i.d. Bernoulli random variable with the idle probability $\theta_m = 0.5$.

We consider a network of $M = 5$ channels and $N = 9$ users with four different interference graphs (see Figure 6). Graphs (a) and (b) are undirected, and Graphs (c) and (d) are directed. Let $\vec{B}_n = \{B^n_1, \ldots, B^n_M\}$ be the mean data rate vector of user $n$. We set $\vec{B}_1 = \vec{B}_2 = \vec{B}_3 = \{2, 6, 16, 20, 30\}$ Mbps, $\vec{B}_4 = \vec{B}_5 = \vec{B}_6 = \{4, 12, 32, 40, 60\}$ Mbps, and $\vec{B}_7 = \vec{B}_8 = \vec{B}_9 = \{10, 30, 80, 100, 150\}$ Mbps. We implement both the random backoff and Aloha mechanisms for channel contention. For the random backoff mechanism, we set the number of backoff mini-slots in a time slot $\lambda_{\max} = 10$. For the Aloha mechanism, the channel contention probabilities of the users are $\{0.7, 0.7, 0.7, 0.5, 0.5, 0.5, 0.3, 0.3, 0.3\}$, respectively. Notice that in this study we focus on channel choices instead of the adjustment of contention probabilities.

For the distributed learning algorithm initialization, we set the length of each decision period $t_{\max} = 200$, which can achieve a good estimation of the mean data rate. We set the smoothing factor $\mu_T = \frac{1}{200+T}$, which satisfies the conditions $\sum_T \mu_T = \infty$ and $\sum_T \mu_T^2 < \infty$.
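The data rate model of this section can be reproduced with a few lines of Python. The bandwidth, noise power, and transmit power below are the values stated above; the mean channel gain passed to `mean_rate` is a hypothetical value, since the section does not list the gains that produce the mean data rate vectors, and the function names are illustrative.

```python
import math
import random

def dbm_to_watt(dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** (dbm / 10.0) / 1000.0

def shannon_rate(gain, bandwidth_hz=10e6, power_w=0.1, noise_w=dbm_to_watt(-100.0)):
    """b = V log2(1 + zeta * g / N0), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + power_w * gain / noise_w)

def mean_rate(mean_gain, samples=20000, seed=0):
    """Monte Carlo estimate of B = E[b] for an exponentially distributed gain."""
    rng = random.Random(seed)
    # expovariate takes the rate parameter, i.e. 1 / mean.
    draws = (rng.expovariate(1.0 / mean_gain) for _ in range(samples))
    return sum(shannon_rate(g) for g in draws) / samples

rate_at_unit_snr = shannon_rate(1e-12)   # gain chosen so that zeta * g / N0 = 1
B_estimate = mean_rate(1e-12)            # hypothetical mean gain
```

A gain of 1e-12 gives a signal-to-noise ratio of 1 and therefore a rate equal to the bandwidth in bit/s; the averaged rate under fading is lower than the rate at the mean gain because the logarithm is concave.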
We first evaluate the distributed learning algorithm with different choices of temperature γ on the interference graph (d) in Figure 6. We run the learning algorithm sufficiently long until the time average system throughput does not change. The result in Figure 7 verifies the trade-off between the convergence and performance, and demonstrates that a proper temperature γ can offer the best performance. When γ is small, the gap ξ in Theorem 6 can be large. When γ is very large, the algorithm may get stuck in a local optimum and the performance is again negatively affected. We set γ = 5.0 in the following simulations since it achieves good system performance in both random backoff and Aloha mechanisms as in Figure 7.

Figure 7: The system performance of the distributed learning algorithm with different temperature γ (x-axis: temperature γ, from 0.001 to 40.0; y-axis: system average throughput; curves: random backoff mechanism and Aloha mechanism)

Figure 8: Users' average throughput on interference graph (d) with random backoff mechanism (x-axis: iterations; y-axis: users' average throughput; one curve per user)

Figure 9: Users' average throughput on interference graph (d) with Aloha mechanism (x-axis: iterations; y-axis: users' average throughput; one curve per user)

We then look at the learning dynamics. Figures 8 and 9 show the dynamics of users' time average throughputs with random backoff and Aloha mechanisms, respectively. These results demonstrate the convergence of the distributed learning algorithm in both mechanisms.

To benchmark the performance of the distributed learning algorithm, we compare it with the solutions obtained by the following two algorithms:
• Random Access: each user chooses a channel to access purely randomly.
• Centralized Optimization: the solution obtained by solving the centralized global optimization problem $\max_{a} \sum_{n \in \mathcal{N}} U_n(a)$.
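Both benchmarks can be sketched generically for any throughput function. In the sketch below, exhaustive search stands in for the centralized optimization (the paper does not say how that problem is solved), and `toy_throughput` is a hypothetical payoff in which a user shares its channel rate equally with interfering neighbors on the same channel; neither is the authors' model.

```python
import itertools
import random

def centralized_optimum(num_users, num_channels, throughput):
    """Exhaustively maximize sum_n U_n(a) over all channel profiles a."""
    best_profile, best_value = None, float("-inf")
    for profile in itertools.product(range(num_channels), repeat=num_users):
        value = sum(throughput(n, profile) for n in range(num_users))
        if value > best_value:
            best_profile, best_value = profile, value
    return best_profile, best_value

def random_access_average(num_users, num_channels, throughput, runs=2000, seed=0):
    """Average system throughput when every user picks a channel uniformly at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        profile = tuple(rng.randrange(num_channels) for _ in range(num_users))
        total += sum(throughput(n, profile) for n in range(num_users))
    return total / runs

# Hypothetical instance: three users on a path graph, two channels.
neighbors = [[1], [0, 2], [1]]
B = [1.0, 2.0]

def toy_throughput(n, profile):
    sharing = sum(1 for i in neighbors[n] if profile[i] == profile[n])
    return B[profile[n]] / (1 + sharing)

best_profile, best_value = centralized_optimum(3, 2, toy_throughput)
random_value = random_access_average(3, 2, toy_throughput)
```

In this toy instance the optimum places the two end users on the better channel and the middle user on the other one, which is exactly the spatial reuse that a complete interference graph would rule out. For the setting of this section (M = 5, N = 9) the search space has 5^9 = 1,953,125 profiles, which is still enumerable but grows exponentially with the number of users.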
We implement these algorithms together with the distributed learning algorithm on the four types of interference graphs in Figure 6. The results are shown in Figures 10 and 11. For the random backoff (Aloha, respectively) mechanism, we see that the distributed learning algorithm achieves up to 100% (65%, respectively) performance improvement over the random access algorithm. Compared with the centralized optimal solution, the performance loss of the distributed learning in the full-interference graph (b) is 28% (34%, respectively). Such performance loss is not due to the algorithm design; instead it is due to the selfish nature of the users. In the partial-interference graphs (a), (c), and (d), the performance loss can be further reduced to less than 10% (17%, respectively). This shows that the negative impact of users' selfish behavior is smaller when users can share the spectrum more efficiently through spatial reuse.

Figure 10: Comparison of distributed learning, random access, and centralized optimization with the random backoff mechanism

Figure 11: Comparison of distributed learning, random access, and centralized optimization with the Aloha mechanism
7. CONCLUSION
In this paper, we explore the spatial aspect of distributed spectrum sharing, and propose a framework of spatial spectrum access games on directed interference graphs. We investigate the critical issue of the existence of pure Nash equilibria, and develop a distributed learning algorithm converging to an approximate mixed Nash equilibrium for any spatial spectrum access game. Numerical results show that the algorithm is efficient and has a significant performance gain over a random access algorithm that does not take the spatial effect into consideration.

For future work, we are going to design distributed algorithms that maximize the system-wide throughput.
ACKNOWLEDGMENTS
This work is supported by the General Research Funds (Project Number 412509, 412710, and 412511) established under the University Grants Committee of the Hong Kong Special Administrative Region, China.
8. REFERENCES
[1] I. Akyildiz, W. Lee, M. Vuran, and S. Mohanty. Next generation/dynamic spectrum access/cognitive radio wireless networks: a survey. Computer Networks, 50(13):2127–2159, 2006.
[2] A. Anandkumar, N. Michael, and A. Tang. Opportunistic spectrum access with multiple users: learning under competition. In Proc. of IEEE INFOCOM, San Diego, USA, March 2010.
[3] N. Biggs, E. Lloyd, and R. Wilson. Graph Theory. Oxford University Press, 1986.
[4] V. Bilo, A. Fanelli, M. Flammini, and L. Moscardelli. Graphical congestion games with linear latencies. In Proc. of ACM SPAA, Munich, Germany, June 2008.
[5] X. Chen and J. Huang. Spatial spectrum access game: Nash equilibria and distributed learning. Technical report, The Chinese University of Hong Kong, 2012. Available at http://jianwei.ie.cuhk.edu.hk/publication/MobihocTech.pdf.
[6] X. Chen and J. Huang. Evolutionarily stable open spectrum access in a many-users regime. In Proc. of IEEE GLOBECOM, Houston, USA, Dec. 2011.
[7] X. Chen and J. Huang. Imitative spectrum access. In Proc. of IEEE WiOpt, Paderborn, Germany, May 2012.
[8] C. Daskalakis, A. Mehta, and C. Papadimitriou. A note on approximate Nash equilibria. Theoretical Computer Science, 410(17):1581–1588, 2009.
[9] M. Félegyházi, M. Cagalj, and J.-P. Hubaux. Efficient MAC in cognitive radio systems: A game-theoretic approach. IEEE Transactions on Wireless Communications, 8:1984–1995, 2009.
[10] S. Govindan and R. Wilson. A global Newton method to compute Nash equilibria. Journal of Economic Theory, 110:65–86, 2003.
[11] Z. Han, C. Pandana, and K. J. R. Liu. Distributive opportunistic spectrum access for cognitive radio using correlated equilibrium and no-regret learning. In Proc. of IEEE WCNC, Hong Kong, March 2007.
[12] H. Kushner and G. Yin. Stochastic Approximation and Recursive Algorithms and Applications. Springer-Verlag, New York, 2003.
[13] K. Liu and Q. Zhao. Decentralized multi-armed bandit with multiple distributed players. In Proc. of IEEE IAT, San Diego, USA, Jan. 2010.
[14] E. Melo. Congestion pricing and learning in traffic network games. Journal of Public Economic Theory, 13(3):351–367, 2011.
[15] I. Milchtaich. Congestion games with player-specific payoff functions. Games and Economic Behavior, 13:111–124, 1996.
[16] D. Monderer and L. S. Shapley. Potential games. Games and Economic Behavior, 14:124–143, 1996.
[17] J. Nash. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36:48–49, 1950.
[18] N. Nie and C. Comaniciu. Adaptive channel allocation spectrum etiquette for cognitive radio networks. In Proc. of IEEE DySPAN, Baltimore, USA, Nov. 2005.
[19] D. Niyato and E. Hossain. Competitive spectrum sharing in cognitive radio networks: a dynamic game approach. IEEE Transactions on Wireless Communications, 7:2651–2660, 2008.
[20] D. Shah and J. Shin. Dynamics in congestion games. ACM SIGMETRICS Performance Evaluation Review, 38(1):107–118, 2010.
[21] R. Southwell and J. Huang. Convergence dynamics of resource-homogeneous congestion game. In Proc. of ICST GameNets, Shanghai, China, April 2011.
[22] C. Tekin, M. Liu, R. Southwell, J. Huang, and S. Ahmad. Atomic congestion games on graphs and their applications in networking. To appear in IEEE/ACM Transactions on Networking, 2012.
[23] B. Wang, Y. Wu, and K. R. Liu. Game theory for cognitive radio networks: An overview. Computer Networks, 54:2537–2561, 2010.
[24] M. Weiss, M. Al-Tamaimi, and L. Cui. Dynamic geospatial spectrum modelling: taxonomy, options and consequences. In Proc. of TPRC Conference, Fairfax, USA, Sep. 2010.