CITY UNIVERSITY OF HONG KONG
Energy-Efficient Heuristics for Job Assignment in Server Farms
Submitted to the Department of Electronic Engineering
in Partial Fulfillment of the Requirements
for the Degree of Doctor of Philosophy
by
FU Jing
June 2016
Abstract
The rapidly increasing energy consumption in modern server farms has driven
studies to improve their energy efficiency. In this thesis, we analyze the stochastic
job assignment in a server farm comprising servers with various speeds, power
consumptions and buffer sizes. Ultimately, we seek to optimize the energy efficiency
of server farms by controlling the carried load on the networked servers.
We consider two types of assignment policies, with and without jockeying. The
jockeying in a multi-queue system indicates reassignment of incomplete jobs. The
job assignment policies considered here select a server with a vacancy to
accept newly arriving jobs. In the jockeying case, the reassignment of jobs that are
being served in the system will also be considered by a job assignment policy. The
energy efficiency of a server farm is defined as the ratio of its job throughput to its
power consumption and represents the useful work per unit cost. The maximization
of the ratio of job throughput to power consumption can be modeled as a (semi)
Markov decision process, and an optimal policy is obtained by conventional dynamic
programming techniques. However, the algorithm for an optimal solution in our
multi-queue systems is constrained by its computational complexity, and is imple-
mented as a baseline policy when comparing the performance of policies that are
computationally feasible in a server farm with tens of thousands of servers.
For the jockeying case, we propose a robust job assignment policy, called E*, to
maximize the energy efficiency of a server farm as defined earlier. We model
the server farm as a system of parallel finite-buffer processor-sharing queues with
heterogeneous server speeds and energy consumption rates. We devise E* as an
insensitive policy so that the stationary distribution of the number of jobs in the
system depends on the job size distribution only through its mean. We provide a
rigorous analysis of E* and compare it with a baseline approach, known as most
energy-efficient server first (MEESF), that greedily chooses the most energy-efficient
servers for job assignment. We show that E* always achieves a higher job throughput
than MEESF, and derive realistic conditions under which E* is guaranteed to
outperform MEESF in terms of energy efficiency. Extensive numerical results are
presented to illustrate that E* can improve energy efficiency by up to 100%.
We also study the problem of job assignment in a large-scale realistically-
dimensioned server farm comprising multiple processor-sharing servers with differ-
ent service rates, energy consumption rates, and buffer sizes. Our aim is to optimize
the energy efficiency of such a server farm by effectively controlling carried load
on networked servers. To this end, we propose a job assignment policy, called Most
energy-efficient available server first Accounting for Idle Power (MAIP), which is
both scalable and near optimal. MAIP focuses on reducing the productive power used
to support the processing service rate. Using the framework of semi-Markov decision
process we show that, with exponentially distributed job sizes, MAIP is equivalent
to the well-known Whittle’s index policy. This equivalence and the methodology of
Weber and Weiss enable us to prove that, in server farms where a loss of jobs happens
if and only if all buffers are full, MAIP is asymptotically optimal, under certain
conditions, as the number of servers tends to infinity, as is effectively the case in
a real server farm. Through extensive numerical simulations,
we demonstrate the effectiveness of MAIP and its robustness to different job-size
distributions, and observe that significant improvement in energy efficiency can be
achieved by utilizing knowledge of energy consumption rate of idle servers.
CITY UNIVERSITY OF HONG KONG Qualifying Panel and Examination Panel
Surname: FU
First Name: Jing
Degree: PhD
College/Department: Department of Electronic Engineering
The Qualifying Panel of the above student is composed of:

Supervisor(s):
Dr. WONG Wing Ming Eric, Department of Electronic Engineering, City University of Hong Kong

Co-Supervisor(s):
Prof. ZUKERMAN Moshe, Department of Electronic Engineering, City University of Hong Kong

Qualifying Panel Member(s):
Dr. CHAN Chi Hung Sammy, Department of Electronic Engineering, City University of Hong Kong
Prof. CHEN Guanrong, Department of Electronic Engineering, City University of Hong Kong

This thesis has been examined and approved by the following examiners:
Dr. CHAN Chi Hung Sammy, Department of Electronic Engineering, City University of Hong Kong
Dr. YEUNG Kai Hau Alan, Department of Electronic Engineering, City University of Hong Kong
Dr. WONG Wing Ming Eric, Department of Electronic Engineering, City University of Hong Kong
Prof. CHAN Shueng Han, Department of Computer Science and Engineering, Hong Kong University of Science & Technology
Acknowledgements
I extend my sincere thanks to my supervisor, Dr. Eric W. M. Wong, who guided me
into this area with great patience and optimistically walked me through those tough
areas in my Ph.D. study. I offer my heartfelt gratitude to my co-supervisor, Prof.
Moshe Zukerman. I could not finish my study without his invaluable support and
supervision. I express my sincere gratitude to Prof. Bill Moran for his innovative
inspirations and insightful comments, which amazingly broadened my horizons.
I offer my deep gratefulness to Dr. Jun Guo for all his time and efforts in helping
and teaching me, and I have learnt so much from him.
I am deeply grateful to my family and all my friends, who have given me great
courage in pursuing my research and my dream.
Table of contents
List of figures xv
List of tables xvii
Summary of major symbols and acronyms xix
1 Introduction 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Maximization of the Ratio of Long-Run Expected Reward to Long-Run
Expected Cost 11
2.1 Server Farm Model . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Markov/Semi-Markov Decision Process . . . . . . . . . . . . . . . 13
2.3 Long-Run Average Reward . . . . . . . . . . . . . . . . . . . . . . 15
2.4 Maximization of Average Reward divided by Average Cost . . . . . 16
2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3 Job-Assignment Heuristics with Jockeying 21
3.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Insensitive Conditions with Jockeying . . . . . . . . . . . . . . . . 29
3.3.1 Adaptation of symmetric queue . . . . . . . . . . . . . . . 30
3.4 Optimality for Insensitive Job Assignments . . . . . . . . . . . . . 33
3.4.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4.2 Polynomial Computational Complexity . . . . . . . . . . . 35
3.5 Jockeying Job-Assignment Heuristics . . . . . . . . . . . . . . . . 38
3.5.1 MEESF . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.2 E* . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.3 Rate-Matching . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5.4 Insensitive MEESF Policy . . . . . . . . . . . . . . . . . . 40
3.5.5 Insensitive E* Policy . . . . . . . . . . . . . . . . . . . 48
3.5.6 Approximate E* . . . . . . . . . . . . . . . . . . . . . 65
3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4 Job Assignment Heuristics without Jockeying 69
4.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.2 Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.3 Job Assignment Policy: MAIP . . . . . . . . . . . . . . . . . . . . 81
4.4 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.4.1 Stochastic Process . . . . . . . . . . . . . . . . . . . . . . 83
4.4.2 Whittle’s Index . . . . . . . . . . . . . . . . . . . . . . . . 85
4.4.3 Indexability . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.4.4 Asymptotic optimality . . . . . . . . . . . . . . . . . . . . 90
4.5 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.5.1 Effect of Idle Power . . . . . . . . . . . . . . . . . . . . . 96
4.5.2 Effect of Jockeying Cost . . . . . . . . . . . . . . . . . . . 100
4.5.3 Sensitivity to the Shape of Job-Size Distributions . . . . . . 103
4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5 Conclusions 107
References 111
Appendix A Theoretical results for Jockeying Job-Assignments 125
A.1 An Example for Mapping Qf . . . . . . . . . . . . . . . . . . . . . 125
A.2 Pseudo-Code of the Algorithm for an Insensitive Optimal Policy . . 126
A.3 Proof of Proposition 3.1 . . . . . . . . . . . . . . . . . . . . . . . . 127
A.4 Proof of Lemma 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . 133
A.5 Proof of Lemma 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . 135
A.6 Proof of Lemma 3.4 . . . . . . . . . . . . . . . . . . . . . . . . . . 135
A.7 Proof of Proposition 3.7 . . . . . . . . . . . . . . . . . . . . . . . . 136
A.8 Lemmas for Proposition 3.6 . . . . . . . . . . . . . . . . . . . . . 139
A.9 Proof of Proposition 3.6 . . . . . . . . . . . . . . . . . . . . . . . . 146
A.10 Proof of Proposition 3.8 . . . . . . . . . . . . . . . . . . . . . . . . 151
Appendix B Theoretical Proofs for Non-Jockeying Job-Assignments 161
B.1 Proof for Proposition 4.1 . . . . . . . . . . . . . . . . . . . . . . . 161
B.2 Proof for Proposition 4.2 . . . . . . . . . . . . . . . . . . . . . . . 164
B.3 Proof for Proposition 4.3 . . . . . . . . . . . . . . . . . . . . . . . 165
B.4 Consequences of the averaging principle . . . . . . . . . . . . . . . 166
List of figures
3.1 State transition diagram of the logically-combined queue under any
insensitive jockeying policy φ ∈ F_s. . . . . . . . . . . . . . . . . 32
3.2 Illustration of optimizing K̂ for the E* policy. . . . . . . . . . . . 41
3.3 Energy efficiency of E* and MEESF versus arrival rate. . . . . . . 45
3.4 Energy efficiency of E* and MEESF versus buffer size. . . . . . . 46
3.5 Relative difference of energy efficiency of E* to MEESF. . . . . . 47
3.6 Relative difference of energy efficiency of E* to MEESF. . . . . . 48
3.7 Relative difference of power consumption of E* to MEESF. . . . . 49
3.8 Virtual probability of service rate. . . . . . . . . . . . . . . . . . 61
3.9 Relative difference of energy efficiency of E* to RM. . . . . . . . 65
3.10 Difference of the value of K̂ of E* to RM. . . . . . . . . . . . . . 67
4.1 Performance comparison with respect to normalized system load ρ.
(a) Relative difference of L^MAIP/E^MAIP to L^MNIP/E^MNIP. (b) Job
throughput. (c) Relative difference of E^MAIP to E^MNIP. . . . . . . 97
4.2 Performance comparison with respect to number of servers K. (a)
Energy efficiency. (b) Job throughput. (c) Power consumption. . . . 98
4.3 Cumulative distribution of relative difference of L^MAIP/E^MAIP to
L^MNIP/E^MNIP. (a) ρ = 0.4. (b) ρ = 0.6. (c) ρ = 0.8. . . . . . . . 99
4.4 Performance comparison in terms of the energy efficiency with
respect to the number of servers K. (a) Δ = 0. (b) Δ = 0.0005. (c)
Δ = 0.01. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.5 Performance comparison in terms of the job throughput with respect
to the number of servers K. (a) Δ = 0. (b) Δ = 0.0005. (c) Δ = 0.01. 103
4.6 Cumulative distribution of relative difference of L^{MAIP,F}/E^{MAIP,F}
to L^{MAIP,Exponential}/E^{MAIP,Exponential}. (a) ρ = 0.4. (b) ρ = 0.6. (c)
ρ = 0.8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
A.1 An example for multi-queue system. . . . . . . . . . . . . . . . . . 126
A.2 An example for multi-queue system. . . . . . . . . . . . . . . . . . 126
A.3 An example for arrival event. (a) Multi-queue system. (b) Logically
combined queue. . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
A.4 An example for jockeying upon arrival event. (a) Multi-queue system.
(b) Logically combined queue. . . . . . . . . . . . . . . . . . . . . 127
A.5 An example for departure event. (a) Multi-queue system. (b) Logi-
cally combined queue. . . . . . . . . . . . . . . . . . . . . . . . . 128
A.6 An example for jockeying upon departure event. (a) Multi-queue
system. (b) Logically combined queue. . . . . . . . . . . . . . . . . 128
List of tables
3.1 Summary of major symbols . . . . . . . . . . . . . . . . . . . . . . 28
3.2 The running times of the algorithm for I-J-OPT . . . . . . . . . . . 37
3.3 A Counterexample to the Instantaneous Optimality of MEESF . . . 44
4.1 Summary of Frequently Used Symbols . . . . . . . . . . . . . . . . 80
Summary of major symbols and
acronyms
Roman Symbols
a^φ_j(n_j) indicator for server j in state n_j under policy φ: server j is tagged if a^φ_j(n_j) = 1; untagged if a^φ_j(n_j) = 0
a^φ(t) random variable representing the action taken at time t under policy φ
B_j buffer size of server j
e scaler for e-revised reward
g scaler for g-revised reward
j server label
K number of servers
K̂ parameter for the E* policy
n vector of states (n_1, n_2, ..., n_K)
n_j state of server j, representing the number of jobs in server j
R set of real numbers
R_j reward rate function for server j: a mapping from N_j to R_+
R_j(n_j) reward rate of server j in state n_j
R_+ set of positive real numbers
R vector of reward rate functions (R_1, R_2, ..., R_K)
B̃_j aggregate buffer size of the first j servers
X^φ_j(t) random variable representing the number of jobs in server j at time t under policy φ
Greek Symbols
ε_j energy consumption rate of server j when it is busy
ε⁰_j energy consumption rate of server j when it is idle
κ parameter for the extended E* policy
λ average arrival rate
μ_j service rate of server j
ν*_j(n_j) index value for server j in state n_j
Φ set of all stationary job assignment policies without jockeying
φ job assignment policy
Φ_J set of all stationary job assignment policies with jockeying
g^φ(R) long-run average reward of the multi-queue system with reward rate function vector R under policy φ
ρ normalized offered traffic
ε̃_K̂ aggregated energy consumption rate of the virtual server under policy E* with parameter K̂
μ̃_K̂ aggregated service rate of the virtual server under policy E* with parameter K̂
Other Symbols
E^φ long-run average energy consumption rate (power consumption) under policy φ
K (calligraphic) set of servers
L^φ long-run average throughput (job throughput) under policy φ
N set of all state vectors n
N_j set of all states of server j
N^{0,1}_j set of controllable states of server j
N^{0}_j set of uncontrollable states of server j
R_j (calligraphic) set of all reward rate functions for server j
Acronyms / Abbreviations
CDF cumulative distribution function
E* E* policy
FIFO first in, first out
I-J-OPT optimal solution among all insensitive jockeying policies
JSQ Join the Shortest Queue policy
MAIP Most energy-efficient Available server first accounting for Idle Power policy
MEESF Most Energy-Efficient Server First policy
MNIP Most energy-efficient available server first Neglecting Idle Power policy
PDF probability density function
PS processor sharing
QoS quality of service
RM Rate Matching policy
Chapter 1
Introduction
1.1 Introduction
Data centers have become essential to the functioning of virtually every sector of a
modern economy [1]. Server farms in data centers are known for their massive power
consumption [2]. The energy efficiency of server farms is an important concern when
considering greenhouse gas emissions. Driven by the green datacenter initiative to
facilitate a low-carbon economy in the information age [3], there are strong incentives
for managing the power consumption of server farms while maintaining acceptable
levels of performance [4].
Also, the data-center industry, with more than 500 thousand data centers world-
wide in 2011 [5], has been driven by dramatically increasing Internet traffic. For
example, global IP traffic is projected to exceed 1.1 zettabytes annually by 2016 [6].
An estimated 91 billion kWh of electricity was consumed by U.S. data centers in
2013, and annual consumption is growing: by 2020 it is projected to cost $13 billion
annually in electricity bills and to emit nearly 100 million metric tons of carbon
pollution per year [7]. Computing servers account for the major portion of the energy
consumption of data centers [8] .
A standard model for a data center is that of several server farms [1], a structure
that is also suitable for other business models including cloud computing [9].
Our aim here is to describe optimal scheduling/dispatching strategies for incoming
requests in order to improve energy efficiency.
This has been a topic of interest for some time, with approaches such as speed
scaling, which minimizes server cost by controlling server speed(s) [10–16],
right-sizing the server pool, which powers servers on/off according to the system's
workload [17–19], and resource allocation methodologies that distribute the power
budget for energy conservation [20]. Our approach is in line with [21, 22] and
investigates the problem of stationary job assignment in a server farm. This approach provides
a way for optimizing the power consumption of server farms by controlling the
carried load on the networked servers as a function of the (fixed) server speeds and
energy consumption rates. It can also be combined with speed scaling at each server
for local fine-tuning.
Rapid improvements in computer hardware have resulted in server farms with
a range of different computer resources (heterogeneous servers) being deployed in
modern data centers [23]. This heterogeneity significantly complicates the optimiza-
tion, since each server needs to be considered individually. Here we exhibit energy
efficiency improvements by exploiting this heterogeneity by means of scalable
job-assignment policies that are applicable to real server farms with tens of thousands
of servers.
As defined in [21, 22], we regard the ratio of the long-run average throughput
divided by the expected energy consumption rate as our objective function, hereafter
referred to as the energy efficiency of a server farm. This objective function can be
justified as the amount of useful work (e.g., data rate, throughput) per watt.
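In the notation of the symbol list, this objective can be stated compactly: for a job assignment policy φ, the energy efficiency is the long-run average throughput divided by the long-run average power consumption. The symbol η below is introduced only for illustration and is not part of the thesis notation:

```latex
\eta^{\phi} \;=\; \frac{L^{\phi}}{E^{\phi}}
```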
In Chapter 2, for later analysis, we build up connections between the expected
cumulative reward and the long-run average reward per unit cost, following the idea
in [21].
In Chapters 3 and 4, we analyze the processor sharing (PS) discipline on each
queue, where all jobs on the same queue share the processing capacity and are
served at the same speed. The PS discipline avoids unfair delays for jobs which
face extremely large jobs ahead of them; as a result, PS is suitable for modeling web
server farms [24–26], where job-size distributions exhibit high variability [27]. In
communication systems, broader applications for PS queues have been studied, e.g.,
[28, 29].
In Chapter 3, we focus on job assignment policies that allow jockeying. When
jockeying is permitted, jobs can be reassigned to any server with buffer vacancies
at any time before they are completed. Assignment with jockeying provides more
freedom and is well suited to server farms where the servers are collocated in a
single physical center and can use, e.g., shared DRAM storage [30] or flash storage
[31]. Jockeying policies in general can significantly improve system performance.
In addition, they are scalable in computation and hence make resource optimization
more tractable.
For such a problem, a straightforward heuristic approach is to greedily choose the
most energy-efficient servers for job assignment. It can be shown under certain con-
ditions that this approach, which we call most energy-efficient server first (MEESF)
in Chapter 3, maximizes the ratio of instantaneous job departure rate to instantaneous
energy consumption rate. In general, however, this is not the case. We shall see that
MEESF does not necessarily maximize the ratio of long-run average job departure
rate (i.e., job throughput) to long-run average energy consumption rate (i.e., power
consumption). This observation motivates us to design a more robust heuristic policy
that improves the energy efficiency of the system while nevertheless achieving a
higher job throughput than can be achieved with MEESF.
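For concreteness, the greedy choice underlying MEESF can be sketched as follows. This is an illustrative sketch only: the field names `mu`, `eps` and `buf` are hypothetical, and ranking servers by `mu/eps` (service rate per watt) is one plausible reading of "most energy-efficient first"; the precise definition is given in Chapter 3.

```python
def meesf_assign(servers, jobs_in_server):
    """Greedy MEESF sketch: send a new job to the most energy-efficient
    server that still has a buffer vacancy.

    `servers` is a list of dicts with hypothetical fields `mu` (service
    rate), `eps` (busy energy consumption rate) and `buf` (buffer size);
    a server's energy efficiency is taken here as mu/eps, in line with
    the throughput-per-watt objective.
    """
    # Rank server indices from most to least energy-efficient.
    ranked = sorted(range(len(servers)),
                    key=lambda j: servers[j]["mu"] / servers[j]["eps"],
                    reverse=True)
    for j in ranked:
        if jobs_in_server[j] < servers[j]["buf"]:
            return j          # first efficient server with a vacancy
    return None               # all buffers full: the job is blocked
```

As the chapter notes, this greedy rule maximizes an instantaneous ratio under certain conditions but not, in general, the long-run ratio of throughput to power.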
In Chapter 4, we study a single server farm model with a large number of
heterogeneous servers where jockeying is not allowed. The jockeying case discussed
in [21, 22] is more appropriate for a localized server farm, in which the cost of
jockeying actions (e.g. live-migration) is negligible. For server farms with significant
jockeying costs, a simple scalable job-assignment policy without jockeying is more
attractive. Policies in [21, 22] assume, in addition, that power consumption on the
idle servers is negligible; this too is an unrealistic assumption for some computing
environments. Here, we consider a more realistic system in which idle servers do not
necessarily have negligible power consumption [32]. A key feature of our approach
is to model the problem as an instance of the Restless Multi-Armed Bandit Problem
(RMABP) [33] in which, at each epoch, a policy chooses a server to be tagged for
a new job assignment (other servers are said to be untagged). The RMABP has
been proved to be PSPACE-hard [34], and this has led to studies in scalable and
near-optimal approximations, such as index policies. An index policy selects a set of
tagged servers at any epoch based on their state-dependent indices.
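The selection step of such an index policy can be sketched generically as below; `index_fn` and `controllable` are placeholders for the state-dependent indices and controllable-state sets developed in Chapter 4, not definitions taken from it.

```python
def index_policy_choice(states, index_fn, controllable):
    """Generic index-policy sketch for the RMABP view of job assignment:
    among servers whose current state is controllable (e.g., has a buffer
    vacancy), tag the server with the highest state-dependent index.

    `states[j]` is the state n_j of server j; `index_fn(j, n_j)` returns
    its index value; `controllable(j, n_j)` says whether server j may be
    tagged in state n_j.
    """
    candidates = [j for j, n in enumerate(states) if controllable(j, n)]
    if not candidates:
        return None           # no controllable server: no assignment
    return max(candidates, key=lambda j: index_fn(j, states[j]))
```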
We assume that a server farm (queueing system) cannot turn away a job if it has
buffer space available. When a farm has some inefficient servers, rejection of some
jobs might save energy if only those servers are available, but this is not permitted.
This might occur, for instance, when a server farm vendor is unable to replace all
older servers simultaneously, but legacy inefficient servers might be needed to meet
a service level agreement. In other words, the constraint on the number of tagged
servers in conventional RMABP has to be replaced by a constraint on the number of
tagged servers in controllable states. To the best of our knowledge, no theoretical
work has been published for the asymptotic optimality of an index policy for a
multi-server system with finite buffer size, where loss of jobs happens if and only
if all servers (queues) are full. Buffering spill-over creates dependencies between
servers, and requires us to postulate uncontrollable states [35].
1.2 Background
Server farm management has been studied for many years, with applications includ-
ing data centers, cloud computing, large scale storage system, and network resource
allocation. These problems are modeled as multi-server (queue) systems, where each
server has limited quantities of resources (e.g., bandwidth, memory, positions
for parallel tasks). The resources used by jobs (tasks) must be paid for, and
rewards are obtained from completed jobs.
Data center vendors must balance operating costs, e.g., energy consumption,
against Service Level Agreement (SLA) requirements, e.g., throughput and response time.
For interactive applications that are sensitive to their response time, some researchers
have considered minimizing the average delay of a server farm model with homoge-
neous servers (queues) [18, 36].
With an eye towards optimizing response time in heterogeneous environments,
experimental work on powering servers on/off has been conducted to improve
energy proportionality when system utilization is low [37]. One line of work proposed
to minimize the sum of weighted delay and power consumption in geographically
distributed data centers [38], with geographically varying electricity prices and
setup delays [39]; other work allows service rates to change at a finer time granularity
than decisions on the number of active servers [40], or accounts for multi-core
processors [41]. In particular, in [39–42], the authors formulated the problem
as a convex optimization problem; the papers [39–41] assume a convex
relationship between the service rate and the power consumption of a server, and
[40] gave a theoretical bound on the deviation of the heuristic policy's performance
from an optimal solution.
Another approach to improve efficiency in server farms is to minimize the energy
consumption with guaranteed average response time. In [43], three-tier web clusters
with different servers in each tier were considered using Mixed Integer Non-Linear
Programming (MINLP). In [44], a methodology based on convex optimization and
multi-step resource usage prediction was studied under geographically varying
electricity prices; and, in [45], scheduling in a multi-server model with no waiting
room in each server was analyzed. In [46], the authors used Pareto optimization to
consider both the case that minimizes power consumption subject to bounded
response time and the case that minimizes response time subject to a given power
budget.
Other experimental work on response time in heterogeneous environments
includes improving server utilization by forecasting two classes of workload on
each computing infrastructure [47]. Khazaei et al. in [48] analyzed the
case of generally distributed service times for user requests (tasks) in an M/G/m/m+r
model by assuming a limited maximum number (e.g. three) of departures between
two arrivals.
To better differentiate our proposed schemes from previous work, more detailed
discussions of related work are provided in the corresponding contribution chapters.
1.3 Overview
In Chapter 2, we discuss the general framework for optimizing a long-run average
reward per unit cost by means of SMDP. In Chapters 3 and 4, we analyze the job
assignment policies in server farms with and without jockeying actions, respectively.
1.4 Contributions
In this thesis, we study the energy-efficient job assignment policies in server farms,
aiming at maximization of the energy efficiency (the ratio of the long-run average
throughput to the long-run average power consumption).
In Chapter 2, we introduce the problem for heterogeneous server farms as an SMDP
and then introduce the e-revised criterion [21] in the vein of the conventional
g-revised criterion [49]. The g-revised criterion connects the optimization of the
expected cumulative reward to the optimization of the long-run average reward per
unit time. The e-revised criterion connects the optimization of the expected
cumulative reward to the optimization of the long-run average reward per unit cost,
where the cost can be specified as time, energy consumption, etc.
We then continue our analysis of energy-efficient job assignments in both jockey-
ing and non-jockeying cases.
For the jockeying policies, our main contributions in Chapter 3 are summarized
as follows:
• We prove that the computational complexity of the algorithm proposed in [21]
for an optimal solution is O(KB log K), where K is the number of servers and
B is the total number of positions available to allocate jobs in the entire system.
For a system in which every server has the same buffer size, the algorithm is of
linear complexity in the buffer size and of quadratic-logarithmic complexity in
the number of servers. For cubic power functions, the complexity of the
algorithm increases quadratically in the number of servers.
• We prove that the optimal insensitive policy in the jockeying case also optimizes
the energy efficiency over all jockeying policies when job sizes are exponentially
distributed.
• Despite the polynomial complexity mentioned above, the algorithm is not
sufficiently scalable for systems with tens of thousands of servers. According
to our numerical experiments, it takes around five days running time to obtain
an optimal solution for 1,000 servers with 100 buffers for each server. This
motivates our studies of nearly optimal and scalable methodologies.
• We propose a robust heuristic policy for stochastic job assignment in a finite-
buffer PS server farm. In this chapter we shall name this heuristic policy as E*
to reflect our goal of designing a “star” policy that can maximize the energy
efficiency of the system. We demonstrate the effectiveness of E* by comparing
it to the baseline MEESF policy. Unlike MEESF that greedily chooses the
most energy-efficient servers for job assignment, E* aggregates an optimal
number K̂ ≥ 2 of the most energy-efficient servers to form a virtual server. In
our design, E* always gives preference to this virtual server and utilizes its
service capacity in such a way as to guarantee a higher job throughput than
what is achievable with MEESF, while still improving the energy efficiency of
the system. The decision variable K̂ provides a degree of freedom for E* to
fine-tune performance.
• We discuss the insight gained from our design of E*. This will lead us to
proposing a simple rule of thumb for determining the optimal bK value that
maximizes the energy efficiency of the system under E*. The resulting policy,
referred to as rate matching (RM), chooses the value of bK so that the aggregate
service rate of the virtual server matches the job arrival rate. We provide
extensive numerical results to demonstrate the effectiveness of RM relative to
E*.
• We devise E* as an insensitive job assignment policy, in that the stationary
(steady-state) distribution of the number of jobs in the system depends on the
job size distribution only through its mean. This insensitivity property is useful
for ensuring the robustness and predictability of the performance of the server
farm under a wide range of job size distributions.
• We perform a rigorous analysis of E*. In particular, we prove that E* always
achieves a higher job throughput than MEESF. Under the realistic scenario
where at least two servers in a heterogeneous server farm are equally most
energy efficient, we also prove that E* is guaranteed to outperform MEESF
in terms of the energy efficiency of the system. Having at least two servers
that are equally energy efficient can be justified in practice by the fact that a
server farm is likely to comprise multiple servers of the same type purchased
at the same time.
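The rate-matching rule of thumb described above can be sketched as follows. This is one plausible reading of "the aggregate service rate of the virtual server matches the job arrival rate" — choosing the smallest K̂ whose aggregate service rate covers the arrival rate — offered only as an illustration; the exact optimization of K̂ is developed in Chapter 3.

```python
def rate_matching_K(mus_sorted, lam):
    """Rate-matching (RM) sketch: choose K̂ as the smallest number of
    most energy-efficient servers whose aggregate service rate covers
    the average arrival rate lam.

    `mus_sorted` lists server service rates already ordered from most
    to least energy-efficient (the ordering is assumed given).
    """
    total = 0.0
    for k, mu in enumerate(mus_sorted, start=1):
        total += mu
        if total >= lam:
            return max(k, 2)   # E* aggregates at least K_hat >= 2 servers
    return len(mus_sorted)     # arrival rate exceeds total service capacity
```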
For the non-jockeying case, the main contributions of Chapter 4 are listed as
follows.
• We propose a job assignment policy, referred to as Most energy-efficient
Available server first accounting for Idle Power (MAIP). MAIP takes the power
consumption of idle servers as an input parameter for decision making, and
thus models a realistic system with significant power consumption in
idle states. MAIP is scalable and requires only binary state information
of servers; this is appropriate for a real environment with frequently changing
server states.
• In our server farm, we apply the well-known Whittle’s index policy that
decomposes a complex RMABP problem into multiple sub-problems, each
assumed computationally feasible. In the general case, Whittle’s index does
not necessarily exist, and even if it does, a closed-form solution is usually
unavailable.
• Remarkably, we prove that when job sizes are exponentially distributed, the
Whittle index policy is equivalent to MAIP, and that it is asymptotically
optimal for our server farm composed of multiple groups of identical servers
as the numbers of servers in these groups tend to infinity.
• Extensive numerical results in Section 4.5 illustrate the performance of
these policies and their sensitivity to the shape of the job-size distribution.
We numerically demonstrate the effectiveness of MAIP by comparing it to a
baseline policy. The numerical results show that the performance of MAIP in terms
of energy efficiency is nearly insensitive to different job-size distributions
under the processor sharing discipline.
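As a rough illustration of how binary state information suffices for an MAIP-style decision, consider the sketch below. The ranking $\mu_j / (e_j - e_j^{\mathrm{idle}})$, i.e., service rate per unit of power paid on top of the idle power, is only an assumed stand-in for illustration; the actual index used by MAIP is derived in Chapter 4.

```python
def maip_choose(servers, busy):
    """Hypothetical sketch of a MAIP-style decision: among servers with a
    vacancy (binary busy/idle state information only), pick the one with
    the highest marginal energy efficiency, i.e. service rate per unit of
    *extra* power paid on top of the idle power. This ranking is an
    illustrative assumption, not the index derived in Chapter 4.

    servers: list of (mu, e_busy, e_idle) tuples
    busy:    list of booleans, busy[j] == True if server j has no vacancy
    returns: index of the chosen server, or None if all servers are busy
    """
    best_j, best_index = None, -1.0
    for j, (mu, e_busy, e_idle) in enumerate(servers):
        if busy[j]:
            continue
        index = mu / (e_busy - e_idle)  # marginal energy efficiency (assumed)
        if index > best_index:
            best_j, best_index = j, index
    return best_j
```

Note that the rule needs no queue lengths, only which servers currently have a vacancy, which matches the binary state information requirement stated above.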
1.5 Publications
[1] J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, "Energy-efficient heuristics
for insensitive job assignment in processor-sharing server farms," IEEE Journal on
Selected Areas in Communications, vol. 33, no. 12, pp. 2878–2891, Dec. 2015.
[Online]. Available: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7279057
[2] J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, "Energy-efficient heuristics for
job assignment in processor-sharing server farms," in Proc. IEEE INFOCOM, Hong
Kong SAR, China, Apr. 2015, pp. 882–890.
[Online]. Available: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=7218459
[3] Z. Rosberg, Y. Peng, J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, "Insensitive
job assignment with throughput and energy criteria for processor-sharing server
farms," IEEE/ACM Transactions on Networking, vol. 22, no. 4, pp. 1257–1270,
Aug. 2014.
[Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6583274
Chapter 2
Maximization of the Ratio of
Long-Run Expected Reward to
Long-Run Expected Cost
The main objective in this thesis is to maximize the energy efficiency of a server
farm, defined as the ratio of the long-run average throughput to the long-run average
power consumption. Regarding the throughput and the power consumption as the
reward and the cost of the system, respectively, this quantity becomes the ratio of
the long-run average reward to the long-run average cost. This objective is more
general than the traditional optimization problem that focuses on the expected
reward (or cost) per unit time.
In this chapter, we introduce and analyze the relationships between optimizing
the cumulative reward, the average reward, and the ratio of average reward to
average cost. Throughout, we assume a Poisson arrival process and exponentially
distributed job sizes.
2.1 Server Farm Model
We study a heterogeneous server farm modeled as a multi-queue system. There are
$K$ servers, and the set of these servers is denoted by $\mathcal{K} = \{1, 2, \dots, K\}$. Let $\mathcal{N}_j$ be
the set of all states of server $j \in \mathcal{K}$, where $|\mathcal{N}_j| < +\infty$. Decisions are made upon
arrival/departure events to allocate jobs to the servers (queues) in the server farm
(queueing system).
We also define $X(t) = (X_1(t), X_2(t), \dots, X_K(t))$ as the random variable representing
the state at time $t \ge 0$ of the stochastic process of the multi-queue system.
The probability distribution of $X(t)$ is denoted by $P(t)$. Define vectors
$\mathbf{n} = (n_1, n_2, \dots, n_K)$ representing possible values of the random variable $X(t)$, with
$n_j \in \mathcal{N}_j$, $j \in \mathcal{K}$. The set of all such $\mathbf{n}$ is denoted by $\mathcal{N}$.
We consider a set of stationary policies, denoted by $F$, which is a subset of the set
of all policies; a policy is said to be stationary if it is non-randomized and
the action it takes at time $t$ depends only on the state of the process at time $t$. The
decisions (actions) upon arrivals/departures rely on the values of the random variable
$X(t)$ at the moments right before these arrivals/departures occur. A policy $f$ gives
these decisions (actions) along the time line.
Let $\{X^f(t),\, t \ge 0\}$ denote the stochastic process under policy $f$ with initial state
$X^f(0) = x(0)$. For each $j \in \mathcal{K}$, there exists a set of mappings $\mathcal{R}_j = \{R_j : \mathcal{N}_j \to \mathbb{R}^+\}$,
with $R_j(n_j)$ the reward rate for server $j \in \mathcal{K}$ in state $n_j \in \mathcal{N}_j$. Letting $R = (R_j \in \mathcal{R}_j : j \in \mathcal{K})$
denote a vector of these mappings, we define a function
\[
g^f(R) = \lim_{t \to +\infty} \frac{1}{t} \int_0^t \mathbb{E}\left\{ \sum_{j \in \mathcal{K}} R_j\big(X_j^f(u)\big) \right\} du. \tag{2.1}
\]
If $\mu_j(n_j)$ and $e_j(n_j)$ are the service rate and the energy consumption rate of server $j$ in
state $n_j$, respectively, then $\mu_j, e_j \in \mathcal{R}_j$. Let $\mu = (\mu_j : j \in \mathcal{K})$ and $e = (e_j : j \in \mathcal{K})$.
In this thesis, our objective is to maximize the ratio of the long-run average
reward to the long-run average cost, where the reward is defined as the job throughput
and the cost is defined as the power consumption of the entire system; namely,
\[
\sup_{f \in F} \frac{g^f(\mu)}{g^f(e)}. \tag{2.2}
\]
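As a toy illustration of objective (2.2), consider a single M/M/1/B queue whose geometric stationary distribution makes the ratio $g^f(\mu)/g^f(e)$ directly computable. This is a minimal sketch for intuition only; the two-level (busy/idle) power model and all parameter names are assumptions, not part of the model above.

```python
def energy_efficiency(lam, mu, e_busy, e_idle, B):
    """Energy efficiency (throughput / power) of a single M/M/1/B queue.

    The stationary distribution is pi_n proportional to rho^n, n = 0..B,
    with rho = lam / mu. Throughput is mu * P(busy); power is a two-level
    function of the busy/idle state (an illustrative assumption).
    """
    rho = lam / mu
    weights = [rho ** n for n in range(B + 1)]
    Z = sum(weights)                       # normalization constant
    p_idle = weights[0] / Z                # pi_0
    throughput = mu * (1.0 - p_idle)       # long-run average reward g^f(mu)
    power = e_idle * p_idle + e_busy * (1.0 - p_idle)  # long-run cost g^f(e)
    return throughput / power
```

For example, with arrival rate 1, service rate 2, buffer size 2, and power levels 3 (busy) and 1 (idle), the ratio evaluates to 6/13.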
2.2 Markov/Semi-Markov Decision Process
To solve the optimization problem with the objective function (2.2), in this section,
we introduce the well-known Semi-Markov Decision Process and then apply it as a
key tool to model our problem.
In this section, we focus on the maximization of the long-run expected non-
discounted reward, and, in later sections of this chapter, we show the connections
between the long-run expected reward, the long-run average reward (Section 2.3), and
the long-run expected reward divided by the long-run expected cost (Section 2.4).
In our model defined in Section 2.1, there are $K+1$ key events, each of which
causes a change of state. One represents the arrival of jobs, labeled event 0, and the
other $K$ events are departures from servers $1, 2, \dots, K$, labeled events $i$, $i = 1, 2, \dots, K$,
respectively. Either an arrival or a departure triggers a policy decision (action)
to rearrange the jobs among the system positions, followed by an increment or a
decrement (by one) of the number of jobs in the system. In the definition of the SMDP,
the increment/decrement of the number of jobs is also defined as part of the decision
actions. The rearrangement will be discussed in the case where jockeying is allowed.
For the non-jockeying case, the rearrangement reduces to selecting a server to which
a newly arrived job is allocated, or decrementing by one the number of jobs of the
server associated with the departure that has occurred.
We define time points $t_0 \equiv 0, t_1, t_2, \dots$ as the occurrence times of events, where
$t_i$ is the time of the $i$th event, $i = 1, 2, \dots$. As defined above, any occurrence of an
event triggers a decision action. For $t > 0$, let $a^f(t)$ represent the action taken
at time $t$ immediately following the event at $t$ under policy $f$. The reward rate for
server $j \in \mathcal{K}$ during the period between two time points, say $t_{i-1}$ and $t_i$, depends on
the state $X^f_j(t_i)$ during this time period, and is represented by $R_j\big(X^f_j(t_i)\big)$, $R_j \in \mathcal{R}_j$.
For a stationary policy $f \in F$, let $V^f(\mathbf{n}, R)$ be the expected value of the cumulative
reward of the process of the system that starts from state $\mathbf{n} \in \mathcal{N}$ until
it first enters an absorbing state $\mathbf{n}^0 \in \mathcal{N}$, with a vector of reward functions
(mappings) $R = (R_j \in \mathcal{R}_j : j \in \mathcal{K})$ (i.e., $R_j(n_j)$ is the reward rate of server $j$ in
state $n_j$). In particular, $V^f(\mathbf{n}^0, R) \equiv 0$ for any $f \in F$ and $R_j \in \mathcal{R}_j$, $j \in \mathcal{K}$. Without
loss of generality, we can extend the process associated with $V^f(\mathbf{n}, R)$ to a long-run
process by setting both the transition probability from state $\mathbf{n}^0$ to any other state
$\mathbf{n} \in \mathcal{N}$ and the reward rate in state $\mathbf{n}^0$ to be zero. For clarity of presentation, we
still refer to this extended long-run process as the process associated with $V^f(\mathbf{n}, R)$.
If this extended process gives rise to an irreducible Markov chain with finite state
space, then $V^f(\mathbf{n}, R) < +\infty$ for $f \in F$ [49, Corollary 6.20].
For $\mathbf{n} \in \mathcal{N}$ and $R = (R_j \in \mathcal{R}_j : j \in \mathcal{K})$, let
\[
V(\mathbf{n}, R) = \sup_{f \in F} V^f(\mathbf{n}, R), \tag{2.3}
\]
and define the optimal policy $f^* \in F$ by
\[
V^{f^*}(\mathbf{n}, R) = V(\mathbf{n}, R), \tag{2.4}
\]
where $f^*$ exists since $|F| < +\infty$. Moreover, let $t^f(\mathbf{n})$ denote the expected length
f (n) denote the expected length
of a sojourn that the system stays in state n under policy f , and let Pf
n,n0 denote the
transition probability from state n to n0, n,n0 2N , under policy f . According to
the Hamilton-Jacobi-Bellman equation, we have
V (n,R) = supaf (n)
(
t
f (n) Âj2K
R j(n j)dt + Ân02N , n0 6=n
Pf
n,n0V (n0,R)
)
. (2.5)
2.3 Long-Run Average Reward
In this section, we show how, in our problem, the optimization of the expected
cumulative reward $V^f(\mathbf{n}, R)$ can be transformed into the optimization of the long-run
average reward by means of the g-revised reward.
Here, we consider a process that starts in state $\mathbf{n}$ and runs until its first return to the
same state, called process $P^f_1(\mathbf{n})$. The expected cumulative reward of this
process is denoted by $U^f(\mathbf{n}, R)$. Define $T^f(\mathbf{n})$ as the expected time of the first return
to state $\mathbf{n}$ with initial state $\mathbf{n}$. According to [49, Corollary 6.20], if process $P^f_1(\mathbf{n})$
gives rise to an irreducible Markov chain with finite state space, then $T^f(\mathbf{n})$ is finite.
According to [49, Theorem 7.5], if $T^f(\mathbf{n})$ is finite, then the long-run average reward
under stationary policy $f$ is equal to $U^f(\mathbf{n}, R)/T^f(\mathbf{n})$.
We extend process $P^f_1(\mathbf{n})$ to a long-run version by repeating this process such
that the replicas are probabilistic replicas of $P^f_1(\mathbf{n})$. Then, the long-run average
reward of the extended $P^f_1(\mathbf{n})$ process is the same as that of $P^f_1(\mathbf{n})$, which, under
certain conditions, is equal to the ratio $U^f(\mathbf{n}, R)/T^f(\mathbf{n})$. Namely, if $P^f_1(\mathbf{n})$ gives rise to an
irreducible Markov chain for any $f \in F$, then optimizing the long-run average
reward is the same as optimizing the average reward of process $P^f_1(\mathbf{n})$.
We use the g-revised reward to maximize the expected average reward. The
g-revised reward (cost) is the reward obtained by subtracting a constant $g$ from the
reward rate. There exists a value of $g$ such that a policy optimizing the expected
cumulative g-revised reward also optimizes the expected average reward with
the original reward function.
Let $T^f_v(\mathbf{n})$ be the expected time of the process associated with $V^f(\mathbf{n}, R)$. We
define the g-revised version of $V^f(\mathbf{n}, R)$ as
\[
V^{f,g}(\mathbf{n}, R) = V^f(\mathbf{n}, R) - g\, T^f_v(\mathbf{n}).
\]
Also, we set $V^g(\mathbf{n}, R) = \sup_f V^{f,g}(\mathbf{n}, R)$. According to [49, Theorem 7.6, Theorem
7.7], we obtain the following corollary.

Corollary 2.1. If $U^f(\mathbf{n}, R)$ and $\sum_{j \in \mathcal{K}} R_j(n_j)\, t^f(\mathbf{n})$ are finite, $R = (R_j \in \mathcal{R}_j : j \in \mathcal{K})$,
for all $f \in F$ and $\mathbf{n} \in \mathcal{N}$, then there exists a scalar $g$ satisfying
\[
V^g(\mathbf{n}, R) = \sup_{a^f(\mathbf{n})} \left\{ \sum_{j \in \mathcal{K}} R_j(n_j)\, t^f(\mathbf{n}) + \sum_{\mathbf{n}' \in \mathcal{N}} P^f_{\mathbf{n},\mathbf{n}'} V^g(\mathbf{n}', R) - g\, t^f(\mathbf{n}) \right\}, \tag{2.6}
\]
and
\[
g = \frac{U^{f^*}(\mathbf{n}^0, R)}{T^{f^*}(\mathbf{n}^0)} = \sup_{f \in F} \frac{U^f(\mathbf{n}^0, R)}{T^f(\mathbf{n}^0)}, \tag{2.7}
\]
where $\mathbf{n}^0$ is the absorbing state of $V^f(\mathbf{n}, R)$ and $V^{f,g}(\mathbf{n}, R)$ as defined before.

The optimal policy $f^* \in F$ prescribes an action that maximizes the right-hand
side of (2.6) for any $\mathbf{n} \in \mathcal{N}$.
In particular, for our server farm problem, we set the absorbing state $\mathbf{n}^0$ to be the
empty state $\mathbf{0}$, in which only one action is available: all servers are idle. Then, the
stationary policy $f^*$ for $U(\mathbf{0}, R)$ can be identified by solving (2.6). We will describe
the details of the algorithms for an optimal policy in later chapters, where we focus
on more detailed situations.
2.4 Maximization of Average Reward Divided by Average Cost
In the previous section, we explained the connection between the maximization of
the long-run average reward and the maximization of $V^f(\mathbf{n}, R)$ by means of the
g-revised reward introduced in Section 2.3.
We now extend this g-revised reward to a more general situation
based on the result claimed in [21]; we refer to this result as the e-revised reward. The
e-revised reward generalizes the g-revised reward in a way that the long-run expected
reward per unit time can be extended to the ratio of the long-run average reward to
the long-run average cost.
Define a vector $C = (C_j \in \mathcal{R}_j : j \in \mathcal{K})$ as the vector of cost functions. In the
vein of the definition of the g-revised reward function in Section 2.3, for a vector of
cost functions $C$, we define the e-revised version of $V^f(\mathbf{n}, R)$ as
\[
V^{f,e}(\mathbf{n}, R, C) = V^f(\mathbf{n}, R) - e\, V^f(\mathbf{n}, C),
\]
and
\[
V^e(\mathbf{n}, R, C) = \sup_{f \in F} V^{f,e}(\mathbf{n}, R, C).
\]
Similarly to Corollary 2.1, we obtain the following proposition for the e-revised
reward function.

Proposition 2.1. If $U^f(\mathbf{n}, R)$, $U^f(\mathbf{n}, C)$ and $\sum_{j \in \mathcal{K}} \big( R_j(n_j) - e\, C_j(n_j) \big)\, t^f(\mathbf{n})$ are finite,
$R = (R_j \in \mathcal{R}_j : j \in \mathcal{K})$, $C = (C_j \in \mathcal{R}_j : j \in \mathcal{K})$, for all $f \in F$ and $\mathbf{n} \in \mathcal{N}$,
then there exists a scalar $e$ satisfying
\[
V^e(\mathbf{n}, R, C) = \sup_{a^f(\mathbf{n})} \left\{ \sum_{j \in \mathcal{K}} \big( R_j(n_j) - e\, C_j(n_j) \big)\, t^f(\mathbf{n}) + \sum_{\mathbf{n}' \in \mathcal{N}} P^f_{\mathbf{n},\mathbf{n}'} V^e(\mathbf{n}', R, C) \right\}, \tag{2.8}
\]
and
\[
e = \frac{U^{f^*}(\mathbf{n}^0, R)}{U^{f^*}(\mathbf{n}^0, C)} = \sup_{f \in F} \frac{U^f(\mathbf{n}^0, R)}{U^f(\mathbf{n}^0, C)}, \tag{2.9}
\]
where $\mathbf{n}^0$ is the absorbing state of $V^f(\mathbf{n}, R)$ and $V^{f,e}(\mathbf{n}, R, C)$ as defined before.

The optimal policy $f^* \in F$ is a policy that prescribes an action maximizing the
right-hand side of (2.8) for any $\mathbf{n} \in \mathcal{N}$.
Proof. This proposition is derived from Corollary 2.1 and [21, Theorem 1].
Define
\[
U^{f,e,g}(\mathbf{n}, R, C) = U^f(\mathbf{n}, R) - e\, U^f(\mathbf{n}, C) - g\, T^f(\mathbf{n}),
\]
and
\[
U^{e,g}(\mathbf{n}, R, C) = \sup_{f \in F} U^{f,e,g}(\mathbf{n}, R, C).
\]
Clearly, (2.9) is equivalent to $U^{e,0}(\mathbf{n}, R, C) = 0$. Observe that $U^{e,0}(\mathbf{n}, R, C)$ is
continuous in $e \in \mathbb{R}$. Since $U^f(\mathbf{n}, R)$ and $U^f(\mathbf{n}, C)$ are finite, $U^{e,0}(\mathbf{n}, R, C) \to +\infty$
as $e \to -\infty$ and $U^{e,0}(\mathbf{n}, R, C) \to -\infty$ as $e \to +\infty$.
We then discuss the monotonicity of $U^{e,0}(\mathbf{n}, R, C)$ in $e \in \mathbb{R}$. For any $e \in \mathbb{R}$, we
have
\[
U^{e,0}(\mathbf{n}, R, C) = U^{f^*_e, e, 0}(\mathbf{n}, R, C) \ge U^{f, e, 0}(\mathbf{n}, R, C),
\]
where $f^*_e$ denotes an optimal policy for the given $e$. Then, for any $e_1 < e$, we obtain
\[
U^{e_1,0}(\mathbf{n}, R, C) = U^{f^*_{e_1}, e_1, 0}(\mathbf{n}, R, C) \ge U^{f^*_e, e_1, 0}(\mathbf{n}, R, C) > U^{f^*_e, e, 0}(\mathbf{n}, R, C) = U^{e,0}(\mathbf{n}, R, C). \tag{2.10}
\]
Namely, $U^{e,0}(\mathbf{n}, R, C)$ is monotonically decreasing in $e \in \mathbb{R}$. In other words, there
exists a unique value of $e$ satisfying (2.9).
For clarity of presentation, let $e^*$ denote the value of $e$ satisfying (2.9). Without
loss of generality, we set $g = 0$ and then adjust the variable $e \in \mathbb{R}$ to be $e^*$, i.e.,
$U^{e^*,0}(\mathbf{n}, R, C) = 0$. We refer to the corresponding optimal policy as $f^*$, i.e.,
$U^{f^*, e^*, 0}(\mathbf{n}, R, C) = U^{e^*,0}(\mathbf{n}, R, C) = 0$. Then, (2.7) also holds true when the
reward rates for states $\mathbf{n} \in \mathcal{N}$ are set to be
\[
\sum_{j \in \mathcal{K}} \big( R_j(n_j) - e^* C_j(n_j) \big). \tag{2.11}
\]
That is, the optimal policy $f^*$ that prescribes the action maximizing the right-hand
side of (2.6) for this revised reward rate also maximizes the long-run average reward,
which is zero, for the same reward rate.
According to [21, Theorem 1], such an $f^*$ associated with $e^*$, achieving zero long-run
average reward when the reward rate in state $\mathbf{n}$ is given by (2.11), also maximizes
the ratio $U^f(\mathbf{n}, R)/U^f(\mathbf{n}, C)$.
As a consequence, the optimal solution that maximizes the ratio of the long-run
average reward to the long-run average cost can be obtained by conventional policy
iteration (or value iteration) method.
Although the computational complexity of a general algorithm based on policy
iteration increases exponentially with the scale of the server farm (the number
of servers), we will prove in Chapter 3 that the algorithm for an optimal solution
proposed in [21] is of polynomial computational complexity.
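The monotonic-root argument above ($U^{e,0}$ continuous and strictly decreasing in $e$, with its unique zero at the optimal ratio) immediately suggests a bisection scheme for $e^*$. A minimal sketch, under the simplifying assumption that each stationary policy's cycle reward and cycle cost pair is already known:

```python
def max_ratio(policies, tol=1e-9):
    """Find e* = max_f reward_f / cost_f by bisection on e, using the fact
    that max_f (reward_f - e * cost_f) is continuous and strictly
    decreasing in e, with its unique root at the optimal ratio.

    policies: list of (reward, cost) pairs with reward >= 0 and cost > 0
              (an illustrative stand-in for the pairs U^f(n0, R), U^f(n0, C))
    """
    lo = 0.0
    hi = max(r / c for r, c in policies) + 1.0  # the root lies below hi
    while hi - lo > tol:
        e = (lo + hi) / 2.0
        if max(r - e * c for r, c in policies) > 0.0:
            lo = e  # revised value still positive: e is below the root
        else:
            hi = e
    return 0.5 * (lo + hi)
```

In the actual problem the inner maximization is itself a dynamic program over states, not a scan over an explicit policy list; the sketch only illustrates the root-finding structure.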
2.5 Conclusions
We have established a connection between the expected e-revised cumulative reward
and the long-run average reward per unit cost. Then, dynamic programming based
on the Hamilton-Jacobi-Bellman equation can be used to maximize the ratio of the
long-run expected reward to the long-run expected cost.
Chapter 3
Job-Assignment Heuristics with
Jockeying
In this chapter, we focus on job assignment where jockeying is allowed. If jockeying
is permitted, incomplete jobs can be reassigned to other vacant positions in the
system. The reassignment of jobs provides more degrees of freedom and matches a
real server farm where servers are located in a single physical data center and a
shared DRAM storage [30] or flash storage [31] is employed. In general, stationary
policies with jockeying can significantly improve the performance of the system. In
addition, they are computationally scalable so that the resource optimization under
jockeying is more tractable.
For such a resource optimization problem, a straightforward heuristic is to
greedily select the most energy-efficient server at each decision
epoch. Under certain conditions, we will prove that this straightforward policy, which
we refer to as Most Energy-Efficient Server First (MEESF) in this chapter, maximizes
the ratio of the instantaneous job departure rate to the instantaneous energy consumption
rate. However, we shall see that, in general, MEESF does not necessarily optimize the
ratio of the long-run average job departure rate (i.e., job throughput) to the long-run average
energy consumption rate (i.e., power consumption). This observation motivates us
to seek a more robust heuristic job-assignment policy that improves the energy
efficiency and yet yields a higher job throughput than that achievable under MEESF.
As mentioned in Section 1.4, the main contributions of this chapter include:
• We prove that the computational complexity of the algorithm proposed in [21]
for an optimal solution is $O(BK \log K)$, where $K$ is the number of servers and
$B$ is the total number of positions to allocate jobs in the entire system. For
a system with a homogeneous buffer size at each server, the algorithm is of
linear complexity in the buffer size and of quadratic-logarithmic complexity in
the number of servers. For cubic power functions, the complexity of the
algorithm increases quadratically with the number of servers.
• We prove that the optimal insensitive policy in the jockeying case also optimizes
the energy efficiency over all jockeying policies when job sizes are exponentially
distributed.
• Despite the polynomial complexity mentioned above, the algorithm is not
sufficiently scalable for systems with tens of thousands of servers. According
to our numerical experiments, it takes around five days of running time to obtain
an optimal solution for 1,000 servers with 100 buffer positions per server. This
motivates our studies on nearly optimal and scalable methodologies.
• We propose a robust job-assignment policy for a finite-buffer PS server farm.
We demonstrate the effectiveness of E* by comparing it to the baseline MEESF
policy. Unlike MEESF, which greedily chooses the most energy-efficient servers
for job assignment, E* aggregates an optimal number $\hat{K} \ge 2$ of the most energy-
efficient servers to form a virtual server. In our design, E* always gives
preference to this virtual server and utilizes its service capacity in such a way
as to guarantee a higher job throughput than what is achievable with MEESF,
and yet improve the energy efficiency of the system. The decision variable $\hat{K}$
provides a degree of freedom for E* to fine-tune performance.
• We discuss the insight gained from our design of E*. This leads us to
propose a simple rule of thumb for determining the optimal $\hat{K}$ value that
maximizes the energy efficiency of the system under E*. The resulting policy,
referred to as rate matching (RM), chooses the value of $\hat{K}$ so that the aggregate
service rate of the virtual server matches the job arrival rate. We provide
extensive numerical results to demonstrate the effectiveness of RM relative to
E*.
• We devise E* as an insensitive job assignment policy, in that the stationary
(steady-state) distribution of the number of jobs in the system depends on the
job size distribution only through its mean. This insensitivity property is useful
for ensuring the robustness and predictability of the performance of the server
farm under a wide range of job size distributions.
• We perform a rigorous analysis of E*. In particular, we prove that E* always
has a higher job throughput than MEESF. Under the realistic scenario
where at least two servers in a heterogeneous server farm are equally most
energy efficient, we also prove that E* is guaranteed to outperform MEESF
in terms of the energy efficiency of the system. Having at least two servers
that are equally energy efficient can be justified in practice by the fact that a
server farm is likely to comprise multiple servers of the same type purchased
at a time.
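The rate-matching rule described above can be sketched as follows, reading "matches the job arrival rate" as choosing the smallest $\hat{K}$ whose aggregate service rate first covers $\lambda$; this reading is an assumption for illustration only.

```python
def rm_virtual_server_size(mus, es, lam):
    """Sketch of the RM rule of thumb. Servers are (re)labeled in
    non-increasing order of energy efficiency mu/e, and the function
    returns the number K_hat of most energy-efficient servers whose
    aggregate service rate first reaches the arrival rate lam.
    Reading 'matches' as 'first covers' is an illustrative assumption.

    mus, es: service rates and energy consumption rates of the servers
    """
    order = sorted(range(len(mus)), key=lambda j: mus[j] / es[j], reverse=True)
    aggregate = 0.0
    for k, j in enumerate(order, start=1):
        aggregate += mus[j]
        if aggregate >= lam:
            return k
    return len(mus)  # all servers needed: the system is overloaded
```

Note that the actual E* policy further requires $\hat{K} \ge 2$; the sketch omits that constraint for brevity.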
The remainder of the chapter is organized as follows. In Section 3.3, we introduce
the concept and construction of the symmetric queue, which underpins the insensitivity
of the queueing system when jockeying is allowed. In Section 3.4, an algorithm
for the optimal insensitive jockeying policy is introduced; we complement the
details of implementing the algorithm, and a theoretical proof is provided for the
polynomial computational complexity of the algorithm in the scale of the server farm.
In Section 3.5, we give a thorough analysis, including theoretical and numerical
results, of our proposed job-assignment heuristics, which aim to maximize the energy
efficiency under the insensitive structure.
3.1 Related Work
Queueing models associated with job assignment among multiple servers with and
without jockeying have been studied since the work of Haight [50] in 1958. Most
of the existing work has focused on policies that aim to improve the system’s
performance under the FCFS discipline such as Join-the-Shortest-Queue (JSQ). The
key aim of such policies is to balance the system's workload among all servers (queues)
and to optimize the throughput and the response time. The JSQ policy in the non-
jockeying case has been studied in [51–56] for FCFS and in [57–59] for processor
sharing.
For the non-jockeying case, Kingman [51] proved that assigning a new arrival
to the shortest queue maximizes the discounted number of customers processed
at any time $t < +\infty$. Then, Winston [52] asserted that Join-the-Shortest-Queue
with a finite number of servers maximizes throughput during a fixed time $t < +\infty$
in an FCFS homogeneous multi-queue system.
Later in 1987, Knessl et al. [55] analyzed a two-queue system with general service
times and infinite capacity under the Join-the-Shortest-Queue policy and provided
approximations for this system's statistical characteristics. Later, JSQ became a
basic mechanism for computer networks and a popular model of multi-processor
architecture systems. Weber [53] considered multiple identical servers and showed
that, if customers arrive according to an arbitrary arrival process and their job-size
distribution has a non-decreasing hazard rate, JSQ maximizes the number
of jobs served in a given time interval. For heterogeneous servers, Hordijk and Koole
[56] characterized, under certain conditions, the optimal solution in the form that
“an arriving customer should be assigned to the queue with a faster server when
that server has a shorter queue”; such policies are referred to as the Shortest
Queue Faster Server Policy (SQFSP). However, fully characterizing the optimal
solution for a general heterogeneous problem is commonly considered difficult, so
approximation methodologies have been studied.
Bonomi [57] proved the optimality of JSQ for the processor sharing case under
a general arrival process, Markov departure processes, and homogeneous servers.
A counter-example showing the non-optimality of JSQ under non-exponential job-size
distributions was given by Whitt [58]. Gupta [59] provided an approximate analysis
of the performance of JSQ in a processor sharing model with a general job-size
distribution; he also intuitively explained and proved the optimality of JSQ in terms
of average delay for a system comprising servers with two different service rates.
Moreover, Altman et al. [25] investigated strategies for optimal centralized
load balancing in a PS server farm and indicated that the optimal policy for job
assignment under the PS discipline is independent of the service requirement of
each request, whereas it is dependent on it under FCFS.
Server farm applications of JSQ with jockeying policies under the FCFS
discipline have been studied in [60–62]. Here, the jockeying action is triggered
when the difference between the queue sizes of the shortest queue and the longest
queue reaches a threshold value. Different threshold values clearly lead to different
policies. These publications focus on the expression of the equilibrium distribution
of the queue lengths; it has been shown that the service quality can be significantly
improved by jockeying.
For other job-assignments in processor sharing server farms without jockeying,
the energy efficiency of a multi-queue heterogeneous system with infinite buffer and
setup delay has been studied in [19, 63], where exact expressions of the value function
for the Markov Decision Process (MDP) are given. Hyytiä et al. [63] showed that
the M/G/1-LCFS model is sensitive only to the mean of the set-up delay, whereas the
M/G/1-PS model is not insensitive with respect to the set-up delay. Li et al. [64]
proposed a heuristic algorithm that aims to maximize the weighted probability
of the combined execution time (delay) and energy consumption metric, with
constraints on both the delay deadline and the energy budget, in a heterogeneous
computing system.
To the best of our knowledge, processor sharing multi-queue systems with jockeying
have only been studied in [21, 22], where the optimization problem of maximizing
the energy efficiency of such systems, as mentioned in Section 1.1, is modeled as a
Semi-Markov Decision Process (SMDP) [65].
In the general context of the Markov Decision Process (MDP) or SMDP, significant
work has been done to optimize the expected average reward (cost). In [66], Lippman
optimized the finite horizon discounted reward in three models, namely, the M/M/c
finite capacity queue, the M/M/1 service rate control problem, and the M/M/c finite
capacity queue with policy-dependent arrival rates, by using the concept of
“uniformization”, a device by which a virtual exponential clock runs independently
of both the policy and the state of the stochastic process. The concept of g-revised reward [67] provides
a bridge between the optimization problems of the expected total reward and the
expected average reward. In addition, an optimal solution that maximizes the
expected average reward can be achieved by using a procedure proposed in [67]. In
particular, Stidham and Weber [68] considered a service rate control problem of a
single queue with left-skip-free transition structure for both exponential and non-
exponential service times, where the decisions depend on the number of customers
(jobs) in the queue and no discounting is considered. They provided a method to
prove the monotonicity of the optimal service rates, i.e., service rates are increasing
in queue length; some of their optimal policies are shown to be insensitive to the
shape of service time distribution. George and Harrison [69] studied the service rate
control problem of a single queue evolving as a birth-and-death Markov process, and
proved the existence of monotonic optimal service rates under weaker assumptions
than those of Stidham and Weber [68].
The long-run average reward per unit cost (e.g., time consumption or energy
consumption), studied in [21], generalizes the long-run average service quality per
unit time that had been studied previously. With techniques similar to those in [66–69],
Rosberg et al. [21] proposed an algorithm to find the stationary job assignment that
maximizes the energy efficiency by exploiting the servers' heterogeneity. As this
algorithm is not sufficiently scalable for a real local server farm, the authors of [21]
proposed the scalable Slowest Server First (SSF) job-assignment policy, which aims
to approximate this optimality. SSF was demonstrated numerically to be near-optimal
under a cubic relationship between service rate and power consumption.
3.2 Model
For the reader’s convenience, Table 3.1 provides a list of major symbols that we
use in this chapter. Our model of a server farm consists of $K$ independent servers,
each having a finite buffer for queueing jobs. For $j = 1, 2, \dots, K$, we denote by $B_j$ the
Table 3.1 Summary of major symbols

Symbol | Definition
$K$ | Number of servers in the system
$B_j$ | Buffer size of server $j$
$\tilde{B}_i$ | Aggregate buffer size of the first $i$ servers in the system
$\mu_j$ | Service rate of server $j$
$e_j$ | Energy consumption rate of server $j$
$\mu_j / e_j$ | Energy efficiency of server $j$
$\lambda$ | Job arrival rate
$\hat{K}$ | Number of energy-efficient servers forming a virtual server
$\tilde{\mu}_{\hat{K}}$ | Aggregate service rate of the virtual server
$\tilde{e}_{\hat{K}}$ | Aggregate energy consumption rate of the virtual server
$\tilde{\mu}_{\hat{K}} / \tilde{e}_{\hat{K}}$ | Energy efficiency of the virtual server
$\hat{K}^*$ | Optimal value of $\hat{K}$ chosen by the E* policy
$\hat{K}_{\mathrm{RM}}$ | Empirical value of $\hat{K}$ chosen by the RM policy
$L^f$ | Job throughput of the system under policy $f$
$E^f$ | Power consumption of the system under policy $f$
$L^f / E^f$ | Energy efficiency of the system under policy $f$
buffer size of server $j$, where $B_j \ge 1$. For notational convenience, we denote by $\tilde{B}_i$
the aggregate buffer size of the first $i$ servers in the system, given by
\[
\tilde{B}_k = \sum_{l=1}^{k} B_l, \qquad k = 0, 1, \dots, K, \tag{3.1}
\]
where $\tilde{B}_0 = 0$ by definition.
We denote by $\mu_j$ the service rate of server $j$, defined as the units of jobs that can
be processed per time unit, and by $e_j$ the energy consumption rate of server $j$. Note
that, in the literature, the energy consumption rate of a server is usually related to the
server speed by a convex function of the form
\[
e(\mu) \propto \mu^{\beta}, \qquad \beta > 0, \tag{3.2}
\]
with $\beta = 3$ being the most commonly used value [70–72]. However, some researchers
have suggested that $e(\mu)$ is not necessarily convex [73, 74]. The job assignment
policies that we propose in this chapter do not require such an assumption.
We refer to the ratio $\mu_j / e_j$ as the energy efficiency of server $j$. Accordingly,
server $i$ is defined to be more energy-efficient than server $j$ if and only if $\mu_i / e_i > \mu_j / e_j$.
Since our focus in this chapter is on developing energy-efficient job assignment
policies, for convenience of description, we label the servers according to their
energy efficiency; i.e., if $i < j$, then $\mu_i / e_i \ge \mu_j / e_j$.

Remark 3.1. With this convention, for $i < j$, we have $\mu_i e_j - e_i \mu_j \ge 0$ since $\mu_i / e_i \ge \mu_j / e_j$.
We consider that jobs arrive at the system according to a Poisson process with
mean rate $\lambda$. An arriving job is assigned to one of the servers with at least one vacant
slot in its buffer, subject to the control of an assignment policy. If all buffers are full,
the arriving job is lost.
We assume that job sizes (in units) are independent and identically distributed
random variables with a cumulative distribution function (CDF) $F(x)$, $x \ge 0$.
Without loss of generality, we normalize the average job size to one. Each server
$j$ serves its jobs at a total rate of $\mu_j$ using the PS service discipline.
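The labeling convention of this section (servers indexed in non-increasing order of $\mu_j / e_j$) and the resulting property of Remark 3.1 can be sketched as:

```python
def label_by_efficiency(servers):
    """Relabel servers 1..K in non-increasing order of energy efficiency
    mu/e, the convention used throughout this chapter.

    servers: list of (mu, e) pairs; returns the reordered list.
    """
    return sorted(servers, key=lambda s: s[0] / s[1], reverse=True)

def check_remark_3_1(ordered):
    """Remark 3.1: for i < j (after relabeling), mu_i*e_j - e_i*mu_j >= 0."""
    return all(
        ordered[i][0] * ordered[j][1] - ordered[i][1] * ordered[j][0] >= 0
        for i in range(len(ordered))
        for j in range(i + 1, len(ordered))
    )
```

The check simply restates $\mu_i / e_i \ge \mu_j / e_j$ after clearing denominators, which avoids any division.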
3.3 Insensitive Conditions with Jockeying
Rosberg et al. [21] proposed a class of insensitive stationary policies with jockeying
based on the concept of the symmetric queue introduced in [75]. As in [21, 22], we
apply the symmetric queue model with state-dependent service rates defined in [75]
to our multi-queue system where jockeying is allowed. In Section 3.3.1, we
provide an explanation, based on [21, 22], of this adaptation to make this thesis
self-contained.
3.3.1 Adaptation of symmetric queue
A symmetric queue is a logical queue defined in [75] as a queue that has a symmetry
between the service rate allocated to each position in the queue and the probability
that a newly arrived job will join the queue in the corresponding position. In
particular, if there are $n = |\mathbf{n}|$ ($\mathbf{n} \in \mathcal{N}$) jobs in the queue, the jobs are labeled by their
positions, say $1, 2, \dots, n$, and the total service rate is $\mu(n)$, then a proportion $g(l, n)$
of the total service effort $\mu(n)$ is directed to the job at position $l$, $l = 1, 2, \dots, n$.
• When the job at position l completes its service and leaves the system, the jobs
at positions l +1, l +2, . . . ,n move to positions l, l +1, . . . ,n�1, respectively.
• When a job arrives, it moves into position l with probability g(l,n+1). The
jobs originally located at positions l, l + 1, . . . ,n move to positions l + 1, l +
2, . . . ,n+1, respectively.
This logical queue, referred to as a symmetric queue, is general enough to govern a group of job-assignment policies by adapting the values of the service efforts assigned to its positions. The equilibrium distribution of the state of such a symmetric queue (the number of jobs in the queue, $n = 0, 1, \ldots, \tilde{B}_K$) was shown to be of product form, hence insensitive to general service requirement distributions [75, 76]. Kelly [75] showed that a stationary symmetric queue is insensitive to the shape of the service time distribution, given that it can be represented as a mixture of Erlang distributions. This result was extended by Barbour [77] to arbitrarily distributed service times. Taylor [76] showed that the canonical insensitive queueing model, the Erlang loss system, can be described as a symmetric queue.
Let the buffer positions in the multi-queue system be defined using a 2-tuple notation $(j, m)$ for position $m$ at server $j$, where $j \in \mathcal{K}$ and $m = 1, 2, \ldots, B_j$. For a policy $\phi \in \Phi$, we define a one-to-one mapping $Q^\phi$ from the buffer positions in the multi-queue system (with domain set $\mathcal{Q}_m = \{(j, m) : j \in \mathcal{K},\ m = 1, 2, \ldots, B_j\}$) to the buffer positions in a logically-combined queue (whose range set is $\mathcal{Q}_s = \{l : l = 1, 2, \ldots, \tilde{B}_K\}$). Note that there could be more than one such mapping $Q^\phi$ that matches the underlying policy, since the service discipline at each server is processor sharing (PS), so that all the relevant mappings associated with the positions of a given server are equivalent. For clarity of presentation, an example of the mapping $Q^\phi$, including its relevance to jockeying actions, is provided in Appendix A.1.
Because of the insensitivity property of the logically-combined queue, by the symmetric queue construction of [75] shown below, all these mappings give the same stationary distribution of the underlying stochastic process $\{X^\phi(t) = |\mathbf{X}^\phi(t)|,\ t > 0\}$, where $|\mathbf{X}^\phi(t)| = \sum_{j \in \mathcal{K}} X^\phi_j(t)$. We define $\mathcal{N} = \{0, 1, \ldots, \tilde{B}_K\}$ for clarity of presentation.
In a logically-combined queue, a stationary policy $\phi \in \Phi$ affects the equilibrium state distribution of the underlying process $\{X^\phi(t),\ t > 0\}$ by controlling the total service rate $\mu(n)$, rewritten $\mu^\phi(n)$, for $n \in \mathcal{N}$. We define a subset of $\Phi$, referred to as $\Phi_s$, such that, for any policy in $\Phi_s$, there exists a logically-combined queue as defined above.
For any mapping $Q^\phi$ for a policy $\phi \in \Phi_s$, the multi-queue system can be implemented as a symmetric queue on the $\mathcal{Q}_s$ domain. To show this, we begin by deriving the state-dependent service rates of $\mathcal{Q}_s$. With the logically-combined queue, each server $j$ serves only the jobs located at its associated positions using the PS discipline. For policy $\phi$ at state $n$, the total service rate $\mu^\phi(n)$ is given by
\[ \mu^\phi(n) \stackrel{\text{def}}{=} \sum_{j \in \mathcal{T}^\phi(n)} \mu_j, \]
where $\mathcal{T}^\phi(n)$ is the set of busy servers.
Under the PS discipline, the proportion of $\mu^\phi(n)$ allocated to the job at position $l \stackrel{\text{def}}{=} Q^\phi(j, m)$ on the $\mathcal{Q}_s$ domain is equivalent to that allocated to the job at its corresponding position $m$ of server $j$ on the $\mathcal{Q}_m$ domain, and is given by
\[ \gamma^\phi(l, n) = \frac{\mu_j}{n_j\, \mu^\phi(n)}. \quad (3.3) \]
Fig. 3.1 State transition diagram of the logically-combined queue under any insensitive jockeying policy $\phi \in \Phi_s$.
To complete the construction of the logically-combined queue as a symmetric queue, it remains to enforce a symmetry, in the same manner as [75], between the service rate allocated to each position in the logically-combined queue and the probability that a newly arrived job will join the queue in the corresponding position. That is, as in [21, 22], based on the one-to-one position mapping $Q^\phi$ from $\mathcal{Q}_m$ to $\mathcal{Q}_s$, the movements of jobs in the multi-queue system correspond to the movements of jobs in the logically-combined queue.
Due to the insensitivity property of the symmetric queue, the state transition process of the logically-combined queue in this context can be modeled as a birth-death process (shown in Fig. 3.1) with birth rate $\lambda$ and death rates $\mu^\phi(n)$, $n \in \mathcal{N}$. Accordingly, the stationary distribution $\pi^\phi = (\pi^\phi_n,\ n \in \mathcal{N})$ of the process $\{X^\phi(t),\ t > 0\}$ under any stationary policy $\phi$ with jockeying can be obtained by solving the balance equations
\[ \lambda \pi^\phi_n = \mu^\phi(n+1)\, \pi^\phi_{n+1}, \quad n = 0, 1, \ldots, \tilde{B}_K - 1. \quad (3.4) \]
Then, the long-run job throughput of the system under policy $\phi$ is equivalent to the long-run average job departure rate, and can be obtained as
\[ \gamma^\phi(\mu) = \sum_{n=0}^{\tilde{B}_K} \mu^\phi(n)\, \pi^\phi(n), \quad (3.5) \]
or
\[ \gamma^\phi(\mu) = \lambda \left[ 1 - \pi^\phi(\tilde{B}_K) \right], \quad (3.6) \]
for the stable system.
The power consumption of the system under policy $\phi$ is equivalent to the long-run average energy consumption rate, and can be obtained as
\[ \gamma^\phi(e) = \sum_{n=0}^{\tilde{B}_K} e^\phi(n)\, \pi^\phi(n), \quad (3.7) \]
where $e^\phi(n)$ is the total energy consumption rate in state $n$ and is given by
\[ e^\phi(n) \stackrel{\text{def}}{=} \sum_{j \in \mathcal{T}^\phi(n)} e_j. \quad (3.8) \]
By definition, $\gamma^\phi(\mu)/\gamma^\phi(e)$ is the energy efficiency of the system under policy $\phi \in \Phi_s$.
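Equations (3.4)–(3.8) can be evaluated numerically once the death rates $\mu^\phi(n)$ and energy rates $e^\phi(n)$ are known. The sketch below (helper names and rates are ours, for illustration only) solves the balance equations and forms the energy-efficiency ratio:

```python
def stationary_dist(lam, mu):
    """pi_n of the birth-death chain of Fig. 3.1, from the balance
    equations (3.4): lam * pi_n = mu(n+1) * pi_{n+1}.
    Here mu[n-1] plays the role of mu^phi(n), n = 1..B."""
    w = [1.0]
    for rate in mu:
        w.append(w[-1] * lam / rate)
    z = sum(w)
    return [x / z for x in w]

def energy_efficiency(lam, mu, e):
    """Throughput (3.5) divided by power consumption (3.7);
    e[n-1] plays the role of e^phi(n) (zero in the empty state)."""
    pi = stationary_dist(lam, mu)
    throughput = sum(mu[n - 1] * pi[n] for n in range(1, len(pi)))
    power = sum(e[n - 1] * pi[n] for n in range(1, len(pi)))
    return throughput / power

pi = stationary_dist(2.0, [1.0, 2.0])
# (3.5) and (3.6) agree: the throughput equals lam * (1 - pi(B)).
assert abs((1.0 * pi[1] + 2.0 * pi[2]) - 2.0 * (1 - pi[2])) < 1e-12
```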
3.4 Optimality for Insensitive Job Assignments
Rosberg et al. [21] proposed an algorithm for an optimal policy that maximizes the energy efficiency of a server farm, defined in Section 3.2, among all insensitive stationary job-assignment policies based on the concept of a symmetric queue.
In this section, we briefly introduce the algorithm proposed in [21], and then refine it so that its computational complexity is guaranteed to be polynomial in the scale of the server farm.
We provide the pseudo-code of the algorithm in Appendix A.2; the implementation, including source code and instructions, is given in [78]. Here, we first explain the idea and theoretical basis for constructing this algorithm in Section 3.4.1 and complete the implementation details of the algorithm that are not described in [21]. We then prove, in Section 3.4.2, that the computational complexity of the algorithm is polynomial in the number of servers and the buffer sizes of the server farm. Although the polynomial algorithm provides a useful benchmark in research experiments, its computing time does not scale to applications in a real server farm with thousands of servers, as we will demonstrate by numerical results in Section 3.4.2.
3.4.1 Algorithm
By Proposition 2.1 in Chapter 2, maximizing the energy efficiency is equivalent to maximizing the $\varepsilon$-revised reward given by equation (2.8), where the optimal value $\varepsilon^*$ is characterized by (2.9).
Based on the mapping from the server farm to a logically-combined queue $Q^\phi$, a state vector $\mathbf{n} \in \mathcal{N}$ of the server farm can be mapped to a state $n = |\mathbf{n}|$, $n \in \mathcal{N}$, of the logically-combined queue. Thus, in the insensitive model defined in Section 3.3, set $V^\varepsilon(n) = V^\varepsilon(\mathbf{n}, \mu, e)$, where $n = |\mathbf{n}|$. For clarity of presentation, in the insensitive job-assignment problem with jockeying, we rewrite (2.8) as
\[ V^\varepsilon(n) - V^\varepsilon(n+1) = \sup_{\mathcal{T}^\phi(n)} \frac{1}{\lambda} \Big( \mu^\phi(n) \big( V^\varepsilon(n-1) - V^\varepsilon(n) \big) + \mu^\phi(n) - \varepsilon e^\phi(n) \Big), \quad (3.9) \]
for $n = 2, 3, \ldots, \tilde{B}_K - 1$, where $V^\varepsilon(1) - V^\varepsilon(0) = 0$.
With
\[ Y^\varepsilon(n) = V^\varepsilon(n-1) - V^\varepsilon(n), \quad n = 1, 2, \ldots, \tilde{B}_K, \]
we can rewrite (3.9) as
\[ Y^\varepsilon(n+1) = \sup_{\mathcal{T}^\phi(n)} \frac{1}{\lambda} \Big( \mu^\phi(n)\, Y^\varepsilon(n) + \mu^\phi(n) - \varepsilon e^\phi(n) \Big). \quad (3.10) \]
The resulting sequence of sets of busy servers $\mathcal{T}^{\text{I-J-OPT}}(n)$, $n = 1, 2, \ldots, \tilde{B}_K$, forms the optimal insensitive job-assignment stationary policy I-J-OPT.
As in [21], the bisection method can be used to find $\varepsilon^*$. According to (2.10) in the proof of Proposition 2.1, the cumulative $\varepsilon$-revised reward $U^{\varepsilon,0}(\mathbf{n}, \mu, e)$ is, for each $\mu, e \in \mathbb{R}$, monotonically decreasing in $\varepsilon$. We start from a lower and an upper bound on $\varepsilon^*$, say $\varepsilon_1$ and $\varepsilon_2$, which are usually set to $0$ and the energy efficiency under any $\phi \in \Phi_s$, respectively, and then repeatedly evaluate the corresponding $U^{(\varepsilon_1 + \varepsilon_2)/2,\,0}(\mathbf{0}, \mu, e)$ by the Bellman equation until $|\varepsilon_1 - \varepsilon_2| < \delta$, where $\delta > 0$ is a given small positive real. In the program available online [78], $\delta = 10^{-30}$; readers interested in the implementation details are referred to the user instructions and source code available in [78].
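The bisection loop can be sketched as follows; here `U` stands in for the cumulative $\varepsilon$-revised reward $U^{\varepsilon,0}(\mathbf{0},\mu,e)$, which is monotonically decreasing in $\varepsilon$. The function, names and tolerance below are illustrative, not the thesis implementation in [78]:

```python
def bisect_eps_star(U, eps1, eps2, delta=1e-12):
    """Bisection for eps*, given that U is monotonically decreasing
    in eps and changes sign at eps*; eps1 < eps* < eps2 initially."""
    while abs(eps2 - eps1) >= delta:
        mid = 0.5 * (eps1 + eps2)
        if U(mid) > 0:
            eps1 = mid  # reward still positive: eps* lies above mid
        else:
            eps2 = mid
    return 0.5 * (eps1 + eps2)

# Toy monotone-decreasing reward with root at 0.7.
assert abs(bisect_eps_star(lambda eps: 0.7 - eps, 0.0, 2.0) - 0.7) < 1e-9
```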
Proposition 3.1. For the system defined in Section 3.2, if the job sizes are exponentially distributed, then a policy that maximizes the energy efficiency among all insensitive policies is also optimal among all stationary policies.
Proof. The proof is given in Appendix A.3.
3.4.2 Polynomial Computational Complexity
Finding an optimal policy reduces to solving the right-hand side of (2.8). We propose details for implementing dynamic programming for problem (2.8).
For clarity of presentation, we define the set of all available sets of busy servers for state $n \in \mathcal{N}$ as $\mathscr{T}(n) = \{\mathcal{T} : \mathcal{T} \subseteq \mathcal{K}\}$, and let
\[ y^{\mathcal{T},\varepsilon}(n) = \frac{1}{\lambda} \sum_{j \in \mathcal{T}} \big[ (Y^\varepsilon(n) + 1)\mu_j - \varepsilon e_j \big], \quad (3.11) \]
for a set of busy servers $\mathcal{T} \in \mathscr{T}(n)$. Then, (3.10) can be rewritten as
\[ Y^\varepsilon(n+1) = \sup_{\mathcal{T} \in \mathscr{T}(n)} y^{\mathcal{T},\varepsilon}(n). \quad (3.12) \]
For any policy $\phi \in \Phi_s$, $n \in \mathcal{N}$, $\mathcal{T}, \mathcal{T} \cup \{j\} \subseteq \mathcal{K}$ and $j \notin \mathcal{T}$, the increment of $y^{\mathcal{T},\varepsilon}(n)$ for adding $j$ to the busy set $\mathcal{T}$ is given by
\[ y^{\mathcal{T} \cup \{j\},\varepsilon}(n) - y^{\mathcal{T},\varepsilon}(n) = \frac{1}{\lambda} \big[ (Y^\varepsilon(n) + 1)\mu_j - \varepsilon e_j \big], \quad (3.13) \]
whose right-hand side is independent of $\mathcal{T}$. We therefore select the servers $j$ with the highest values of the right-hand side of (3.13) to comprise the set of busy servers $\mathcal{T}$, which leads to a higher value of the resulting $y^{\mathcal{T},\varepsilon}(n)$. We obtain an order of servers, say $j_1, j_2, \ldots, j_K$, which is descending in the value of the right-hand side of (3.13). Then, we define a set of integers, $\mathcal{L}^\varepsilon(n) = \{l : \{j_1, j_2, \ldots, j_l\} \in \mathscr{T}(n)\}$, such that there exists an $l^*$ defined as
\[ l^* = \arg\max_{l \in \mathcal{L}^\varepsilon(n)} y^{\{j_1, j_2, \ldots, j_l\},\varepsilon}(n). \quad (3.14) \]
To break ties, we select the smallest $l^*$ without loss of generality. An optimal solution is given by
\[ \mathcal{T}^{\text{I-J-OPT}}(n) = \{j_1, j_2, \ldots, j_{l^*}\}. \quad (3.15) \]
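Because the increment (3.13) does not depend on $\mathcal{T}$, the supremum in (3.12) can be found by sorting. The sketch below is our own illustration of this observation, not the pseudo-code of Appendix A.2:

```python
def best_busy_set(servers, Y, eps, lam, feasible_sizes):
    """Maximize (3.12): sort servers by the per-server gain of (3.13)
    and scan the feasible prefix lengths l in L^eps(n), breaking ties
    toward the smallest l*.  servers: list of (mu_j, e_j); Y: Y^eps(n)."""
    gain = lambda s: (Y + 1.0) * s[0] - eps * s[1]
    order = sorted(servers, key=gain, reverse=True)
    # max() returns the first maximizer, so sorting the sizes keeps
    # the smallest optimal l*.
    best_l = max(sorted(feasible_sizes),
                 key=lambda l: sum(gain(s) for s in order[:l]) / lam)
    return order[:best_l]

busy = best_busy_set([(1.0, 1.0), (2.0, 1.0), (0.5, 2.0)],
                     Y=0.0, eps=1.0, lam=1.0, feasible_sizes={1, 2, 3})
assert busy == [(2.0, 1.0)]
```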
Following this approach, the computational complexity of the algorithm for I-J-OPT is $O(D(K)\, \tilde{B}_K \log((\varepsilon_2 - \varepsilon_1)/\delta))$, where $\varepsilon_2$ and $\varepsilon_1$ denote the initial upper and lower bounds on $\varepsilon^*$ for the bisection, $\delta$ is a parameter for the accuracy of the bisection, and $O(D(K))$ is the time complexity of ordering the $K$ servers based on the value of (3.13), which is $O(K)$ for the cubic power functions assumed in [21], and $O(K \log K)$ for general power functions.
Table 3.2 The running times of the algorithm for I-J-OPT

K                 10    60    110    210    310     410     460
Running Time (s)  <1    20    97     680    1935    4316    6177
Despite its polynomial complexity, which makes it a useful benchmark in research experiments, the algorithm for I-J-OPT is not scalable to realistic applications involving server farms with thousands of servers, as illustrated below. We begin with a set of small examples where $B_j = 100$, $j = 1, 2, \ldots, K$, and $K$ ranges between 10 and 460. The running times of the algorithm for I-J-OPT for these examples are presented in Table 3.2. These results are based on an implementation of the algorithm in C++ on an iMac with a 2.7 GHz Intel Core i5, 8.00 GB of installed memory, and OS X 10.9.2 as the operating system. A polynomial fit of the data in Table 3.2, performed with OriginPro 8 SR3 v8.0932, approximates the running time as a function of $K$ by
\[ 0.428K^2 - 7.245K + 239.79 \ \text{(s)}. \]
This implies around 420,994.79 seconds (4.87 days) for $K = 1{,}000$. The algorithm for I-J-OPT is therefore not sufficiently scalable.
When $B_j = B$ is a constant for all $j$ and the power function follows the cubic assumption, i.e., $e_j = \mu_j^3$, the time complexity for ordering, $O(D(K))$, is $O(K)$, and the approximate time complexity observed in the experiments above is consistent with our previous result.
3.5 Jockeying Job-Assignment Heuristics
3.5.1 MEESF
MEESF is a straightforward heuristic that greedily chooses the most energy-efficient available server for job assignment. The SSF algorithm proposed in [21] is the special case of MEESF in which the energy consumption rate of each server $j$ satisfies $e_j = \mu_j^3$. It can be shown that, when all servers in the system have the same buffer size, i.e., $B_1 = B_2 = \ldots = B_K$, SSF maximizes the ratio of instantaneous job departure rate to instantaneous energy consumption rate. In general, however, this is not the case. We shall see in Section 3.5.4 that MEESF does not necessarily maximize the ratio of long-run average job departure rate (i.e., job throughput) to long-run average energy consumption rate (i.e., power consumption). This observation motivates us to design the more robust E* policy.
3.5.2 E*
We design the E* policy so that the $\hat{K}$ most energy-efficient servers are aggregated to form a virtual server.
Proposition 3.2. For the system defined in Section 3.2, consider a virtual server obtained by aggregating the $\hat{K}$ most energy-efficient servers in the system. If its energy efficiency is defined by the ratio $\sum_{j=1}^{\hat{K}} \mu_j \big/ \sum_{j=1}^{\hat{K}} e_j$, then the virtual server is more energy-efficient than any server $k \geq \hat{K} + 1$, i.e.,
\[ \frac{\sum_{j=1}^{\hat{K}} \mu_j}{\sum_{j=1}^{\hat{K}} e_j} > \frac{\mu_k}{e_k}, \quad k = \hat{K} + 1, \hat{K} + 2, \ldots, K, \quad (3.16) \]
if there exists at least one pair of servers $u$ and $v$, where $1 \leq u < v \leq \hat{K} + 1$, such that $\mu_u/e_u > \mu_v/e_v$.
Proof. The proof is straightforward and omitted here.
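Proposition 3.2 can be checked numerically; the rates below are made up, already labeled in decreasing $\mu_j/e_j$:

```python
def virtual_server_eff(mu, e, k_hat):
    # Energy efficiency of the virtual server aggregating the k_hat
    # most energy-efficient servers (left-hand side of (3.16)).
    return sum(mu[:k_hat]) / sum(e[:k_hat])

mu = [2.0, 1.0, 0.5, 0.25]  # illustrative rates, mu_j/e_j decreasing
e = [2.0, 2.0, 2.0, 2.0]
k_hat = 2
for k in range(k_hat, len(mu)):
    # (3.16): the aggregate beats every server behind it.
    assert virtual_server_eff(mu, e, k_hat) > mu[k] / e[k]
```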
38
3.5 Jockeying Job-Assignment Heuristics
In particular, the decision variable $\hat{K}$ provides a degree of freedom for E* to fine-tune its performance: the objective of E* is to determine, within the range $\{2, 3, \ldots, K\}$, an optimal value of $\hat{K}$, referred to as $\hat{K}^*$, which maximizes the energy efficiency.
3.5.3 Rate-Matching
Our proposed rule of thumb, RM, for determining the value $\hat{K}^*$ of E* simply picks the integer $\hat{K}_{\text{RM}}$ for which the aggregated service rate of the virtual server is the largest that is still no greater than the average job arrival rate. More specifically, $\hat{K}_{\text{RM}}$ is chosen to be the largest $\hat{K}$ satisfying $\sum_{j=1}^{\hat{K}} \mu_j \leq \lambda$.
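The RM rule is a one-pass scan over the cumulative service rates; a minimal sketch (the function name is ours):

```python
def k_rm(mu, lam):
    """Largest K-hat with mu_1 + ... + mu_{K-hat} <= lam, where mu is
    labeled in decreasing energy efficiency (the RM rule); returns 0
    if even the most energy-efficient server alone exceeds lam."""
    total, k_hat = 0.0, 0
    for j, rate in enumerate(mu, start=1):
        total += rate
        if total > lam:
            break
        k_hat = j
    return k_hat

# E.g. with lam equal to the sum of the six most efficient servers'
# rates (as in the Fig. 3.2 example), the rule returns 6.
assert k_rm([1.0] * 10, 6.0) == 6
```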
We provide an intuition for this simple rule of thumb with the help of the numerical results presented in Fig. 3.2. In this particular example, we have ten servers. The server speeds and the energy consumption rates are randomly generated. The job arrival rate is set to the sum of the service rates of the six most energy-efficient servers. The results in Fig. 3.2 are presented as the relative difference between E* and MEESF in terms of each corresponding performance measure, namely, energy efficiency in Fig. 3.2a, job throughput in Fig. 3.2b, and power consumption in Fig. 3.2c.
On the basis of these results in Fig. 3.2, we argue that selecting a value of $\hat{K}$ for E* other than $\hat{K}_{\text{RM}}$ is likely to reduce the energy efficiency, for the following reasons.
• On one hand, if we choose a value of $\hat{K}$ larger than $\hat{K}_{\text{RM}}$, the excess service capacity made available by the virtual server can only increase the job throughput marginally, but can substantially increase the power consumption, since it now uses more of the less energy-efficient servers. As a result, the energy efficiency (defined in our context as the ratio of job throughput to power consumption) is likely to decrease with an increasing value of $\hat{K}$ in the range $\{\hat{K}_{\text{RM}}, \hat{K}_{\text{RM}} + 1, \ldots, K\}$.
• On the other hand, if we choose a value of $\hat{K}$ smaller than $\hat{K}_{\text{RM}}$, the aggregated service rate of the virtual server is not high enough to support the input traffic in the long run. As a result, incoming jobs will queue up and we are forced to use more of the less energy-efficient servers to serve the backlog. This again increases power consumption. Since it can be shown that the job throughput decreases with decreasing $\hat{K}$, the energy efficiency is also likely to decrease with a decreasing value of $\hat{K}$ in the range $\{\hat{K}_{\text{RM}}, \hat{K}_{\text{RM}} - 1, \ldots, 2\}$.
3.5.4 Insensitive MEESF Policy
The insensitive MEESF policy specifies the set of servers $\mathcal{T}^{\text{MEESF}}(n)$ designated for serving existing jobs in the system at state $n$ to be
\[ \mathcal{T}^{\text{MEESF}}(n) = \{1, \ldots, i\}, \quad 1 \leq n \leq \tilde{B}_K, \quad (3.17) \]
where $i \leq K$ is the smallest integer satisfying $\sum_{j=1}^{i} B_j \geq n$. The server sets specified in (3.17) indeed define the MEESF policy, whereby the available servers that are most energy-efficient are designated for service in state $n$.
The position mapping $Q^{\text{MEESF}}$ of MEESF is defined iteratively as follows. The $B_1$ positions of server 1 (the most energy-efficient server) are mapped to the first $B_1$ positions of the logically-combined queue, which are associated with server 1. The $B_2$ positions of server 2 (the second most energy-efficient server) are mapped to the following $B_2$ positions of the logically-combined queue, which are associated with server 2. The procedure continues until the $B_K$ positions of server $K$ have been mapped.
Fig. 3.2 Illustration of optimizing $\hat{K}$ for the E* policy: relative difference (%) between E* and MEESF versus $\hat{K}$ in terms of (a) energy efficiency, (b) job throughput, and (c) power consumption.
When all servers in the system have the same buffer size, i.e., $B_1 = B_2 = \ldots = B_K$, it can be shown that SSF maximizes the ratio of instantaneous job departure rate to instantaneous energy consumption rate.
Proposition 3.3. For the system defined in Section 2.1, among all insensitive jockeying policies $\phi \in \Phi_s$, if $B_1 = B_2 = \ldots = B_K$ and the power function follows the cubic assumption, i.e., $e_j = \mu_j^3$ for all $j = 1, 2, \ldots, K$, then MEESF maximizes the ratio of instantaneous job departure rate to instantaneous energy consumption rate,
namely,
\[ \frac{\sum_{j \in \mathcal{T}^{\text{MEESF}}(n)} \mu_j}{\sum_{j \in \mathcal{T}^{\text{MEESF}}(n)} e_j} \geq \frac{\sum_{j \in \mathcal{T}^\phi(n)} \mu_j}{\sum_{j \in \mathcal{T}^\phi(n)} e_j} \quad (3.18) \]
for any $\phi \in \Phi_s$ and $n \in \mathcal{N}$.
Proof. Let $L(\mathbf{n})$ represent the set of busy servers at state $\mathbf{n}$, and assume without loss of generality that the servers in the system have been ordered in descending order of their energy efficiencies. For a given number of jobs in the system $n$, we use $S^\phi(n)$ to represent the set of all reachable state vectors $\mathbf{n}$ with $|\mathbf{n}| = \sum_{j=1}^{K} n_j = n$ under policy $\phi$. In particular, according to the definition of MEESF, in our system with jockeying, $S^{\text{MEESF}}(n)$ contains only one element. For clarity of presentation, we denote the only state vector in $S^{\text{MEESF}}(n)$ by $\mathbf{n}^{\text{MEESF}}(n)$.
We define the complement set $\bar{L}(\mathbf{n}) = \{j \notin L(\mathbf{n}),\ j = 1, 2, \ldots, K\}$. Since $B_1 = B_2 = \ldots = B_K = b$, the minimal number of busy servers for a given number of jobs in the system $n$ is $\lceil n/b \rceil$, which is equal to the number of elements in $L(\mathbf{n}^{\text{MEESF}}(n))$. Hence, for any policy $\phi$ and any state vector $\mathbf{n}' \in S^\phi(n)$, the number of elements in $\bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')$ is no less than that in $L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')$. Assume, without loss of generality, $\bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}') = \{i_1, i_2, \ldots, i_m\}$ and $L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}') = \{j_1, j_2, \ldots, j_{m'}\}$, $m \geq m'$. Also, according to the definition of MEESF, for any $j \in L(\mathbf{n}^{\text{MEESF}}(n))$ and $j' \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n))$, we have $\mu_j \leq \mu_{j'}$.
Under the cubic assumption, for any policy $\phi$ and any state vector $\mathbf{n}' \in S^\phi(n)$, for any $i \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')$ and $j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')$ satisfying $\mu_i \neq \mu_j$, we obtain
\[ \frac{\mu_i - \mu_j}{e_i - e_j} = \frac{1}{\mu_i^2 + \mu_j^2 + \mu_i \mu_j} \leq \frac{1}{\mu_i^2} = \frac{\mu_i}{e_i}, \quad (3.19) \]
and
\[ \frac{\mu_i}{e_i} \leq \frac{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} \mu_j}{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} e_j}. \quad (3.20) \]
Inequality (3.20) can be shown to hold by induction on the number of elements in $L(\mathbf{n}^{\text{MEESF}}(n))$.
Based on (3.19), (3.20) and the cubic assumption, if
\[ \sum_{j \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')} e_j \neq \sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')} e_j, \quad (3.21) \]
then
\[ \frac{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} \mu_j}{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} e_j} \geq \frac{\sum_{k=1}^{m'} (\mu_{i_k} - \mu_{j_k}) + \sum_{k=m'+1}^{m} \mu_{i_k}}{\sum_{k=1}^{m'} (e_{i_k} - e_{j_k}) + \sum_{k=m'+1}^{m} e_{i_k}} = \frac{\sum_{j \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')} \mu_j - \sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')} \mu_j}{\sum_{j \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')} e_j - \sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')} e_j} \geq 0. \quad (3.22) \]
Therefore, if (3.21) holds,
\[ \frac{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} \mu_j}{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} e_j} \geq \frac{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} \mu_j + \sum_{j \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')} \mu_j - \sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')} \mu_j}{\sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n))} e_j + \sum_{j \in \bar{L}(\mathbf{n}^{\text{MEESF}}(n)) \cap L(\mathbf{n}')} e_j - \sum_{j \in L(\mathbf{n}^{\text{MEESF}}(n)) \setminus L(\mathbf{n}')} e_j} = \frac{\sum_{j \in L(\mathbf{n}')} \mu_j}{\sum_{j \in L(\mathbf{n}')} e_j}. \quad (3.23) \]
If (3.21) does not hold, then (3.18) holds straightforwardly. Hence, for any reachable state vector $\mathbf{n}$, (3.18) holds. This proves the proposition.
Table 3.3 The counter-example for the instantaneous optimality of MEESF

j    μ_j     e_j    μ_j/e_j          B_j
1    1       2      0.5              2
2    1       4      0.25             2
3    0.25    1.1    5/22 (< 0.25)    4
The power of three used here in the power function is not a very restrictive assumption; it is used only for simplicity of exposition to illustrate the instantaneous energy-efficiency attribute of the MEESF policy.
In general, however, this is not the case. We give a counter-example for the instantaneous optimality of MEESF in the general case with heterogeneous buffer sizes and without the cubic assumption, as follows.
Consider a system with $K = 3$ and the other parameters as given in Table 3.3. When there are four jobs in the system, MEESF will use the first two servers, and the ratio of the instantaneous service rate to the instantaneous energy consumption rate (the instantaneous energy efficiency) of the system is $1/3 \approx 0.3333$. If instead we use servers 1 and 3 to serve these jobs, the ratio becomes $25/62 \approx 0.4032 > 1/3$.
Moreover, if we consider the state with six jobs in the system, then all three servers will be busy under MEESF, and the instantaneous energy efficiency is $2.25/7.1 \approx 0.3169$. However, if we use servers 1 and 3, which have sufficient capacity for the six jobs since $B_1 + B_3 = 6$, then the instantaneous energy efficiency is $25/62 \approx 0.4032 > 2.25/7.1$.
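The counter-example of Table 3.3 is easy to verify directly:

```python
# Parameters of Table 3.3.
mu = {1: 1.0, 2: 1.0, 3: 0.25}
e = {1: 2.0, 2: 4.0, 3: 1.1}

def inst_eff(busy):
    # Instantaneous energy efficiency of a busy-server set.
    return sum(mu[j] for j in busy) / sum(e[j] for j in busy)

# Four jobs: MEESF uses {1, 2}; using {1, 3} is strictly better.
assert inst_eff({1, 2}) < inst_eff({1, 3})
# Six jobs: MEESF uses all three; {1, 3} (B_1 + B_3 = 6) is still better.
assert inst_eff({1, 2, 3}) < inst_eff({1, 3})
```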
MEESF does not necessarily maximize the ratio of long-run average job departure rate (i.e., job throughput) to long-run average energy consumption rate (i.e., power consumption). This observation motivates us to design the more robust E* policy.
In [21], a special case of MEESF under the cubic power function, referred to as the Slowest-Server-First (SSF) policy, has been numerically demonstrated to be close to an optimal solution that maximizes the energy efficiency of the system.

Fig. 3.3 Energy efficiency of E* and MEESF versus arrival rate: (a) relative difference to the optimal solution; (b) relative difference of E* to MEESF.
Here, in Fig. 3.3, we present results for both the MEESF and E* policies showing the relative difference of energy efficiency versus the average arrival rate of the system. In Fig. 3.3a, the y-axis represents the relative difference of each policy to the optimal solution obtained by the algorithm in [21] under the cubic assumption. Correspondingly, Fig. 3.3b gives the relative difference of energy efficiency of E* to MEESF for the same run of the experiment. For clarity, for policies $\phi_1$ and $\phi_2$, we define
\[ \Gamma(\phi_1, \phi_2) = \frac{\Lambda^{\phi_1}/E^{\phi_1} - \Lambda^{\phi_2}/E^{\phi_2}}{\Lambda^{\phi_2}/E^{\phi_2}}. \quad (3.24) \]
Then, the relative difference of energy efficiency of E* to MEESF is $\Gamma(\text{E*}, \text{MEESF})$, and the relative difference of energy efficiency of policy $\phi$ to the optimal solution is $\Gamma(\phi, \text{OPT})$, where OPT represents the optimal solution.
In Fig. 3.3, the parameters are set as $K = 50$, $\mu_j = 0.1j$, $e_j = \mu_j^3$, and $B_j = 10$, $j = 1, 2, \ldots, K$. In Figs. 3.3a and 3.3b, it can be observed that both the MEESF and E* policies are close to the optimal solution, with relative differences smaller than 1.2% and 0.3%, respectively, and that E* always outperforms MEESF. These two observations are consistent with the results in [21] and the argument given in this chapter. In other words, MEESF approximates an optimal solution under the cubic assumption, and in that case E* still outperforms MEESF in terms of energy efficiency.

Fig. 3.4 Energy efficiency of E* and MEESF versus buffer size: (a) relative difference of E* and MEESF to the optimal solution; (b) relative difference of E* to MEESF.
In Fig. 3.4, with the same parameters except that $\rho$ is fixed to 0.8, we present the relationship between the relative energy efficiencies of E* and MEESF and the buffer size of each server, where the buffer size is the same for all servers. Fig. 3.4a gives the relative difference of E* and MEESF to the optimal solution, and Fig. 3.4b gives the relative difference of energy efficiency of E* to MEESF. The trends of the curves in Fig. 3.4 are similar to those in Fig. 3.3; namely, MEESF is close to the optimal solution and E* outperforms MEESF at all points in the figures.
In Fig. 3.5, we also present the cumulative distribution of the relative difference of energy efficiency of E* to MEESF. With $K = 50$, we set $e_j = \mu_j^3$, $j = 1, 2, \ldots, K$. The buffer sizes $B_1, B_2, \ldots, B_K$ are uniformly randomly generated from $\{10, 11, \ldots, 15\}$, while the service rates $\mu_1, \mu_2, \ldots, \mu_K$ are uniformly randomly generated from $[0.1, 10]$. Here, MEESF is equivalent to the SSF policy proposed in [21].
Fig. 3.5 Cumulative distribution of the relative difference of energy efficiency of E* to MEESF: (a) $\rho = 0.4$; (b) $\rho = 0.6$; (c) $\rho = 0.8$.
In Fig. 3.5, we observe that the superiority of E* over MEESF is limited in these cubic-assumption cases, which is consistent with the argument in [21]. Nevertheless, E* achieves higher energy efficiency than MEESF in all cases with randomly generated parameters under the cubic assumption. In addition, for the three traffic intensities (i.e., the ratios of the average arrival rate to the maximal total service rate of the system), the cumulative distribution is stable. In other words, the relative performance of E* to MEESF is not very sensitive to the normalized system load when the power function follows the cubic assumption.
Fig. 3.6 Cumulative distribution of the relative difference of energy efficiency of E* to MEESF for Cases 1–3: (a) $\rho = 0.4$; (b) $\rho = 0.6$; (c) $\rho = 0.8$.
3.5.5 Insensitive E* Policy
The cubic assumption in [21] is far from realistic. In the following, we consider a more general scenario by dropping the constraint of a cubic power function. We argue that, although MEESF approximates an optimal solution in the cubic-assumption case, its performance in terms of energy efficiency can be significantly improved upon in a general case with independently randomly generated service rates and energy consumption rates.
In Fig. 3.7, the cumulative distributions of the relative difference of energy efficiency are presented.

Fig. 3.7 Cumulative distribution of the relative difference of power consumption of E* to MEESF for Cases 1–3: (a) $\rho = 0.4$; (b) $\rho = 0.6$; (c) $\rho = 0.8$.

In Fig. 3.7, there are 10 server groups, where the servers in the same server group have the same service rate, power consumption and PS buffer size, denoted by $\mu_i$, $e_i$ and $B_i$, $i = 1, 2, \ldots, 10$, respectively. We uniformly randomly generate $\mu_i$, $i = 1, 2, \ldots, 10$, and the ratios $r_i$, $i = 2, 3, \ldots, 10$, from $[0.1, 10]$ and $[0.1, 1]$, respectively, where the service rates are ordered descendingly. With $\mu_1/e_1 = 100$, the energy efficiency of server group $i$ is set to $\mu_i/e_i = r_i \mu_{i-1}/e_{i-1}$, $r_i^{1.2} \mu_{i-1}/e_{i-1}$, and $r_i^{1.4} \mu_{i-1}/e_{i-1}$, $i = 2, 3, \ldots, 10$, for Case 1, Case 2, and Case 3 in Fig. 3.7, respectively. These three cases represent three levels of server heterogeneity. The buffer sizes $B_i$, $i = 1, 2, \ldots, 10$, are also uniformly randomly generated from $\{10, 11, \ldots, 15\}$. These settings can be justified in that a newer server is likely to
49
Job-Assignment Heuristics with Jockeying
have a higher service rate and higher energy efficiency. Also, a server-farm vendor would likely purchase many servers of the same brand and model at any given time, and those servers would behave identically.
Compared with Fig. 3.5, obtained under the cubic power function, the relative differences in Fig. 3.7 are much more significant; that is, E* substantially improves the energy efficiency of MEESF when the cubic power function is not assumed. From Fig. 3.7, the degradation of MEESF in terms of energy efficiency can be more than 100%. In Fig. 3.7c, in around 27% of the experiments for Case 3, E* improves on MEESF by more than 10% in terms of energy efficiency.
In Fig. 3.7, from Case 1 to Case 3, as the server heterogeneity increases, the superiority of E* over MEESF also increases. Recalling our proof, if all the servers were identical, then MEESF would be equivalent to E* in terms of energy efficiency. We therefore argue that increasing heterogeneity among servers leads to increasing superiority of E* over MEESF. This argument about the trend of energy efficiency versus server heterogeneity is consistent with the spirit of the proof.
According to Fig. 3.7, the proportion of experiments in which MEESF achieves higher energy efficiency or lower power consumption than E* is negligible. We therefore claim that E* usually outperforms MEESF, and can significantly improve on MEESF's performance with respect to energy efficiency and power consumption with randomly generated parameters. In addition, by varying the normalized system load, we observe that the superiority of E* over MEESF increases as the normalized system load $\rho$ increases.
The insensitive E* policy is derived by specifying the set of servers $\mathcal{T}^{\text{E*}}(n)$ designated for serving the existing jobs in the system at state $n$ as
\[ \mathcal{T}^{\text{E*}}(n) = \begin{cases} \{1, \ldots, \min(n, \hat{K})\}, & 1 \leq n \leq \tilde{B}_{\hat{K}} - 1, \\ \{1, \ldots, i\}, & \tilde{B}_{\hat{K}} \leq n \leq \tilde{B}_K, \end{cases} \quad (3.25) \]
where $i \leq K$ is the smallest integer satisfying $\sum_{j=1}^{i} B_j \geq n$. The server sets specified in (3.25) indeed define the E* policy, in which preference is always given to the virtual server so as to maximally utilize its service capacity in any state $n$.
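The two branches of (3.25) can be sketched as follows (the helper name is ours, and the buffer sizes are illustrative):

```python
def estar_busy_set(buffers, k_hat, n):
    """T^E*(n) of (3.25): below B-tilde_{K-hat} jobs, only the virtual
    server {1,...,min(n, K-hat)} is used; from there on, fall back to
    the smallest prefix {1,...,i} with B_1 + ... + B_i >= n."""
    b_virtual = sum(buffers[:k_hat])  # B-tilde_{K-hat}
    if n <= b_virtual - 1:
        return list(range(1, min(n, k_hat) + 1))
    total = 0
    for i, b in enumerate(buffers, start=1):
        total += b
        if total >= n:
            return list(range(1, i + 1))
    raise ValueError("n exceeds the total buffer space")

# Illustrative buffers: the virtual server (k_hat = 2) holds 4 slots.
assert estar_busy_set([2, 2, 4], 2, 3) == [1, 2]     # virtual-server branch
assert estar_busy_set([2, 2, 4], 2, 5) == [1, 2, 3]  # fallback branch
```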
The position mapping $Q^{E^*}$ of E* is defined iteratively as follows. In the first
iteration, the first buffer positions of servers $1, 2, \dots, \hat{K}$ are mapped to the first $\hat{K}$
positions of the logically-combined queue in the order of the server labels $1, 2, \dots, \hat{K}$,
inheriting their original server speeds and energy consumption rates. In every
subsequent iteration, until all the positions of the first $\hat{K}$ servers have been mapped,
the next remaining position levels from the remaining buffers, say $m \le \hat{K}$ positions,
are mapped to the next $m$ positions of the logically-combined queue in the order
of the server labels. The $B_{\hat{K}+1}$ positions of server $\hat{K}+1$ are then mapped to the
next $B_{\hat{K}+1}$ positions of the logically-combined queue, which are associated with server
$\hat{K}+1$. The $B_{\hat{K}+2}$ positions of server $\hat{K}+2$ are mapped to the next $B_{\hat{K}+2}$ positions of
the logically-combined queue, which are associated with server $\hat{K}+2$. The iterations
terminate when all the positions of all server buffers have been mapped.
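The iterative mapping just described can be sketched as follows (a minimal illustration; the returned list gives, for each position of the logically-combined queue, the (server, buffer-level) pair it inherits):

```python
def position_mapping(B, K_hat):
    """Order of (server, level) pairs in the logically-combined queue.

    Levels of servers 1..K_hat are interleaved round by round in server-label
    order; the buffers of servers K_hat+1..K are then appended whole.
    """
    K = len(B)
    order = []
    for level in range(1, max(B[:K_hat]) + 1):   # iterations over buffer levels
        for j in range(1, K_hat + 1):            # server-label order
            if B[j - 1] >= level:                # only buffers with positions left
                order.append((j, level))
    for j in range(K_hat + 1, K + 1):            # tail servers, whole buffers
        for level in range(1, B[j - 1] + 1):
            order.append((j, level))
    return order

# Example: B = [2, 3, 1], K_hat = 2: first levels of servers 1 and 2, then the
# second levels, then server 2's third level, and finally server 3's buffer.
print(position_mapping([2, 3, 1], 2))
# -> [(1, 1), (2, 1), (1, 2), (2, 2), (2, 3), (3, 1)]
```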
Here, we provide a rigorous analysis of the E* policy. First, we show that E*
is guaranteed to outperform MEESF in terms of job throughput. Then, we derive
conditions under which E* is guaranteed to outperform MEESF in terms of energy
efficiency.
Remark 3.2. By the nature of the two policies, we have $\mu^{E^*}(n) \ge \mu^{\text{MEESF}}(n)$ for
$1 \le n \le \tilde{B}_K$.
Proposition 3.4. For the stochastic job assignment problem studied in this chapter,
we have
\[
\Lambda^{E^*} \ge \Lambda^{\text{MEESF}}. \tag{3.26}
\]
Proof. Using (3.6), we obtain the job throughput of the system under E* and that
under MEESF as
\[
\Lambda^{E^*} = \lambda \left[ 1 - \pi^{E^*}(\tilde{B}_K) \right] \tag{3.27}
\]
and
\[
\Lambda^{\text{MEESF}} = \lambda \left[ 1 - \pi^{\text{MEESF}}(\tilde{B}_K) \right], \tag{3.28}
\]
respectively. From (3.4), we derive for E* that
\[
\pi^{E^*}(n) = \prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{E^*}(i)}{\lambda}\, \pi^{E^*}(\tilde{B}_K), \quad 0 \le n \le \tilde{B}_K - 1. \tag{3.29}
\]
By normalization, we have
\[
\sum_{n=0}^{\tilde{B}_K - 1} \prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{E^*}(i)}{\lambda}\, \pi^{E^*}(\tilde{B}_K) + \pi^{E^*}(\tilde{B}_K) = 1, \tag{3.30}
\]
hence
\[
\pi^{E^*}(\tilde{B}_K) = \frac{1}{\sum_{n=0}^{\tilde{B}_K - 1} \prod_{i=n+1}^{\tilde{B}_K} \mu^{E^*}(i)/\lambda + 1}. \tag{3.31}
\]
Likewise, for MEESF, we get
\[
\pi^{\text{MEESF}}(\tilde{B}_K) = \frac{1}{\sum_{n=0}^{\tilde{B}_K - 1} \prod_{i=n+1}^{\tilde{B}_K} \mu^{\text{MEESF}}(i)/\lambda + 1}. \tag{3.32}
\]
It follows from Remark 3.2 that
\[
\prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{E^*}(i)}{\lambda} \ge \prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{\text{MEESF}}(i)}{\lambda} \tag{3.33}
\]
for $0 \le n \le \tilde{B}_K - 1$. Therefore, we have
\[
\pi^{E^*}(\tilde{B}_K) \le \pi^{\text{MEESF}}(\tilde{B}_K), \tag{3.34}
\]
and (3.26) follows.
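The stationary probabilities in (3.31)-(3.32) are those of a simple birth-death chain, so the throughput ordering (3.26) is easy to check numerically. Below is a sketch, where the aggregate service-rate profiles are hypothetical example inputs:

```python
def blocking_prob(mu, lam):
    """pi(tilde-B_K) of (3.31); mu[n-1] is the aggregate rate mu(n), n = 1..tilde-B_K."""
    N = len(mu)
    s = 1.0                                  # the '+1' term (state n = tilde-B_K)
    for n in range(N):                       # n = 0, ..., tilde-B_K - 1
        prod = 1.0
        for i in range(n + 1, N + 1):        # product over i = n+1, ..., tilde-B_K
            prod *= mu[i - 1] / lam
        s += prod
    return 1.0 / s

def throughput(mu, lam):
    """Lambda = lam * [1 - pi(tilde-B_K)], as in (3.27)-(3.28)."""
    return lam * (1.0 - blocking_prob(mu, lam))

# E* never offers a smaller aggregate rate than MEESF at any state (Remark 3.2),
# so its throughput is at least as large (Proposition 3.4).
mu_meesf = [1.0, 1.0, 1.5, 1.5, 2.0]
mu_estar = [1.0, 2.0, 2.0, 2.5, 2.5]   # pointwise >= mu_meesf
assert throughput(mu_estar, 3.0) >= throughput(mu_meesf, 3.0)
```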
Now, we consider energy efficiency. For convenience, let
\[
\Pi(n) = \frac{\pi^{E^*}(n)}{\pi^{E^*}(\tilde{B}_K)}, \quad 0 \le n \le \tilde{B}_K. \tag{3.35}
\]
It is plain that
\[
\Pi(n) =
\begin{cases}
\displaystyle \prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{E^*}(i)}{\lambda}, & 0 \le n \le \tilde{B}_K - 1,\\
1, & n = \tilde{B}_K.
\end{cases}
\tag{3.36}
\]
Similarly, let
\[
\Pi'(n) = \frac{\pi^{\text{MEESF}}(n)}{\pi^{\text{MEESF}}(\tilde{B}_K)}, \quad 0 \le n \le \tilde{B}_K, \tag{3.37}
\]
so that
\[
\Pi'(n) =
\begin{cases}
\displaystyle \prod_{i=n+1}^{\tilde{B}_K} \frac{\mu^{\text{MEESF}}(i)}{\lambda}, & 0 \le n \le \tilde{B}_K - 1,\\
1, & n = \tilde{B}_K.
\end{cases}
\tag{3.38}
\]
Lemma 3.1. For $\Pi(n)$ and $\Pi'(n)$ defined by (3.36) and (3.38), respectively, we have
\[
\sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n')
- \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi'(n') \ge 0 \tag{3.39}
\]
for any two integers $x$ and $y$ such that $1 \le x < y \le K$.
Proof. Note that
\[
\sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi(n) = \sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_{y-1}} \Pi(n) + \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \tag{3.40}
\]
and
\[
\sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi'(n') = \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_{y-1}} \Pi'(n') + \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n'). \tag{3.41}
\]
Thus, proving (3.39) is equivalent to proving
\[
\sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_{y-1}} \Pi(n) \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n')
- \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_{y-1}} \Pi'(n') \ge 0. \tag{3.42}
\]
It suffices to show that, for any two integers $l$ and $m$ where $\tilde{B}_{x-1}+1 \le l \le \tilde{B}_{y-1}$
and $\tilde{B}_{y-1}+1 \le m \le \tilde{B}_K$, we have $\Pi(l)\Pi'(m) - \Pi(m)\Pi'(l) \ge 0$, or equivalently,
\[
\frac{\Pi(l)}{\Pi(m)} = \prod_{i=l+1}^{m} \frac{\mu^{E^*}(i)}{\lambda}
\;\ge\;
\frac{\Pi'(l)}{\Pi'(m)} = \prod_{i=l+1}^{m} \frac{\mu^{\text{MEESF}}(i)}{\lambda}. \tag{3.43}
\]
The inequality (3.43) clearly holds since, by Remark 3.2, we have $\mu^{E^*}(i) \ge \mu^{\text{MEESF}}(i)$ for all $i$ in the defined range.
Proposition 3.5. A sufficient condition for
\[
\frac{\Lambda^{E^*}}{\mathcal{E}^{E^*}} \ge \frac{\Lambda^{\text{MEESF}}}{\mathcal{E}^{\text{MEESF}}} \tag{3.44}
\]
to hold is that
\[
\frac{\mu_j}{e_j} = \frac{\mu_1}{e_1}, \quad j = 2, 3, \dots, \hat{K}. \tag{3.45}
\]
Proof. From (3.7), we derive for E* that
\[
\mathcal{E}^{E^*} = \sum_{n=1}^{\tilde{B}_K} e^{E^*}(n)\, \pi^{E^*}(n)
= \sum_{n=1}^{\hat{K}} \pi^{E^*}(n) \sum_{j=1}^{n} e_j
+ \sum_{n=\hat{K}+1}^{\tilde{B}_{\hat{K}}} \pi^{E^*}(n) \sum_{j=1}^{\hat{K}} e_j
+ \sum_{i=\hat{K}+1}^{K} \sum_{n=\tilde{B}_{i-1}+1}^{\tilde{B}_i} \pi^{E^*}(n) \sum_{j=1}^{i} e_j. \tag{3.46}
\]
Interchanging the summations in (3.46), we obtain
\[
\begin{aligned}
\mathcal{E}^{E^*} &= \sum_{j=1}^{\hat{K}} e_j \sum_{n=j}^{\hat{K}} \pi^{E^*}(n)
+ \sum_{j=1}^{\hat{K}} e_j \sum_{n=\hat{K}+1}^{\tilde{B}_{\hat{K}}} \pi^{E^*}(n)
+ \left( \sum_{j=1}^{\hat{K}} e_j \sum_{i=\hat{K}+1}^{K} + \sum_{j=\hat{K}+1}^{K} e_j \sum_{i=j}^{K} \right)
\sum_{n=\tilde{B}_{i-1}+1}^{\tilde{B}_i} \pi^{E^*}(n) \\
&= \sum_{j=1}^{\hat{K}} e_j \sum_{n=j}^{\tilde{B}_K} \pi^{E^*}(n)
+ \sum_{j=\hat{K}+1}^{K} e_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \pi^{E^*}(n). \tag{3.47}
\end{aligned}
\]
Since $B_j \ge 1$ for all $j = 1, 2, \dots, K$, we have $\tilde{B}_{j-1} + 1 \ge j$ for $1 \le j \le \hat{K}$, and we can
rewrite the elements of $\mathcal{E}^{E^*}$ in (3.47) as
\[
\begin{aligned}
\mathcal{E}^{E^*} &= \sum_{j=1}^{\hat{K}} e_j \sum_{n=j}^{\tilde{B}_{j-1}} \pi^{E^*}(n)
+ \sum_{j=1}^{\hat{K}} e_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \pi^{E^*}(n)
+ \sum_{j=\hat{K}+1}^{K} e_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \pi^{E^*}(n) \\
&= \sum_{j=1}^{\hat{K}} e_j \sum_{n=j}^{\tilde{B}_{j-1}} \pi^{E^*}(n)
+ \sum_{j=1}^{K} e_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \pi^{E^*}(n). \tag{3.48}
\end{aligned}
\]
Similar to the way we derived the expression of $\mathcal{E}^{E^*}$ in (3.48), we obtain $\Lambda^{E^*}$ as
\[
\Lambda^{E^*} = \sum_{n=1}^{\tilde{B}_K} \mu^{E^*}(n)\, \pi^{E^*}(n)
= \sum_{j=1}^{\hat{K}} \mu_j \sum_{n=j}^{\tilde{B}_{j-1}} \pi^{E^*}(n)
+ \sum_{j=1}^{K} \mu_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \pi^{E^*}(n), \tag{3.49}
\]
so that
\[
\frac{\Lambda^{E^*}}{\mathcal{E}^{E^*}} = \frac{\Lambda^{E^*}/\pi^{E^*}(\tilde{B}_K)}{\mathcal{E}^{E^*}/\pi^{E^*}(\tilde{B}_K)}
= \frac{\sum_{j=1}^{\hat{K}} \mu_j \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) + \sum_{j=1}^{K} \mu_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n)}
{\sum_{j=1}^{\hat{K}} e_j \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) + \sum_{j=1}^{K} e_j \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n)}. \tag{3.50}
\]
On the other hand, from (3.7), for MEESF, we derive that
\[
\mathcal{E}^{\text{MEESF}} = \sum_{n'=1}^{\tilde{B}_K} e^{\text{MEESF}}(n')\, \pi^{\text{MEESF}}(n')
= \sum_{i'=1}^{K} \sum_{n'=\tilde{B}_{i'-1}+1}^{\tilde{B}_{i'}} \pi^{\text{MEESF}}(n') \sum_{j'=1}^{i'} e_{j'}. \tag{3.51}
\]
Interchanging the summations in (3.51), this time we obtain
\[
\mathcal{E}^{\text{MEESF}} = \sum_{j'=1}^{K} e_{j'} \sum_{i'=j'}^{K} \sum_{n'=\tilde{B}_{i'-1}+1}^{\tilde{B}_{i'}} \pi^{\text{MEESF}}(n')
= \sum_{j'=1}^{K} e_{j'} \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \pi^{\text{MEESF}}(n'). \tag{3.52}
\]
Similar to the way we derived the expression of $\mathcal{E}^{\text{MEESF}}$ in (3.52), we obtain $\Lambda^{\text{MEESF}}$
as
\[
\Lambda^{\text{MEESF}} = \sum_{n'=1}^{\tilde{B}_K} \mu^{\text{MEESF}}(n')\, \pi^{\text{MEESF}}(n')
= \sum_{j'=1}^{K} \mu_{j'} \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \pi^{\text{MEESF}}(n'), \tag{3.53}
\]
and then we obtain
\[
\frac{\Lambda^{\text{MEESF}}}{\mathcal{E}^{\text{MEESF}}}
= \frac{\Lambda^{\text{MEESF}}/\pi^{\text{MEESF}}(\tilde{B}_K)}{\mathcal{E}^{\text{MEESF}}/\pi^{\text{MEESF}}(\tilde{B}_K)}
= \frac{\sum_{j'=1}^{K} \mu_{j'} \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')}
{\sum_{j'=1}^{K} e_{j'} \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')}. \tag{3.54}
\]
Given $\Lambda^{E^*}/\mathcal{E}^{E^*}$ in the form of (3.50) and $\Lambda^{\text{MEESF}}/\mathcal{E}^{\text{MEESF}}$ in the form of
(3.54), for the inequality (3.44) to hold, we require
\[
\begin{aligned}
&\sum_{j=1}^{\hat{K}} \sum_{j'=1}^{K} \mu_j e_{j'} \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')
+ \sum_{j=1}^{K} \sum_{j'=1}^{K} \mu_j e_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n') \\
&\ge \sum_{j=1}^{\hat{K}} \sum_{j'=1}^{K} e_j \mu_{j'} \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')
+ \sum_{j=1}^{K} \sum_{j'=1}^{K} e_j \mu_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n'). \tag{3.55}
\end{aligned}
\]
First, we show that
\[
\sum_{j=1}^{K} \sum_{j'=1}^{K} \mu_j e_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')
\ge \sum_{j=1}^{K} \sum_{j'=1}^{K} e_j \mu_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n'). \tag{3.56}
\]
In particular, we observe in (3.56) that:
• For $j = j'$, we have
\[
\mu_j e_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')
= e_j \mu_{j'} \sum_{n=\tilde{B}_{j-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n'). \tag{3.57}
\]
• For any two integers $x$ and $y$ where $1 \le x < y \le K$, we have
\[
\begin{aligned}
&\mu_x e_y \sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n')
+ \mu_y e_x \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi'(n') \\
&\quad - e_x \mu_y \sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n')
- e_y \mu_x \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi'(n') \\
&= (\mu_x e_y - e_x \mu_y) \left[
\sum_{n=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi'(n')
- \sum_{n=\tilde{B}_{y-1}+1}^{\tilde{B}_K} \Pi(n) \sum_{n'=\tilde{B}_{x-1}+1}^{\tilde{B}_K} \Pi'(n')
\right] \ge 0, \tag{3.58}
\end{aligned}
\]
where the final inequality follows from Remark 3.1 and Lemma 3.1.
Now, for the inequality (3.55) to hold, it remains to find a sufficient condition
yielding
\[
\sum_{j=1}^{\hat{K}} \sum_{j'=1}^{K} \mu_j e_{j'} \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n')
\ge \sum_{j=1}^{\hat{K}} \sum_{j'=1}^{K} e_j \mu_{j'} \sum_{n=j}^{\tilde{B}_{j-1}} \Pi(n) \sum_{n'=\tilde{B}_{j'-1}+1}^{\tilde{B}_K} \Pi'(n'). \tag{3.59}
\]
In (3.59), we observe that:
• For $j = j'$, $\mu_j e_{j'} = e_j \mu_{j'}$.
• For $j = 1, 2, \dots, \hat{K}$ and $j' = j+1, j+2, \dots, K$, $\mu_j e_{j'} - e_j \mu_{j'} \ge 0$.
• For $j = 2, 3, \dots, \hat{K}$ and $j' = 1, 2, \dots, j-1$, $\mu_j e_{j'} - e_j \mu_{j'} \le 0$.
Therefore, for the inequality (3.59) to hold, it is sufficient to have (3.45), which
enforces
\[
\mu_j e_{j'} - e_j \mu_{j'} = 0, \quad j = 2, 3, \dots, \hat{K}, \; j' = 1, 2, \dots, j-1. \tag{3.60}
\]
This completes the proof.
From Proposition 3.5, we can obtain the following corollaries.
Corollary 3.1. If (3.44) is satisfied with parameters $K = K^*$, $\hat{K} = \hat{K}^*$, $\lambda = \lambda^*$,
$B_j = B_j^*$, $\mu_j/e_j = \mu_j^*/e_j^*$, $j = 1, \dots, K^*$, then (3.44) still holds true for all $K \ge K^*$
with $\hat{K} = \hat{K}^*$, $\lambda = \lambda^*$, $B_j = B_j^*$, $\mu_j/e_j = \mu_j^*/e_j^*$, $j = 1, \dots, K^*$.
Corollary 3.2. If $\mu_j/e_j = c$, $j = 1, 2, \dots, K$, then we have
\[
\frac{\Lambda^{E^*}}{\mathcal{E}^{E^*}} = \frac{\Lambda^{\text{MEESF}}}{\mathcal{E}^{\text{MEESF}}} \tag{3.61}
\]
for each $\hat{K} = 1, \dots, K$.
Corollary 3.3. If $e_j/\mu_j = e_1/\mu_1$, $j = 1, 2, \dots, k$, where $k = 1, 2, \dots, K$, then we have
\[
\frac{\Lambda^{E^*}}{\mathcal{E}^{E^*}} \ge \frac{\Lambda^{\text{MEESF}}}{\mathcal{E}^{\text{MEESF}}} \tag{3.62}
\]
for each $\hat{K} = 1, 2, \dots, k$.
Corollary 3.2 suggests that, if all servers in the system are equally energy efficient,
the energy efficiency of E* is equivalent to that of MEESF. Nevertheless, even in
the case of a homogeneous server farm, E* has a higher job throughput than that
of MEESF. On the other hand, Corollary 3.3 suggests that, if at least the first two
servers in a heterogeneous server farm are equally most energy-efficient, E* is
guaranteed to outperform MEESF in terms of energy efficiency. We argue that the
latter is a realistic scenario, since in practice a server farm is likely to comprise
multiple servers of the same type purchased at the same time.
Moreover, for scenarios where $\mu_1/e_1 > \mu_2/e_2$, we have the following theoretical
results. We derive an extended policy, denoted by $E^*_\kappa$, that is the same as $E^*$ except
that a new arrival will not be assigned to the second server until the queue size of the
first server is no less than $\kappa$, for some $\kappa = 1, 2, \dots, B_1$. That is, under $E^*_\kappa$, the second
server remains idle as long as the queue size of the first server is smaller than $\kappa$. Hence,
$E^*$ is the special case of $E^*_\kappa$ with $\kappa = 1$.
Proposition 3.6. If $K = 4$, $B_j \ge 2$, $j = 1, 2, 3, 4$, $\frac{\sqrt{2}-1}{2}\,\mu_3 \ge \mu_2 \ge \mu_1$,
$e_3/\mu_3 \ge \sqrt{2}\, e_2/\mu_2$ and $\lambda \ge \mu_1 + \mu_2 + \mu_3$, then we have
$\Lambda^{E^*_\kappa}/\mathcal{E}^{E^*_\kappa} \ge \Lambda^{\text{MEESF}}/\mathcal{E}^{\text{MEESF}}$ for
$\hat{K} = 2$ and $\kappa = 1, 2, \dots, B_1$.
Proof. See Appendix A.9.
Based on Proposition 3.6 and Corollary 3.1, we have the following corollary.
Corollary 3.4. If $K \ge 4$, $B_j \ge 2$, $j = 1, 2, \dots, K$, $\frac{\sqrt{2}-1}{2}\,\mu_3 \ge \mu_2 \ge \mu_1$,
$e_3/\mu_3 \ge \sqrt{2}\, e_2/\mu_2$ and $\lambda \ge \mu_1 + \mu_2 + \mu_3$, then we have
$\Lambda^{E^*}/\mathcal{E}^{E^*} \ge \Lambda^{\text{MEESF}}/\mathcal{E}^{\text{MEESF}}$ for $\hat{K} = 2$.
Next, we will analyze the feasible region of $\hat{K}$, denoted by $\mathcal{F}_{\hat{K}}$, which is defined
to be
\[
\mathcal{F}_{\hat{K}} = \left\{ \hat{K} : \frac{\Lambda^{E^*}}{\mathcal{E}^{E^*}} \ge \frac{\Lambda^{\text{MEESF}}}{\mathcal{E}^{\text{MEESF}}}, \; \hat{K} = 2, 3, \dots, K \right\}. \tag{3.63}
\]
We do not consider $\hat{K} = 1$, since $E^*$ with $\hat{K} = 1$ is equivalent to MEESF, as discussed
before.
For any policy $\phi \in \Phi_s$, we define
\[
p^{\phi}_n = \frac{\pi^{\phi}(n)\, \mu^{\phi}(n)}{\Lambda^{\phi}}, \quad n \in \mathcal{N}, \tag{3.64}
\]
and
\[
q^{\phi}_n = \frac{p^{\phi}_n}{p^{\phi}_{\tilde{B}_{\hat{K}}}}, \quad n \in \mathcal{N}. \tag{3.65}
\]
We have $\sum_{n=0}^{\tilde{B}_K} p^{\phi}_n = 1$, based on the definition of $p^{\phi}_n$, $n = 0, 1, \dots, \tilde{B}_K$,
since $\Lambda^{\phi} = \sum_{n} \mu^{\phi}(n)\, \pi^{\phi}(n)$.
To formulate an optimization problem, we rewrite our objective function $\mathcal{E}^{\phi}/\Lambda^{\phi}$
as a weighted sum of decision variables:
\[
\frac{\mathcal{E}^{\phi}}{\Lambda^{\phi}} = \sum_{n=0}^{\tilde{B}_K} F^{\phi}(n)\, p^{\phi}_n, \tag{3.66}
\]
where $F^{\phi}(n) = e^{\phi}(n)/\mu^{\phi}(n)$. We refer to $p^{\phi}_n$ as the virtual probability of state $n$ in
steady state. We use the word virtual to indicate that these quantities are not the
real steady-state probabilities of the original system defined in Section 3.2. Then,
the virtual probability of the system's total service rate $\mu$ is denoted by $p^{\phi}(\mu)$ and
defined as
\[
p^{\phi}(\mu) = \sum_{n:\, \mu^{\phi}(n) = \mu} p^{\phi}_n. \tag{3.67}
\]
As the set of service rates is an attribute of the system that is independent of the
policy used, this set is common to all the $E^*$ policies, including MEESF, considered
here. We denote this set of service rates by $R = \{0, r_1, r_2, \dots, r_K\}$, where
\[
r_j = \sum_{i=1}^{j} \mu_i.
\]
For $E^*$ and MEESF, the total power consumption of the system is uniquely determined
by its total service rate; that is, the distribution of virtual probabilities over the
system's total power consumption coincides with that over the system's total service
rate. We rewrite (3.66) as
\[
\frac{\mathcal{E}^{\phi}}{\Lambda^{\phi}} = \sum_{\mu \in R} F^{\phi}(\mu)\, p^{\phi}(\mu), \tag{3.68}
\]
where $F^{\phi}(\mu)$ is the ratio of the total power consumption to the total service rate of the
system when the total service rate is $\mu$; in our case, $\phi \in \{\text{MEESF}, E^*\}$. Note that
$\phi$ here is not strictly restricted to MEESF and $E^*$: it may be any policy under which
the total power consumption is uniquely determined by a given value of the total
service rate. Then, Fig. 3.8 can also represent the virtual probabilities of the
power consumption. Our objective is then equivalent to minimizing the average
value of the ratio of the total power consumption to the total service rate based on
the virtual probabilities. Neither (3.66) nor (3.68) is amenable to solution by linear
programming, because the corresponding constraints, which are related to the decision
variables $\mu^{\phi}(n)$, are non-linear.

Fig. 3.8 Virtual probability of service rate.
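The change of measure in (3.64) and the identity (3.66) can be verified numerically. The sketch below uses a hypothetical stationary distribution $\pi$ and hypothetical rate and power profiles (with $e(0) = 0$, since idle power is not modeled in this chapter):

```python
def virtual_probs(pi, mu):
    """p_n = pi(n) * mu(n) / Lambda, as in (3.64); pi[n], mu[n] for n = 0..B~_K."""
    Lam = sum(p * m for p, m in zip(pi, mu))   # Lambda = sum_n mu(n) pi(n)
    return [p * m / Lam for p, m in zip(pi, mu)]

# Hypothetical 4-state example (n = 0..3); mu(0) = 0, so p_0 = 0.
pi = [0.4, 0.3, 0.2, 0.1]
mu = [0.0, 1.0, 1.5, 2.0]
e  = [0.0, 1.0, 2.0, 3.5]                      # power consumption e(n), e(0) = 0
p = virtual_probs(pi, mu)
assert abs(sum(p) - 1.0) < 1e-9                # normalization noted below (3.65)

# Identity (3.66): E/Lambda = sum_n [e(n)/mu(n)] p_n (states with mu(n) = 0
# contribute nothing, since p_n = 0 there).
Lam = sum(pn * mn for pn, mn in zip(pi, mu))
E   = sum(pn * en for pn, en in zip(pi, e))
lhs = E / Lam
rhs = sum((en / mn) * pv for en, mn, pv in zip(e, mu, p) if mn > 0)
assert abs(lhs - rhs) < 1e-9
```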
In Fig. 3.8, we present the virtual probabilities versus the system's total service rates
under MEESF and $E^*$, where $K = 10$, $B_j = 10$, $j = 1, 2, \dots, 10$, $\hat{K} = 5$, $\lambda = \sum_{j=1}^{5} \mu_j$,
and $\mu_j$ and $e_j$ are uniformly randomly generated in $[0.1, 10]$ and $[0.1, 100]$, respectively.
In this figure, we split $R$ into the following three mutually exclusive subsets:
\[
\begin{aligned}
R_A &= \{0, r_1, r_2, \dots, r_{\hat{K}-1}\},\\
R_B &= \{r_{\hat{K}}\},\\
R_C &= \{r_{\hat{K}+1}, r_{\hat{K}+2}, \dots, r_K\},
\end{aligned}
\tag{3.69}
\]
where $R_A \cup R_B \cup R_C = R$. We define $A$, $B$ and $C$ as the time periods during which
the total service rate of the system is in $R_A$, $R_B$ and $R_C$, respectively. For clarity of
presentation, we also define the state sets for $A$, $B$ and $C$ as
\[
\mathcal{N}^{\phi}_{\theta} = \{ n \in \mathcal{N} \mid \mu^{\phi}(n) \in R_{\theta} \}, \quad \theta \in \{A, B, C\},
\]
where $\mu^{\phi}(n)$ is the total service rate of the system in state $n$, as defined before. The
average ratio of the total power consumption to the total service rate in each part,
based on the virtual probabilities, is given by
\[
F^{\phi}_{\theta} = \frac{\sum_{n \in \mathcal{N}^{\phi}_{\theta}} F^{\phi}(n)\, p^{\phi}_n}{\sum_{n \in \mathcal{N}^{\phi}_{\theta}} p^{\phi}_n}, \quad \theta \in \{A, B, C\}, \tag{3.70}
\]
where $F^{\phi}(n)$ is the ratio of the total power consumption to the total service rate in
state $n$ under policy $\phi$.
Lemma 3.2. For the system defined in Section 3.2, if $\lambda \ge r_{\hat{K}-1}$ and $B_{j+1} \ge B_j \ge 1$
for all $j = 1, 2, \dots, \hat{K}-2$, then
\[
F^{\text{MEESF}}_A \ge F^{E^*}_A.
\]
Proof. See Appendix A.4.
Lemma 3.3. For $\hat{K} \in \{1, 2, \dots, K\}$, we have
\[
F^{\text{MEESF}}_B = F^{E^*}_B \tag{3.71}
\]
and
\[
F^{\text{MEESF}}_C = F^{E^*}_C. \tag{3.72}
\]
Proof. See Appendix A.5.
As defined in (3.63), the condition for membership of the feasible region is
$\Lambda^{E^*}/\mathcal{E}^{E^*} - \Lambda^{\text{MEESF}}/\mathcal{E}^{\text{MEESF}} \ge 0$ and, by (3.68) and (3.70), this condition is equivalent to
\[
\sum_{\theta \in \{A, B, C\}} \left[ F^{\text{MEESF}}_{\theta} \sum_{\mu \in R_{\theta}} p^{\text{MEESF}}(\mu)
- F^{E^*}_{\theta} \sum_{\mu \in R_{\theta}} p^{E^*}(\mu) \right] \ge 0. \tag{3.73}
\]
We observe in Fig. 3.8 that the virtual probability of service rate $r_5$ for $E^*$ is higher
than that for MEESF, whereas for service rates greater than $r_5$ the virtual probability
is higher under MEESF. In Fig. 3.8, we shade the zones that represent the differences
between the virtual probabilities of the two policies. In these shaded zones,
dense slashes represent $p^{\text{MEESF}}(\mu) < p^{E^*}(\mu)$, and sparse slashes the cases where
$p^{\text{MEESF}}(\mu) \ge p^{E^*}(\mu)$. Based on the definition (3.64), the total size of the shaded
zones for the case $p^{\text{MEESF}}(\mu) < p^{E^*}(\mu)$ equals that for the complementary case, i.e.,
\[
\sum_{\mu:\, p^{\text{MEESF}}(\mu) < p^{E^*}(\mu)} \left[ p^{E^*}(\mu) - p^{\text{MEESF}}(\mu) \right]
= \sum_{\mu:\, p^{\text{MEESF}}(\mu) \ge p^{E^*}(\mu)} \left[ p^{\text{MEESF}}(\mu) - p^{E^*}(\mu) \right]. \tag{3.74}
\]
From Fig. 3.8, we see that the virtual probabilities satisfy $p^{\text{MEESF}}(\mu) < p^{E^*}(\mu)$ for
$R_B$, and $p^{\text{MEESF}}(\mu) > p^{E^*}(\mu)$ for $R_C$, and that the virtual probabilities are near-zero
for $R_A$.
Lemma 3.4. For $\hat{K} \in \{1, 2, \dots, K\}$, we have
\[
p^{\text{MEESF}}(\mu) \ge p^{E^*}(\mu), \quad \mu \in R_C. \tag{3.75}
\]
Proof. See Appendix A.6.
Proposition 3.7. If $\lambda \ge \frac{\sqrt{5}+1}{2}\, r_{\hat{K}-1}$ and
\[
\sum_{n \in \mathcal{N}^{\text{MEESF}}_A} p^{\text{MEESF}}_n < \sum_{n \in \mathcal{N}^{E^*}_A} p^{E^*}_n, \tag{3.76}
\]
then $\{2, 3, \dots, K-1\} \subseteq \mathcal{F}_{\hat{K}}$.
Proof. See Appendix A.7.
When $p^{\text{MEESF}}(r_j) - p^{E^*}(r_j)$ decreases rapidly as $r_j$ decreases from $r_{\hat{K}-1}$ to 0,
condition (3.76) will hold. The descent rate is strongly related to the value of $\lambda/r_{\hat{K}}$,
which should be made as high as possible. Although $\lambda$ can be arbitrarily large, $\hat{K}$
cannot be equal to $K$: if $\hat{K} = K$, condition (3.76) cannot hold, and the shaded zones
illustrated in Fig. 3.8 vanish in $C$, which leads to higher energy efficiency for MEESF.
To find the feasible region $\mathcal{F}_{\hat{K}}$ as defined in (3.63), define a set
\[
\tilde{\mathcal{F}}_{\hat{K}} = \left\{ 2, 3, \dots, \min\left[ K-2,\; \max\left( j : \sum_{i=1}^{j+1} \mu_i \le \lambda \right) \right] \right\} \tag{3.77}
\]
and give Proposition 3.8 as follows.
Proposition 3.8. For the system defined in Section 3.2, if $K \ge 4$ and $B_j \ge 3$, $j =
1, 2, \dots, K$, then
\[
\tilde{\mathcal{F}}_{\hat{K}} \cap \tilde{\mathcal{F}}'_{\hat{K}} \subseteq \mathcal{F}_{\hat{K}},
\]
where $\tilde{\mathcal{F}}'_{\hat{K}} \subseteq \{2, 3, \dots, K\}$ and $m \in \tilde{\mathcal{F}}'_{\hat{K}}$ if and only if the following conditions hold
true.
Condition 3.1. $B_1 \le B_2 \le \dots \le B_{m-1}$.
Condition 3.2.
\[
(\sqrt{3}+1)\, r_{m-1} \le r_m \le \frac{\sqrt{2}-1}{2}\, r_{m+1}
\quad \text{and} \quad
\frac{\sum_{j=1}^{m+1} e_j}{\sum_{j=1}^{m+1} \mu_j} \ge \frac{9}{4} \cdot \frac{\sum_{j=1}^{m} e_j}{\sum_{j=1}^{m} \mu_j}.
\]
Proof. See Appendix A.10.
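The candidate set (3.77) and the two conditions defining $\tilde{\mathcal{F}}'_{\hat{K}}$ are straightforward to evaluate; the following sketch does so for a hypothetical farm (the rates, powers and buffer sizes are example inputs):

```python
def candidate_set(mu, lam):
    """F~_Khat of (3.77): {2, ..., min(K-2, max{j : mu_1 + ... + mu_{j+1} <= lam})}."""
    K = len(mu)
    feasible_j = [j for j in range(1, K) if sum(mu[:j + 1]) <= lam]
    if not feasible_j:
        return set()
    upper = min(K - 2, max(feasible_j))
    return set(range(2, upper + 1))

def condition_3_1(B, m):
    """Condition 3.1: B_1 <= B_2 <= ... <= B_{m-1}."""
    return all(B[j] <= B[j + 1] for j in range(m - 2))

def condition_3_2(mu, e, m):
    """Condition 3.2: (sqrt(3)+1) r_{m-1} <= r_m <= ((sqrt(2)-1)/2) r_{m+1},
    and the aggregate power/rate ratio grows by a factor >= 9/4 from m to m+1."""
    r = lambda j: sum(mu[:j])
    ok_rates = ((3 ** 0.5 + 1) * r(m - 1) <= r(m)
                <= ((2 ** 0.5 - 1) / 2) * r(m + 1))
    ratio = lambda j: sum(e[:j]) / sum(mu[:j])
    return ok_rates and ratio(m + 1) >= 2.25 * ratio(m)

# Hypothetical farm whose rates grow fast enough that m = 2 passes both conditions.
mu = [1.0, 2.0, 20.0, 100.0, 100.0]
e  = [1.0, 2.0, 60.0, 100.0, 100.0]
B  = [3, 3, 4, 4, 4]
lam = 25.0
print(candidate_set(mu, lam))                       # -> {2}
print(condition_3_1(B, 2), condition_3_2(mu, e, 2))  # -> True True
```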
Fig. 3.9 Relative difference of energy efficiency of $E^*$ to RM. Each panel plots the
cumulative distribution of the relative difference (%) for Cases 1-3: (a) $\rho = 0.4$;
(b) $\rho = 0.6$; (c) $\rho = 0.8$.
Proposition 3.8 does not rely on Proposition 3.7, which is only used to convey our
intuition for $\tilde{\mathcal{F}}_{\hat{K}}$.
3.5.6 Approximate E*
In this section, we numerically demonstrate that the RM policy approximates the
policy $E^*$ with the optimal $\hat{K}$. In Fig. 3.9, we present the cumulative distribution of
the relative difference in energy efficiency between $E^*$ and RM, with the same settings
as those used in Fig. 3.7. The relative difference in energy efficiency of $E^*$ to RM is
$G(E^*, \text{RM})$. Results for RM in Figs. 3.9a and 3.9b are close to those for MEESF in
Figs. 3.7a and 3.7b. We nevertheless observe a clear improvement from MEESF in
Fig. 3.7c to RM in Fig. 3.9c: the relative difference for MEESF in Fig. 3.7c is higher
than 5% in 40% of the experiments in Case 3 while, for RM in Fig. 3.9c, the energy
efficiency degrades from $E^*$ to RM by over 5% in only 17% of our experiments.
In Fig. 3.9, we observe that the values of the relative difference are less than
5% in over 78% of the experiments in all cases, which numerically demonstrates
the high accuracy of RM in approximating $E^*$. Note that the curves for $\rho = 0.4$,
$\rho = 0.6$ and $\rho = 0.8$ in Figs. 3.9a-3.9c, respectively, are similar to each other. That
is, the accuracy of RM in approximating $E^*$ is relatively stable across the three
traffic intensities, which is practical for a real environment with uncertain arrival rates.
In Fig. 3.10, we present the histogram of the difference between the values of $\hat{K}$
for $E^*$ and RM; the settings are those already used in Fig. 3.9.
Observing Fig. 3.10, the difference between the values of $\hat{K}$ for $E^*$ and RM varies
within the range $\{-2, \dots, 1\}$. That is, for a system with 50 servers, we only need
to explore at most four integers to find an optimal $\hat{K}$. In addition, in Fig. 3.10, as the
normalized system load grows from 0.6 to 0.8, the range of the distance between
RM and $E^*$ remains the same, which points to the stability of our system in a real
environment with varying $\lambda$.
3.6 Conclusion
We have proposed a new approach that gives rise to an insensitive job-assignment
policy for the popular server farm model comprising a parallel system of finite-buffer
PS queues with heterogeneous server speeds and energy consumption rates. Unlike
the straightforward MEESF approach that greedily chooses the most energy-efficient
servers for job assignment, an important feature of the more robust E* policy is to
aggregate an optimal number of most energy-efficient servers as a virtual server.
Fig. 3.10 Difference of the value of $\hat{K}$ of $E^*$ to RM. Each panel plots the relative
frequency of the difference for Cases 1-3: (a) $\rho = 0.4$; (b) $\rho = 0.6$; (c) $\rho = 0.8$.
The policy E* is designed to give preference to this virtual server and utilize its
service capacity in such a way that both the job throughput and the energy efficiency
of the system can be improved. We have provided a rigorous analysis of the E*
policy, and have shown that E* always has a higher job throughput than MEESF,
and that there exist realistic sufficient conditions under which E* is guaranteed
to outperform MEESF in terms of energy efficiency. We have further proposed
a rule of thumb to form the virtual server by simply matching its aggregate service
rate to the job arrival rate. Extensive experiments based on random settings have
confirmed the effectiveness of the resulting RM policy. In addition, the algorithm
proposed in [21] was shown to have polynomial computational complexity. The
optimal solution obtained by this algorithm was also shown to be optimal for
general jockeying policies when job sizes are exponentially distributed. Despite the
polynomial complexity, the optimal policy was shown not to be sufficiently scalable
for a practical system with tens of thousands of servers.
Chapter 4
Job Assignment Heuristics without
Jockeying
The data-center industry, with over 500 thousand data centers worldwide [5], has
been growing in parallel with the dramatic increase in global Internet traffic. An
estimated 91 billion kWh of electricity was consumed by U.S. data centers in 2013,
and this consumption rate continues to grow, resulting in $13 billion in annual
electricity bills and potentially nearly 100 million metric tons of carbon pollution per
year by 2020 [7]. Servers account for the major portion of energy consumption of
data centers [8]. Our aim here is to describe optimal scheduling/dispatching strategies
for incoming job requests in a server farm so as to improve energy efficiency.
This has been a topic of interest for some time, with approaches such as speed
scaling to reduce energy consumption by controlling server speed(s) [10, 11, 16, 79,
80]. Right-sizing of server farms has been done by powering servers on/off accord-
ing to traffic load [17, 18, 45, 81], and by switching servers between active/sleep
mode according to number of waiting jobs [82]. In [20, 83] resource allocation
methodologies are used to distribute the power budget for energy conservation.
Rapid improvements in computer hardware have resulted in frequent upgrades
of parts of the server farms. This, in turn, has led to server farms with a range of
different computer resources (heterogeneous servers) being deployed [23]. Such
heterogeneity significantly complicates optimization, since each server needs to
be considered individually. Despite the complexity, we are able to improve the
energy efficiency of a heterogeneous server farm via appropriate scalable job
assignment policies that are applicable to server farms with tens of thousands of
servers. For the purposes of this chapter, a server farm is postulated to have a fixed
number of servers with no possibility of powering off during the time period under
consideration; this, in practice, could apply to periods during which no powering off
takes place. In this way, the job assignment policies described here can be combined
with the right-sizing techniques mentioned above, as appropriate. Note that frequent
powering off/on increases wear and tear and the need for costly replacement and
maintenance [84].
Here, we consider a system in which idle servers may have non-negligible energy
consumption rate [32]. In [16, 21, 22], job assignment policies have been discussed
without consideration of idle power (energy consumption rate of idle servers), and
in [17, 45, 85], such policies have been considered for a server pool with identical
servers. In the server farms considered, servers are not assumed to be identical. To
the best of our knowledge, there is no published work that considers idle power in
designing job assignment methodologies for a given heterogeneous server pool.
Other job assignment policies in the literature (e.g., [57, 59, 62]) have considered
scenarios with infinite buffer size and have sought to minimize delay. We consider a
server farm with parallel finite-buffer queues, and with heterogeneous server speeds,
energy consumption rates, and buffer sizes. As in [21], the energy efficiency of a
server farm is defined as the ratio of the long-run expected job departure rate divided
by the expected energy consumption rate. It forms the objective function of our
optimization strategy. This objective function represents the amount of useful work
(e.g., data rate, throughput, processes per second) per watt.
The processor sharing (PS) discipline is imposed on each queue, so that all jobs
on the same queue share the processing capacity and are served at the same rate.
The PS discipline avoids unfair delays for those jobs that are preceded by extremely
large jobs, making it an appropriate model for web server farms [24, 25], where
job-size distributions are highly variable [27, 86]. The finite buffer size queuing
model with PS discipline can be applied in situations where a minimum service
rate is required for processing a job in the system [87]. In communication systems,
broader applications of PS queues have been studied; e.g., [28, 29].
A key feature of our approach is to model the problem as an instance of the
Restless Multi-Armed Bandit Problem (RMABP) [33] in which, at each epoch, a
policy chooses a server to be tagged for a new job assignment (other servers are
said to be untagged). The general RMABP has been proved PSPACE-hard [34], and
this has led to studies in scalable and near-optimal approximations, such as index
policies. An index policy selects a set of tagged servers at any epoch according to
their state-dependent indices.
We consider a large-scale realistically-dimensioned server farm that cannot reject
a job if it has buffer space available. Such a situation would occur, for instance,
where a server farm owner is unable to replace all older servers simultaneously, and
so legacy inefficient servers are needed to meet a service level agreement. Buffering
spill-over creates dependencies between servers, and requires us to postulate uncon-
trollable states [35]. In other words, the constraint on the number of tagged servers
in conventional RMABP has to be replaced by a constraint on the number of tagged
servers in controllable states. As far as we are aware, there are no theoretical results
on the asymptotic optimality of an index policy for a multi-server system with finite
buffer size, where loss of jobs happens if and only if all buffers are full. A further
discussion on existing related work is provided in Section 4.1.
As mentioned in Section 1.4, the main contributions of this chapter are as
follows.
• We propose a new job assignment method that aims to maximize energy
efficiency. The newly proposed policy is referred to as the Most energy-efficient
available server first Accounting for Idle Power (MAIP). MAIP prioritizes
the most energy-efficient servers that are available (that is, servers with at
least one vacant slot in their buffers) and requires the energy consumption
rate of idle servers as an input parameter for decision making. This policy
provides a model of a real system with significant energy consumption rate
in idle states. MAIP is scalable and requires only binary state information
of servers, making it suitable for an environment with frequently changing
server states. Our server farm model is centralized and is applicable to a
local system with frequently changing information, which for our case is the
binary information of server states. We note that Google has built a centralized
control mechanism for network routing and management that monitors all link
states and is scalable for Google’s building-scale data center [88].
• We prove, remarkably, that when job sizes are exponentially distributed, the
Whittle index policy is equivalent to MAIP, and that it is asymptotically
optimal for our server farm comprising multiple groups of identical servers
as the numbers of servers in these groups tend to infinity. It is reasonable to
assume that if the total number of servers in a server farm is very large, then
the number of servers bought in a single batch, or over a short period of time
during which the technology is not improving, will also be large. In any case,
there is cost benefit in buying in bulk, so that the number of servers purchased
at once is likely to be large. More importantly, the typical lifespan of a server
is in the range of 3 years [89, 90]. Accordingly, a modern server farm is likely
to be categorized into several server groups, each of which contains a large
number of servers of the same or similar type and attributes that were bought
at the same time, or over a short time period. The well-known Whittle’s index
relaxation enables decomposition of a complex RMABP into multiple
sub-problems that are assumed to be computationally feasible [33]. Note that, in the general
case, Whittle’s index does not necessarily exist, and even if it does, a closed
form solution is often unavailable. As mentioned before and in Section 4.1, the
buffer constraint in our case enforces the need for uncontrollable states and, in
turn, prevents direct application of previous asymptotic optimality results on
RMABP to our problem.
• We demonstrate numerically the effectiveness of MAIP by comparing it with
a baseline policy that prioritizes the most energy-efficient available servers but
ignores idle power. Although, as mentioned above, powering off servers is not
considered here, the performance of the baseline policy can be significantly
improved by taking the idle power into consideration (MAIP) in terms of
energy efficiency. MAIP is demonstrated numerically to be almost insensitive
to the shape of job-size distributions. We also demonstrate the applicability of
MAIP for a large server farm with significant cost of job reassignment.
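To convey the intuition behind MAIP, the sketch below ranks available servers by service rate per unit of additional (above-idle) power; since idle power is paid regardless, only the increment matters. Note that this ranking is our illustrative assumption for this sketch, not necessarily the precise index derived later in the chapter, and all inputs are hypothetical:

```python
def maip_choose(available, mu, e_busy, e_idle):
    """Illustrative MAIP-style selection: among servers with buffer vacancies,
    pick the one with the highest service rate per unit of *additional* power
    above idle. (This ranking is an assumption for illustration; the thesis
    derives the precise index via the Whittle relaxation.)"""
    def efficiency(j):
        extra = e_busy[j] - e_idle[j]          # marginal power of serving a job
        return mu[j] / extra if extra > 0 else float("inf")
    return max(available, key=efficiency)

mu     = [2.0, 2.0]
e_busy = [2.0, 2.5]
e_idle = [0.0, 2.0]
# Ignoring idle power (mu/e_busy) would favor server 0 (1.0 vs 0.8), but almost
# all of server 1's busy power is idle power it pays anyway: its marginal power
# is only 0.5, so the idle-power-aware ranking picks server 1.
print(maip_choose([0, 1], mu, e_busy, e_idle))  # -> 1
```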
The remainder of this chapter is organized as follows. In Section 4.1, we review
related work on job assignment policies. In Section 4.2, we define our server
farm model. In Section 4.3, we propose the MAIP policy, and in Section 4.4, we
give the proof for the asymptotic optimality of MAIP. We present numerical results
in Section 4.5 and conclusions are given in Section 4.6.
4.1 Related Work
Queueing models associated with job assignment among multiple servers with and
without jockeying (reassignment actions of incomplete jobs) have been studied since
1958 [50]. Most existing work has focused on job assignment policies that aim to
improve the system performance under a first-come-first-served (FCFS) discipline
such as Join-the-Shortest-Queue (JSQ).
For the non-jockeying case, JSQ under PS has been analyzed in [24, 57–59].
Bonomi [57] proved optimality of JSQ for the processor sharing case under a general
arrival process, a Markov departure process, and homogeneous servers while, for a
non-exponential job-size distribution, a counter-example to optimality of JSQ has
been given by Whitt [58]. Gupta [59] provided an analysis for the approximation
of the state distribution of a system under JSQ with homogeneous PS servers and
general job-size distributions. Gupta also showed the optimality of JSQ in terms of
average delay for a system comprising servers with two different service rates.
Server farm applications of JSQ with jockeying policies for FCFS have been
studied in [60–62]. In these papers, when the difference between the longest and
shortest queue sizes achieves a threshold, a jockeying action is triggered. Different
values of the threshold clearly result in different JSQ policies. These publications
focus on the calculation of the equilibrium distribution of the lengths of queues.
Energy efficiency for a multi-queue heterogeneous system with infinite buffers
and set-up delay has been studied in [19], where the authors assume a zero energy
consumption rate when a server is idle. Hyytiä et al. [19] have shown that
M/G/1-LCFS is insensitive to the shape of the set-up delay distribution, while
this insensitivity is lost in M/G/1-PS. In [91], energy-efficient job assignment
in a system with heterogeneous servers and homogeneous jobs has been considered,
where jobs were queued in an infinite public buffer with no waiting room on each
server, and no cost was incurred by an idle server. The authors analyzed when
individual optimality (minimizing the cost of one job) coincides with social
optimality (minimizing the total cost of all jobs), and proved that both are of
threshold type in certain situations.
Energy-aware PS multi-queue systems with jockeying have been studied in
[21, 22], where the optimization problem is characterized as a semi-Markov decision
process (SMDP) [65]. In these papers, maximization of the ratio of job throughput to
power consumption (the ratio of the long-run average job departure rate to the long-
run average energy consumption rate) is introduced as a measure of performance.
The use of long-run average reward per unit cost (e.g., time consumption, energy
consumption, etc.) as an objective function in [21] generalizes the long-run average
service quality per unit time studied previously.
In this chapter, we consider a large-scale realistically-dimensioned server farm
model with heterogeneous servers. The jockeying case discussed in [21, 22] is more
appropriate for a localized server farm, in which the cost of jockeying actions is
negligible. For a server farm with significant jockeying costs, a simple scalable
job assignment policy without jockeying is more attractive. A similar dynamic
programming methodology in which the computational complexity increases linearly
in the number of states can be applied to our case. Unfortunately, in the non-
jockeying server farm the number of states increases exponentially in the number of
servers, so that the optimal solution is limited to very small cases with a few servers.
To show the asymptotic optimality of MAIP in this chapter (Section 4.4), we
consider an interesting and useful framework: the bandit problem. Gittins and Jones
published the well-known index theorem in 1974 [92, 93]. In 1979, Gittins [94] gave
the optimal solution for the traditional multi-armed bandit problem (MABP) in the form
of the so-called Gittins index policy. More details concerning Gittins indices can be
found in [92, Chapter 2.12] (and references therein). While studies of the traditional
MABP assume that only one machine (project/bandit/process) is played at a time
and that only the played machine changes state, Whittle [33] published a more general
model, the restless multi-armed bandit problem (RMABP), and proposed the so-called
Whittle index as an approximation for optimality. In 1990, Weber and Weiss [95]
proved, under certain conditions, the asymptotic optimality of the Whittle index, as
had been conjectured by Whittle. Papadimitriou and Tsitsiklis [34] proved that the
optimization of the RMABP is PSPACE-hard. The Whittle index policy is derived from
a relaxation of the original linear optimization problem. Bertsimas and Niño-Mora
[96] analyzed the performance region of the RMABP under different levels
of relaxation, and obtained a hierarchy of linear programming (LP) relaxations.
They then proved that the performance region of a higher-order relaxation is contained
in that of a lower-order relaxation, where the highest-order relaxation is equivalent
to the original problem.
In 2001, Niño-Mora established partial conservation laws (PCLs) for the optimality
of the RMABP [97], extending the generalized conservation laws (GCLs)
published in 1996 [98]. Later, in [35], he defined a class of problems that satisfy
PCL-indexability and proposed a new index policy that improves on Whittle's; the new
index policy is optimal for problems satisfying PCL-indexability. PCL-indexability
implies, and is stronger than, the indexability introduced by Whittle. More details on
the optimality of indexable problems can be found in [99].
Recently, in [100], the proof of the asymptotic optimality of Whittle's index in
[95] has been extended to cases with several classes of bandits, with arrivals of
new bandits, and with multiple actions per bandit. Verloop also proposed an index
policy which is not restricted to indexable models and is numerically demonstrated
to be near-optimal in the non-asymptotic regime. In [101], Larrañaga et al. analyzed
a system with multiple users (modeled as multiple bandits), aiming to minimize
the average cost, a combination of convex, non-decreasing holding costs and user
impatience. The user impatience here represents a cost triggered by the voluntary
departure of customers that have been waiting too long in the system. Their model
contains uncontrollable states, since a bandit cannot be played when the number of
corresponding users is zero; however, the non-decreasing holding-cost constraint,
which simplifies their asymptotic optimality argument, cannot be guaranteed in our
problem.
Stolyar [102] studied a switching model with multiple parallel servers (queues)
based on a discrete-time Markov chain, and proposed the MaxWeight discipline under
heavy traffic and a resource pooling assumption which ensures the stability of
the multi-queue system. The MaxWeight discipline is a special case of the so-called
cµ-rule, and was proved in [102] to asymptotically minimize the cumulative
holding cost over a finite time interval. A more general algorithm was
proposed in [103] for a similar queueing system. In [104], Mandelbaum and Stolyar
considered a similar model in continuous time. They proved that a simple generalized
cµ-rule asymptotically minimizes instantaneous and cumulative holding costs in a
queueing system with multiple parallel flexible servers and multi-class jobs when the
system is in heavy traffic and a stability condition is satisfied. The holding cost rate in
[104] is assumed to be increasing and convex, which is more general than the holding
cost rate discussed in [102]. However, the model discussed in [102, 103] is more
general in that it is applicable to queueing systems with arbitrary dependence between
servers, although this general model is not amenable to large-scale systems. Stolyar
[105] proposed a MinDrift rule that combines dynamic programming and Lagrangian
relaxation to approximate an optimal solution, for the output-queued problem
under a complete resource pooling (CRP) condition, that minimizes the customer
workload, which can be specified as the average number of jobs. In particular, for the
customer workload, earlier results in [104] showed the asymptotic optimality
of the generalized cµ-rule (Gcµ-rule) under the CRP condition for the input-queued
problem. The work in [102–105] focuses on infinite buffers and the heavy-traffic regime,
where servers are idle only for a negligible fraction of time. MaxWeight, the Gcµ-rule
and MinDrift are typical index policies.
Nazarathy and Weiss [106] proposed a method for the control problem of a
multi-server queueing system over a finite time horizon. This method decomposes
the original problem into local sub-problems, where each decision relies only
on the local information of a sub-problem. They obtained the optimal solution of the
control problem in the fluid regime by means of a separated continuous linear program
(SCLP). Their heuristic method for the original control problem is shown to converge
to the optimal solution in the fluid limit; in other words, the method asymptotically
minimizes the total cost over a finite time horizon.
Ayesta et al. [107] have studied a preemptive queue with an infinite buffer and
multiple users as a model for the flow-level behavior of end-users on a narrowband
HDR wireless channel (CDMA 1xEV-DO). They discussed conditions for the stability
and the asymptotic optimality of policies for selecting users (channels), and showed
the importance of the tie-breaking rule in policies such as priority-index policies,
best-rate (BR) policies and best-rate priority (BRP) policies. They claim that any BR
policy is stable (satisfying the maximum stability condition, which has been considered
in [108, 109]), whereas priority-index policies are in general not stable. Such priority-
index policies, which are also referred to simply as index policies, have been studied
for decades. Among Whittle-index-based policies, Taboada et al. [110] studied the
time-varying channel problem with infinite buffers and general job-size distributions
and proposed a new index rule, referred to as Attained Service Potential Improvement,
for job-size distributions with decreasing hazard rate. They showed numerically that the
new rule outperforms the cµ-rule, the MaxRate scheduler, and Proportional Fair, which
are well-known opportunistic disciplines based on the spirit of the Whittle index.
In [111], Atar and Shifrin analyzed a G/G/1 queue with a finite buffer and multiple
classes of jobs, where all jobs share the finite buffer capacity of the queue. They
developed a methodology, based on the cµ-rule, that admits or rejects jobs: in the
infinite-buffer case, the decision maker only needs to know the lowest priority among
the arriving classes, whereas in the finite-buffer case, the full order of the priorities
of the job classes must be known. Asymptotic optimality of their method
was also proved under certain conditions. Glazebrook et al. in [112, 113] defined
full indexability as an extended version of the indexability defined by Whittle,
allowing more flexible resource allocation. Classical indexability requires
a binary action set, i.e., {0,1}, while full indexability allows action
variables to be chosen from the more general set [0,1].
This large, but by no means complete, collection of related work contains no
asymptotic optimality result directly applicable to our problem of a multi-server
queueing system with a buffer constraint on each queue, which requires the presence of
uncontrollable states as mentioned before. As far as we are aware, there is no existing
result on the asymptotic optimality of an index policy in this context.
4.2 Model
We consider a heterogeneous server farm modeled as a multi-queue system where
reassignment of incomplete jobs is not allowed. For the reader’s convenience, Table
4.1 provides a list of symbols that are frequently used in this chapter.
The server farm has a total of K ≥ 2 servers, forming the set 𝒦 = {1,2,...,K}.
These servers are characterized by their service rates, energy consumption rates, and
buffer sizes. For j ∈ 𝒦, we denote by µ_j the service rate of server j and by B_j its
buffer size. The energy consumption rate of server j is e_j when it is busy and e_j^0
when it is idle, respectively, where e_j > e_j^0 ≥ 0. We refer to the ratio µ_j/(e_j − e_j^0)
as the effective energy efficiency of server j. Note that this definition of energy
Table 4.1 Summary of Frequently Used Symbols

Symbol               Definition
𝒦                    Set of servers in the system
K                    Number of servers in the system
B_j                  Buffer size of server j
µ_j                  Service rate of server j
e_j                  Energy consumption rate of server j when it is busy
e_j^0                Energy consumption rate of server j when it is idle
µ_j/(e_j − e_j^0)    Effective energy efficiency of server j
λ                    Job arrival rate
L^φ                  Job throughput of the system under policy φ
E^φ                  Power consumption of the system under policy φ
L^φ/E^φ              Energy efficiency of the system under policy φ
efficiency for each server in the system that takes into account the effect of idle
power is key to the design of the MAIP policy proposed in this chapter.
Job arrivals follow a Poisson process with rate λ, indicating the average number
of arrivals per time unit. An arriving job is assigned to one of the servers with at
least one vacant slot in its buffer, subject to the control of an assignment policy φ. If
all buffers are full, the arriving job is lost.
We assume that job sizes (in units) are independent and identically distributed,
and normalize without loss of generality the average size of jobs to one. Each server
j serves its jobs at a total rate of µ j using the PS service discipline.
Our considerations are limited to realistic cases, and we assume that the ratio of the
arrival rate to the total service rate, i.e., ρ := λ / Σ_{j=1}^K µ_j, is sufficiently large to be
economically justifiable but not so large as to violate the required quality of service.
We refer to ρ as the normalized offered traffic.
The job throughput of the system under policy φ, which is equivalent to the
long-run average job departure rate, is denoted by L^φ. The power consumption
of the system under policy φ, which is equivalent to the long-run average energy
consumption rate, is denoted by E^φ. By definition, L^φ/E^φ is the energy efficiency
of the system under policy φ.
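The quantities L^φ, E^φ and L^φ/E^φ can be estimated by simulating the model directly. Below is a minimal sketch under the exponential job-size assumption used later in Section 4.4, in which a busy PS server j completes jobs at total rate µ_j; the names `simulate`, `maip`, and the parameter dictionaries are illustrative, not from the thesis.

```python
import random

def simulate(servers, lam, policy, T=50000.0, seed=1):
    """Estimate throughput L and power consumption E of the multi-queue
    model by simulating its continuous-time Markov chain (exponential
    job sizes, so a busy PS server j completes jobs at total rate mu_j)."""
    rng = random.Random(seed)
    n = [0] * len(servers)              # jobs held by each server
    t, done, energy = 0.0, 0, 0.0
    while t < T:
        rates = [lam] + [s['mu'] if n[j] > 0 else 0.0
                         for j, s in enumerate(servers)]
        dt = rng.expovariate(sum(rates))
        # accumulate energy at busy rate e or idle rate e0
        energy += dt * sum(s['e'] if n[j] > 0 else s['e0']
                           for j, s in enumerate(servers))
        t += dt
        u = rng.uniform(0.0, sum(rates))
        if u < lam:                     # arrival: assign if any vacancy
            avail = [j for j, s in enumerate(servers) if n[j] < s['B']]
            if avail:                   # otherwise the job is lost
                n[policy(avail, servers)] += 1
        else:                           # departure from some busy server
            u -= lam
            for j, s in enumerate(servers):
                r = s['mu'] if n[j] > 0 else 0.0
                if u < r:
                    n[j] -= 1
                    done += 1
                    break
                u -= r
    return done / t, energy / t         # (L, E)

servers = [dict(mu=1.0, e=2.0, e0=1.0, B=4),
           dict(mu=1.0, e=2.5, e0=2.0, B=4)]
maip = lambda avail, ss: max(avail, key=lambda j: ss[j]['mu'] /
                             (ss[j]['e'] - ss[j]['e0']))
L, E = simulate(servers, lam=1.0, policy=maip)
print('estimated energy efficiency L/E:', L / E)
```

Any assignment policy can be plugged in through the `policy` argument, which makes such a simulation a convenient baseline for comparing heuristics.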
4.3 Job Assignment Policy: MAIP
Here we provide details of the MAIP policy. Note that, with a non-jockeying policy,
the server farm scheduler makes assignment decisions only at arrival events and
assigns a new job to one of the available servers in the system. MAIP is designed
to take the effect of idle power into account. The key idea of MAIP
can be conveniently explained by using a simple example.
Consider a system with two servers only, where µ_1 = µ_2 = 1, e_1 = 2, e_1^0 = 1,
e_2 = 2.5, and e_2^0 = 2. It is clear that in this example e_1 < e_2 and e_1^0 < e_2^0. If a job
arrives when both servers are idle, the scheduler has two choices:

1. Assigning the job to server 1 makes server 1 busy. The energy consumption
rate of the whole system becomes e_1 + e_2^0 = 4.

2. Assigning the job to server 2 makes server 2 busy. The energy consumption
rate of the whole system becomes instead e_2 + e_1^0 = 3.5.

Since (e_1 + e_2^0) > (e_2 + e_1^0), which is equivalent to (e_1 − e_1^0) > (e_2 − e_2^0), and since
both servers have the same service rate, choosing server 2 for serving the job in
this particular example turns out to be better in terms of the energy efficiency of the
system, despite the fact that server 2 consumes more power when busy than server 1
does.
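The arithmetic behind the two choices can be checked directly; the variable names below are illustrative.

```python
mu1 = mu2 = 1.0                 # equal service rates
e1, e1_idle = 2.0, 1.0          # server 1: busy / idle power
e2, e2_idle = 2.5, 2.0          # server 2: busy / idle power

choice1 = e1 + e2_idle          # assign to server 1: system power 4.0
choice2 = e2 + e1_idle          # assign to server 2: system power 3.5
assert choice1 > choice2        # server 2 is the cheaper choice
# The same comparison in terms of additional ("productive") power:
assert (e1 - e1_idle) > (e2 - e2_idle)
print(choice1, choice2)         # 4.0 3.5
```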
Intuitively, in the situation where the power consumption of idle servers in a system
is not necessarily negligible, the energy used by the system can be categorized
into two parts: productive and unproductive. The productive part contributes to
job throughput, whereas the unproductive part is a waste of energy. For a server j,
when it is idle, the service rate is 0, accompanied by an energy consumption rate of
e_j^0; when it is busy, the service rate becomes µ_j and the energy consumption rate
increases to e_j. We regard the additional service rate µ_j − 0 as a reward at the cost of
an additional energy consumption rate e_j − e_j^0. In other words, if jobs are assigned
to server j, the productive power used to support the service rate µ_j is effectively
e_j − e_j^0. Accordingly, productive power is our main concern in the design of MAIP.
Since MAIP aims for energy-efficient job assignment, for convenience of description,
we label the servers according to their effective energy efficiency. In particular,
in the context of MAIP, server i is defined to be more energy-efficient than server j
if and only if µ_i/(e_i − e_i^0) > µ_j/(e_j − e_j^0). That is, for any pair of servers i and j, if
i < j, we have µ_i/(e_i − e_i^0) ≥ µ_j/(e_j − e_j^0). Then, MAIP works by always selecting
a server with the highest effective energy efficiency among all servers that contain at
least one vacant slot in their buffers, where ties are broken arbitrarily. As a result of
this design, MAIP is a simple approach that requires only binary state information
(i.e., available or unavailable) from each server for its implementation.
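The selection rule just described can be sketched in a few lines; `maip_select` is an illustrative name, and the tuple layout of server parameters is an assumption of this sketch.

```python
def maip_select(servers, queue_lengths):
    """Return the index of the server MAIP assigns a new job to,
    or None if every buffer is full (the job is lost).
    servers: list of (mu, e_busy, e_idle, B) tuples."""
    best, best_eff = None, -1.0
    for j, (mu, e, e0, B) in enumerate(servers):
        if queue_lengths[j] >= B:
            continue                      # no vacant slot
        eff = mu / (e - e0)               # effective energy efficiency
        if eff > best_eff:                # ties broken by lowest index
            best, best_eff = j, eff
    return best

servers = [(1.0, 2.0, 1.0, 2), (1.0, 2.5, 2.0, 2)]
print(maip_select(servers, [0, 0]))   # 1: server 2 has higher mu/(e - e0)
print(maip_select(servers, [0, 2]))   # 0: server 2 is full
print(maip_select(servers, [2, 2]))   # None: all buffers full, job lost
```

Note that only the buffer occupancies are inspected through a binary "full or not" test, matching the claim that MAIP needs only binary state information.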
4.4 Analysis
Here we give, in Section 4.4.1, a precise definition of the optimization problem
described informally above. In Section 4.4.2, Whittle's index policy for our problem is
described, and in Section 4.4.3, for exponentially distributed job sizes, we prove
the indexability of our server farm model and present a closed-form expression for
the Whittle index, establishing the equivalence of Whittle's index policy and MAIP.
In Section 4.4.4, the proof of asymptotic optimality of Whittle's index policy is
presented, leading to the asymptotic optimality of MAIP. Namely, the difference
between MAIP and an optimal solution, which maximizes the energy efficiency of
the server farm, asymptotically tends to zero as K → +∞.
4.4.1 Stochastic Process
For j = 1,2,...,K, let N_j denote the set of all states of server j, where the state
n_j is the number of jobs in the queue of, or being served by, server j. Thus, N_j =
{0,1,...,B_j}, where B_j ≥ 2 is the buffer size of server j.
For server j, states 0,1,...,B_j − 1 are called controllable states, while the state
B_j is termed an uncontrollable state. The set of controllable states, in which server
j is available to be tagged, is denoted by N_j^{0,1} = {0,1,...,B_j − 1}, while, for
the uncontrollable state in the set N_j^{0} = {B_j}, server j is forced to be untagged
because it cannot accept jobs.
For t ≥ 0, we define X(t) = (X_1(t), X_2(t),..., X_K(t)) to be a vector of random
variables (or a random vector) representing the state at time t of the stochastic process
of the multi-queue system. Define the vectors n = (n_1, n_2,..., n_K), with n_j ∈ N_j,
j ∈ J, as values of X(t), representing the states of the system. The set of all such states
n is denoted by N, and the sets of uncontrollable and controllable states in N are
given by

    N^{0} = { n ∈ N | n_j ∈ N_j^{0}, ∀ j ∈ J },
    N^{0,1} = { n ∈ N | n ∉ N^{0} }.    (4.1)
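For a small system, the two sets in (4.1) can be enumerated explicitly: the uncontrollable states are exactly those in which every buffer is full. A sketch (the function name is illustrative):

```python
from itertools import product

def classify_states(buffer_sizes):
    """Enumerate all system states n = (n_1,...,n_K) and split them into
    uncontrollable states (every server full, so no server can be tagged)
    and controllable states (at least one server has a vacancy)."""
    all_states = list(product(*(range(B + 1) for B in buffer_sizes)))
    uncontrollable = [n for n in all_states
                      if all(n_j == B for n_j, B in zip(n, buffer_sizes))]
    controllable = [n for n in all_states if n not in uncontrollable]
    return controllable, uncontrollable

ctrl, unctrl = classify_states([2, 2])
print(len(ctrl), unctrl)   # 8 [(2, 2)]
```

The exponential growth of `all_states` in the number of servers is exactly why the dynamic-programming solution mentioned in Section 4.1 is limited to very small cases.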
Decisions made upon job arrivals rely on the values of X(t) just before the
corresponding arrivals occur. We use a_j^φ(t), j ∈ J, as an indicator of activity at time
t under policy φ, so that a_j^φ(t) = 1 if server j is tagged, and a_j^φ(t) = 0 otherwise.
Then Σ_{j∈J} a_j^φ(t) ≤ 1 for all t > 0. All job assignment policies considered are
stationary, so that we can also use a_j^φ(n), n ∈ N, to represent the action we take
when the system is in state n. A policy φ comprises those a_j^φ(t) for all t > 0 and,
according to the definition of a stationary policy, φ can also be represented by the
sequence of a^φ(n) = (a_1^φ(n), a_2^φ(n),..., a_K^φ(n)), for all n ∈ N.
Let {X^φ(t), t > 0} represent the stochastic process under policy φ with initial
state X^φ(0) = x(0). For each j = 1,2,...,K, define a mapping R_j : N_j → ℝ, where
R_j(n_j) is the reward rate of server j in state n_j. Let ℛ_j be the set of all such mappings.
Then, for a given vector of mappings R = (R_1, R_2,..., R_K), we define the long-run
average reward under policy φ to be

    g^φ(R) = lim_{t→+∞} (1/t) E{ ∫_0^t Σ_{j∈J} R_j(X_j^φ(u)) du }.    (4.2)
We refer to R as the reward rate function. Along similar lines, for each j = 1,2,...,K,
we consider µ_j(n_j) and e_j(n_j), the service rate and power consumption of server j
in state n_j, respectively, as rewards; that is, µ_j, e_j ∈ ℛ_j. As mentioned in Section
4.2, we define µ_j(n_j) = µ_j and e_j(n_j) = e_j for n_j > 0, with µ_j(0) = 0 and e_j(0) =
e_j^0, where µ_j > 0, e_j > e_j^0 ≥ 0, j ∈ J. For the vectors µ = (µ_1, µ_2,..., µ_K) and
e = (e_1, e_2,..., e_K), the long-run average job service rate of the entire system is then
g^φ(µ), and the long-run average energy consumption rate of the system is g^φ(e).
For simplicity, we refer to the long-run average job service rate and the long-run
average energy consumption rate as the job throughput and power consumption,
respectively. As in [21, 22], and as informally discussed earlier, the energy efficiency
of the system is the ratio of the job throughput to the power consumption. The
problem of maximizing energy efficiency is then encapsulated in

    max_{φ∈Φ} g^φ(µ) / g^φ(e).    (4.3)
Based on the definitions above, we rigorously define MAIP as follows:

    if n ∈ N^{0,1}:  a_j^MAIP(n) = 1 if j = min argmax_{j: n_j ∈ N_j^{0,1}} Q_j, and a_j^MAIP(n) = 0 otherwise;
    if n ∈ N^{0}:    a_j^MAIP(n) = 0 for all j = 1,2,...,K.    (4.4)
4.4.2 Whittle’s Index
Gittins and Jones published the well-known index theorem for the SFABP in 1974
[92, 93]; in 1979, Gittins [94] produced the optimal solution for the general multi-
armed bandit problem (MABP) in the form of the so-called Gittins index policy.
Additional details concerning the history of the Gittins index can be found in [92,
Chapter 2.12], along with proofs. Relaxing the constraints that only one machine
(project/bandit/process) is played at a time and that only the played machine changes
state, Whittle [33] published a more general model, the restless multi-armed bandit
(RMAB), and proposed an index, the so-called Whittle index, as an approximation
for optimality.
The general definition of Whittle’s index is given here; a closed-form expression
will be provided in Section 4.4.3 for the case when job sizes are exponentially
distributed.
According to [21, Theorem 1], there exists a value e* > 0 given by

    e* = max_{φ∈Φ} { g^φ(µ) / g^φ(e) },    (4.5)

so that our optimization problem (4.3) can be written as

    sup_{φ∈Φ} { g^φ(R) :  Σ_{j∈J: X_j^φ(t)∈N_j^{0,1}} a_j^φ(t) = 1,  t ≥ 0 },    (4.6)

where R = (R_1, R_2,..., R_K), R_j ∈ ℛ_j, R_j(n_j) = µ_j(n_j) − e* e_j(n_j), j ∈ J.
Following the Whittle index approach, we relax problem (4.6) as

    sup_φ lim_{t→+∞} (1/t) E{ ∫_0^t Σ_{j∈J} R_j(X_j^φ(u)) du },
    s.t.  E{ Σ_{j∈J: X_j^φ(t)∈N_j^{0,1}} a_j^φ(t) } = 1.    (4.7)
It is seldom well explained in the literature that this relaxation means that the a_j^φ(t)
become random variables, so that sometimes more than one server will be tagged
simultaneously. This is not consistent with the framework of our original problem
and is also unrealistic.
The linear constraint in (4.7) is handled by introducing a Lagrange multiplier
ν ∈ ℝ, namely,

    inf_ν sup_φ lim_{t→+∞} (1/t) E{ ∫_0^t [ Σ_{j∈J} R_j(X_j^φ(u)) − ν Σ_{j∈J: X_j^φ(u)∈N_j^{0,1}} a_j^φ(u) ] du } + ν.    (4.8)
For a given ν, we can now decompose (4.8) into K sub-problems:

    sup_φ lim_{t→+∞} (1/t) E{ ∫_0^t [ R_j(X_j^φ(u)) − ν a_j^φ(u) ] du },    (4.9)

where a_j^φ(u) = 0 when X_j^φ(u) ∈ N_j^{0}, for 0 < u < t, j ∈ J.
In [33], Whittle defined a ν-subsidy policy for a project (server) as an optimal
solution for (4.9), which provides the set of states in which the given project is
passive (untagged), and introduced the following definition.
Definition 4.1. With ν ∈ ℝ, let D(ν) be the set of states of a project in which the
project is passive under a ν-subsidy policy. The project is indexable if D(ν) increases
monotonically from ∅ to the set of all possible states of the project as ν increases from
−∞ to +∞.
In particular, if a project (server) j is indexable and there is a ν* satisfying
n_j ∉ D(ν) for ν < ν* and n_j ∈ D(ν) otherwise, then this value ν* is the value of the
Whittle index for project (server) j at state n_j. The Whittle index values may be
different for different servers at different states. Then, Whittle's index policy for the
multi-queue system selects, at each decision-making epoch, a controllable server (a
server in a controllable state) with the highest Whittle index to be tagged, with all
others untagged.
4.4.3 Indexability
We give the closed form of the optimal solution for Problem (4.9), namely, Whittle's
index policy, for the case of exponentially distributed job sizes. Our approach
uses the theory of semi-Markov decision processes and the Hamilton-Jacobi-Bellman
equation; this formulation requires the exponential job-size assumption.
For this section, we define R_j(n_j) = µ_j(n_j) − e* e_j(n_j), j ∈ J, where e* is
defined as in (4.5). For a policy φ_j, let V_j^{φ_j,ν}(n_j, R_j) denote the expected value of the
cumulative reward of a process for server j ∈ J with reward rate R_j(n_j) − ν a_j^{φ_j}(n_j)
that starts in state n_j ∈ N_j and ends when it first enters an absorbing state n_j^0 ∈ N_j.
In particular, V_j^{φ_j,ν}(n_j^0, R_j) = 0 for any φ_j. Here, φ_j is a stationary policy for server j,
which determines whether the server is tagged or not according to its current state X_j^{φ_j}(t).
Because state 0 is reachable from all other states, we can assume without loss of
generality that n_j^0 = 0 for all j ∈ J.
Now, let P_j^H represent a process of X_j^{φ_j}(t) for server j ∈ J that starts from
state 0 and runs until it reaches state 0 again, where φ_j is constrained to those policies
satisfying a_j^{φ_j}(0) = 1. The set of all such policies is denoted by Φ_j^H. It follows from
[49, Corollary 6.20 and Theorem 7.5] that the average reward of process P_j^H is
equivalent to the long-run average reward of the system.
Now an application of the g-revised reward [49, Theorem 7.6, Theorem 7.7]
yields the following corollary.
Corollary 4.1. For a server j as defined in Section 4.2 and a given ν < +∞, with
R_j(n_j) = µ_j(n_j) − e* e_j(n_j) < +∞, there exists a real g, with R_j^g(n_j) = R_j(n_j) − g,
such that if a policy φ_j* ∈ Φ_j^H maximizes V_j^{φ_j,ν}(n_j, R_j^g), then φ_j* also maximizes the
long-run average reward of server j with reward rate R_j(n_j) − a_j^{φ_j*}(n_j) ν, n_j ∈ N_j,
among all policies in Φ_j^H. In particular, this value of g, denoted by g*, is equivalent
to the above maximized long-run average reward.
In other words, if we compare the maximized average reward of process P_j^H
under policy φ_j* with that under a policy φ_j^0 with a_j^{φ_j^0}(0) = 0 (and all the actions for
non-zero states the same as under φ_j*), then the one with the higher average reward is
the optimal policy for (4.9). Note that, in our server farm model, if a_j^{φ_j^0}(0) = 0, the
actions for non-zero states are immaterial, since the corresponding server (queue) will
never leave state 0.
We start by finding this φ_j* as in Corollary 4.1. Let V_j^ν(n_j, R_j^g) = sup_{φ_j} V_j^{φ_j,ν}(n_j, R_j^g).
The maximization of V_j^{φ_j,ν}(n_j, R_j^g) can be written using the Hamilton-Jacobi-Bellman
equation as

    V_j^ν(n_j, R_j^g) = max_{φ_j} { ( R_j(n_j) − g − ν a_j^{φ_j}(n_j) ) τ_j^{φ_j}(n_j) + Σ_{n∈N_j} P_j^{φ_j}(n_j, n) V_j^ν(n, R_j^g) },    (4.10)
where τ_j^{φ_j}(n_j) is the expected sojourn time in state n_j under φ_j, and P_j^{φ_j}(n_j, n),
n_j, n ∈ N_j, is the transition probability from state n_j to state n at the next epoch.
For the multi-queue system, policy φ_j comprises a sequence of a_j^{φ_j}(n_j) for n_j ∈ N_j.
We rewrite (4.10) as
    V_j^ν(n_j, R_j^g) = max{ ( R_j^g(n_j) − ν ) τ_j^1(n_j) + Σ_{n∈N_j} P_j^1(n_j, n) V_j^ν(n, R_j^g),
                             R_j^g(n_j) τ_j^0(n_j) + Σ_{n∈N_j} P_j^0(n_j, n) V_j^ν(n, R_j^g) },    (4.11)
where τ_j^1(n_j) and τ_j^0(n_j) are the expected sojourn times in state n_j for a_j^{φ_j}(n_j) = 1 and
a_j^{φ_j}(n_j) = 0, respectively, and P_j^1(n_j, n) and P_j^0(n_j, n), n_j, n ∈ N_j, are the transition
probabilities for a_j^{φ_j}(n_j) = 1 and a_j^{φ_j}(n_j) = 0, respectively.
For (4.11), there is a specific ν, referred to as ν_j*(n_j, R_j^g), satisfying

    ν_j*(n_j, R_j^g) τ_j^1(n_j) = Σ_{n∈N_j} P_j^1(n_j, n) V_j^ν(n, R_j^g) − Σ_{n∈N_j} P_j^0(n_j, n) V_j^ν(n, R_j^g)
                                  + R_j^g(n_j) ( τ_j^1(n_j) − τ_j^0(n_j) ).    (4.12)
For an indexable server j, we define a policy as follows:

    if ν < ν_j*(n_j, R_j^g), j will be tagged;
    if ν > ν_j*(n_j, R_j^g), j will be untagged; and
    if ν = ν_j*(n_j, R_j^g), j can be either tagged or untagged.    (4.13)
The quantities ν_j*(n_j, R_j^g), n_j ∈ N_j, j ∈ J, constitute the Whittle index [33] in this
context, and (4.13) defines the optimal solution for Problem (4.9). According to
(4.12), although the value of ν_j*(n_j, R_j^g) may appear to rely on ν, we prove later on
that here the value of ν_j*(n_j, R_j^g) can be expressed in closed form and is independent
of ν, and that the server farm is indexable according to the definition in [33].
Proposition 4.1. For the system defined in Section 4.2, for each j ∈ J, we have

    ν_j*(n_j, R_j^g) = λ ( µ_j − e* e_j − g ) / µ_j,   n_j = 1,2,...,B_j − 1.    (4.14)
Proof. The proof is given in Appendix B.1.
Moreover, the optimal policy φ_j* that maximizes V_j^{φ_j,ν}(n_j, R_j^g) also maximizes
the average reward of process P_j^H, with the value of g specified in Corollary 4.1,
among all policies in Φ_j^H. For the optimal ν-subsidy policy, it remains to compare
φ_j* with a_j^{φ_j*}(0) = 1 against φ_j^0 with a_j^{φ_j^0}(0) = 0.
Proposition 4.2. For the system defined in Section 4.2, for each j ∈ J, we have

    ν_j*(0, R_j^g) = (λ / µ_j) ( µ_j − e* e_j + e* e_j^0 ).    (4.15)
Proof. The proof is given in Appendix B.2.
Now Proposition 4.3 is a straightforward consequence of Propositions 4.1 and
4.2.
Proposition 4.3. For the system defined in Section 4.2, if job sizes are exponentially
distributed, then the Whittle index of server j at state n_j is

    ν_j*(n_j, R_j^g) = λ ( 1 − e* (e_j − e_j^0) / µ_j ),   n_j = 0,1,...,B_j − 1.    (4.16)

Thus, the system is indexable.
Proof. The proof is given in Appendix B.3.
It is clear that Whittle's index policy, which prioritizes the server(s) with the
highest index value at each decision epoch, is equivalent to the MAIP policy defined
in (4.4) when job sizes are exponentially distributed. The form of the Whittle index
for general job-size distributions remains unclear. In Section 4.5, we
numerically demonstrate the sensitivity of MAIP to highly varying job sizes.
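The closed form in (4.16) can be rewritten as λ(1 − e*/(µ_j/(e_j − e_j^0))), which is strictly increasing in the effective energy efficiency µ_j/(e_j − e_j^0); ranking servers by the index therefore reproduces the MAIP ranking. A quick numerical check, with illustrative values for λ, e* and the server triples (none of which come from the thesis):

```python
def whittle_index(lam, e_star, mu, e, e0):
    """Closed-form Whittle index from (4.16); it is the same for every
    controllable state n_j = 0, 1, ..., B_j - 1 of server j."""
    return lam * (1.0 - e_star * (e - e0) / mu)

lam, e_star = 1.0, 0.3                               # illustrative values
servers = [(1.0, 2.0, 1.0),                          # (mu_j, e_j, e_j^0)
           (1.0, 2.5, 2.0),
           (2.0, 3.0, 0.5)]

idx = [whittle_index(lam, e_star, *s) for s in servers]
eff = [mu / (e - e0) for mu, e, e0 in servers]

# Both quantities induce the same priority order over the servers.
order_by_index = sorted(range(len(servers)), key=lambda j: -idx[j])
order_by_eff = sorted(range(len(servers)), key=lambda j: -eff[j])
assert order_by_index == order_by_eff
print(order_by_index)    # [1, 0, 2]
```

Because the index does not depend on n_j, the policy needs only to know whether each server has a vacancy, consistent with the binary state information required by MAIP.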
4.4.4 Asymptotic Optimality
We will prove the asymptotic optimality of MAIP as the number of servers becomes
large, when job sizes are exponentially distributed and the number of servers is scaled
under appropriate and reasonable conditions for large server farms (as discussed in
Section 1.1).
We will apply the proof methodology of Weber and Weiss [95] for the asymptotic
optimality of index policies to our problem though, as we have already stated in
Section 4.1, this proof cannot be directly applied to our problem because of the
presence of uncontrollable states. We define an additional (virtual) server, designated
as server K+1, to handle the blocking case when all actual servers are full; this server
has only one state (server K+1 never changes state) with zero reward rate. This
virtual server is only used in the proof of asymptotic optimality in this section. For
this server, |N_{K+1}^{0,1}| = 1 and N_{K+1}^{0} = ∅. In addition, we define 𝒦^+ = 𝒦 ∪ {K+1}
as the set of servers including this extra zero-reward server. The set of controllable
states of these K+1 servers is defined as Ñ^{0,1} = ∪_{j∈𝒦^+} N_j^{0,1}, and the set of
uncontrollable states is Ñ^{0} = ∪_{j∈𝒦^+} N_j^{0}.
In this section, those servers with identical buffer size, service rate, and energy
consumption rates are grouped into a server group, and we label these server groups as
server groups 1,2,...,K̄. For servers i, j of the same server group, N_i^{0,1} = N_j^{0,1}
and N_i^{0} = N_j^{0}. For clarity of presentation, we define Ñ_i^{0,1} and Ñ_i^{0}, i =
1,2,...,K̄, as, respectively, the sets of controllable and uncontrollable states of the servers
in server group i. We regard states of different server groups as different states; that
is, Ñ_j^{0,1} ∩ Ñ_i^{0,1} = ∅ and Ñ_j^{0} ∩ Ñ_i^{0} = ∅ for different server groups i and j,
i, j = 1,2,...,K̄. Let Z_i^φ(t) be the random variable representing the proportion of
servers in state i ∈ Ñ^{0,1} ∪ Ñ^{0} at time t under policy φ. As previously, we label
the states i ∈ Ñ^{0,1} ∪ Ñ^{0} as 1,2,...,I, where I = |Ñ^{0,1} ∪ Ñ^{0}|, and use Z^φ(t)
to denote the random vector (Z_1^φ(t), Z_2^φ(t),..., Z_I^φ(t)). Correspondingly, the actions
a_j^φ(n_j), n_j ∈ N_j, j ∈ 𝒦^+, correspond to actions a^φ(i), i ∈ Ñ^{0,1} ∪ Ñ^{0}.
Let $z, z' \in \mathbb{R}^I$ be instantiations of $Z^\phi(t)$, $t > 0$, $\phi \in \Phi$. A transition of the random vector $Z^\phi(t)$ from $z$ to $z'$ can be written as $z' = z + e_{i,i'}$, where $e_{i,i'}$ is a vector whose $i$th element is $+\frac{1}{K+1}$, whose $i'$th element is $-\frac{1}{K+1}$, and whose other elements are zero, $i, i' \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$. In particular, for the server farm defined in Section 4.2, servers in server group $j$ only appear in states $i \in \tilde{\mathcal{N}}^{\{0,1\}}_j \cup \tilde{\mathcal{N}}^{\{0\}}_j$; that is, a transition from $z$ to $z' = z + e_{i,i'}$ with $i \in \tilde{\mathcal{N}}^{\{0,1\}}_j \cup \tilde{\mathcal{N}}^{\{0\}}_j$, $i' \in \tilde{\mathcal{N}}^{\{0,1\}}_{j'} \cup \tilde{\mathcal{N}}^{\{0\}}_{j'}$, $j, j' = 1, 2, \ldots, K$, $j \neq j'$, never occurs. We address such impossible transitions by setting the corresponding transition probabilities to zero. The states $i \in \tilde{\mathcal{N}}^{\{0,1\}}$ are ordered according to descending index values, and all states $i \in \tilde{\mathcal{N}}^{\{0\}}$ follow the controllable states in the ordering, with $a^\phi(i) = 0$ for $i \in \tilde{\mathcal{N}}^{\{0\}}$. Then, we set the state $i \in \mathcal{N}^{\{0,1\}}_{K+1}$ of the zero-reward server, which is also a controllable state, to come after all the other controllable states but to precede the uncontrollable states. Because of the existence of the zero-reward server $K+1$, the number of servers in controllable states can always meet the constraint (4.7). Note that we artificially move the state of server $K+1$ and the uncontrollable states to places in the ordering that do not accord with their index values, which are zero. We will show later that such movements do not affect the long-run average performance of Whittle's index policy, which exists and is equivalent to MAIP in our context. The position of a state in the ordering $i = 1, 2, \ldots, I$ is also referred to as its label.
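The ordering just described can be made concrete with a small sketch; the state names and index values below are hypothetical, and the Whittle index values themselves are assumed to be given:

```python
# Sketch of the state ordering described above (state names and index values
# are hypothetical). Controllable states come first, in descending order of
# their Whittle index; the zero-reward server's state follows them; the
# uncontrollable states (whose index is zero by convention) come last.

def order_states(controllable, zero_reward_state, uncontrollable, whittle_index):
    """Return states in policy order; a state's position is its label."""
    ranked = sorted(controllable, key=lambda s: whittle_index[s], reverse=True)
    return ranked + [zero_reward_state] + list(uncontrollable)

# Hypothetical example: three controllable states with distinct index values.
idx = {"a": 2.0, "b": 3.5, "c": 1.0}
ordering = order_states(["a", "b", "c"], "z", ["u1", "u2"], idx)
# ordering == ["b", "a", "c", "z", "u1", "u2"]
```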
Let $g^{OR}(\phi)$ be the long-run average reward of the original problem (4.6) (that is, without relaxation) under policy $\phi$, and $g^{LR}(\phi)$ the long-run average reward of the relaxed problem (4.7) under policy $\phi$. In addition, let $g^{OR} = \max_\phi g^{OR}(\phi)$ be the maximal long-run average reward of the original problem, and $g^{LR} = \max_\phi g^{LR}(\phi)$ the maximal long-run average reward of the relaxed problem. From the definition of our system, $g^{LR}(\phi)/K,\ g^{OR}(\phi)/K \le \max_{j \in \mathcal{K},\, n_j \in \mathcal{N}_j} R_j(n_j) < +\infty$, where $R_j(n_j)$ is the reward rate of server $j$ in state $n_j$ as defined before. Let $\mathrm{index}$ denote Whittle's index policy; then $g^{OR}(\mathrm{index})/K \le g^{OR}/K \le g^{LR}/K$. Following the idea of [95], we prove that, under Whittle's index policy, $g^{OR}(\mathrm{index})/K - g^{LR}/K \to 0$ when $K$ is scaled in a certain way.
To demonstrate asymptotic optimality, we now describe the stationary policies, including Whittle's index policy, in another way. Let $u^\phi_i(z) \in [0,1]$, $z \in \mathbb{R}^I$, $i = 1, 2, \ldots, I$, be the probability for a server in state $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$ to be tagged ($a^\phi(i) = 1$) when $Z^\phi(t) = z$. Then, $1 - u^\phi_i(z)$ is the probability for a server in state $i$ to be untagged ($a^\phi(i) = 0$).
Define $\mathcal{N}^+_i$, $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$, to be the set of states that precede state $i$ in the ordering. Then, for Whittle's index policy, we obtain
$$u^{\mathrm{index}}_i(z) = \frac{1}{z_i} \min\left\{ z_i,\ \max\left\{ 0,\ \frac{1}{K+1} - \sum_{i' \in \mathcal{N}^+_i} z_{i'} \right\} \right\}. \qquad (4.17)$$
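Equation (4.17) can be evaluated by sweeping the states in label order and spending the tagging budget $1/(K+1)$ greedily; a small sketch with hypothetical state names follows (the convention $u_i = 0$ when $z_i = 0$ is immaterial, since no server then occupies state $i$):

```python
# Sketch of Eq. (4.17): u_i = (1/z_i) * min{ z_i, max{0, 1/(K+1) - sum of
# z_{i'} over states preceding i} }. `z` maps state -> proportion of servers
# in that state; `ordering` lists the states by label.

def u_index(z, ordering, K):
    u, cum = {}, 0.0
    budget = 1.0 / (K + 1)                     # proportion budget over K+1 servers
    for s in ordering:
        zi = z[s]
        room = max(0.0, budget - cum)          # budget left after preceding states
        u[s] = 0.0 if zi == 0.0 else min(zi, room) / zi
        cum += zi
    return u

# Three states with K = 2 (budget 1/3): the first state is fully tagged,
# the second only partially, and the third not at all.
u = u_index({"s1": 0.25, "s2": 0.25, "s3": 0.5}, ["s1", "s2", "s3"], K=2)
```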
Our multi-queue system is stable, since any stationary policy leads to an irreducible Markov chain for the associated process and the number of states is finite. Then, for a policy $\phi \in \Phi$, the vector $X^\phi(t)$ converges in distribution, as $t \to \infty$, to a random vector $X^\phi$. In the equilibrium regime, let $\pi^\phi_j$ be the steady-state distribution of $X^\phi_j$ for server $j$, $j \in \mathcal{K}^+$, under policy $\phi \in \Phi$, where $\pi^\phi_j(i)$, $i \in \mathcal{N}_j$, is the steady-state probability of state $i$. For clarity of presentation, we extend the vector $\pi^\phi_j$ to a vector of length $I$, written $\bar{\pi}^\phi_j$, whose $i$th element is $\pi^\phi_j(i)$ if $i \in \mathcal{N}_j$, and zero otherwise. The long-run expected value of $Z^\phi(t)$ is $\sum_{j=1}^{K+1} \bar{\pi}^\phi_j / (K+1)$. In the server farm defined in Section 4.2, the long-run expected value of $Z^\phi(t)$ is a member of the set
$$\mathcal{Z} = \left\{ z \in \mathbb{R}^I \;\middle|\; \sum_{i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}} z_i = 1,\ \ z_i \ge 0\ \ \forall i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}} \right\}. \qquad (4.18)$$
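The averaging of the extended steady-state vectors can be illustrated numerically; the state space and the per-server distributions below are hypothetical:

```python
# Illustrative check that the long-run mean of Z lies in the set of Eq.
# (4.18): extend each server's steady-state distribution to the full state
# list (zeros for states the server never visits), average over the K+1
# servers, and verify that the result is a probability vector over states.
# The server state spaces and probabilities here are hypothetical.

states = ["g1_0", "g1_1", "g2_0", "g2_1", "z"]       # labels 1..I
pi = [
    {"g1_0": 0.6, "g1_1": 0.4},                      # server 1 (group 1)
    {"g2_0": 0.3, "g2_1": 0.7},                      # server 2 (group 2)
    {"z": 1.0},                                      # zero-reward server
]
K_plus_1 = len(pi)
z_bar = {s: sum(p.get(s, 0.0) for p in pi) / K_plus_1 for s in states}

# Membership in the set of Eq. (4.18): elements sum to one, all nonnegative.
in_Z = abs(sum(z_bar.values()) - 1.0) < 1e-12 and all(v >= 0.0 for v in z_bar.values())
```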
Write $q^1(z, z_i, z'_i)$ and $q^0(z, z_i, z'_i)$, $z \in \mathcal{Z}$, $i \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}$, for the average transition rate of the $i$th element of vector $z$ from $z_i$ to $z'_i$ under the tagged and untagged actions, respectively. Then, the average transition rate of the $i$th element of $z$ under policy $\phi$ is given by
$$q^\phi(z, z_i, z'_i) = u^\phi_i(z)\, q^1(z, z_i, z'_i) + \bigl(1 - u^\phi_i(z)\bigr)\, q^0(z, z_i, z'_i).$$
We consider the following differential equation for a process, denoted by $z^\phi(t) = (z^\phi_1(t), z^\phi_2(t), \ldots, z^\phi_I(t))$:
$$\frac{d z^\phi_i(t)}{dt} = \sum_{z'_i} \Bigl[ z'_i(t)\, q^\phi\bigl(z^\phi(t), z'_i, z^\phi_i(t)\bigr) - z^\phi_i(t)\, q^\phi\bigl(z^\phi(t), z^\phi_i(t), z'_i\bigr) \Bigr]. \qquad (4.19)$$
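The structure of (4.19) can be illustrated numerically: Euler-integrating the rate-balance equation for a hypothetical two-state server (idle $\leftrightarrow$ busy, with made-up rates) drives the proportions to the point where the bracketed terms cancel:

```python
# Minimal numerical sketch of Eq. (4.19): Euler-integrate
# dz_i/dt = sum_j [ z_j q(j -> i) - z_i q(i -> j) ] for a hypothetical
# two-state server and check that it settles at the global-balance point.

def step(z, q, dt):
    n = len(z)
    dz = [sum(z[j] * q[j][i] - z[i] * q[i][j] for j in range(n)) for i in range(n)]
    return [z[i] + dt * dz[i] for i in range(n)]

lam, mu = 1.0, 2.0                    # made-up idle -> busy and busy -> idle rates
q = [[0.0, lam], [mu, 0.0]]           # q[i][j]: transition rate from state i to j
z = [1.0, 0.0]                        # start with every server idle
for _ in range(20000):
    z = step(z, q, 1e-3)
# At equilibrium dz/dt = 0, i.e. z = (mu, lam) / (lam + mu) = (2/3, 1/3).
```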
Because of global balance at an equilibrium point of $\lim_{t \to +\infty} \int_0^t z^\phi(u)\, du / t$, if it exists, denoted by $z^\phi$, we have $d z^\phi(t)/dt \big|_{z^\phi(t) = z^\phi} = 0$. Let $\mathrm{OPT}$ represent the optimal solution of the relaxed problem (4.7), and recall that $\mathrm{index}$ represents Whittle's index policy. Since $u^{\mathrm{index}}_i(z^{\mathrm{index}}) = u^{\mathrm{OPT}}_i(z^{\mathrm{index}})$, following the proof of [95, Theorem 2], we obtain $d z^{\mathrm{OPT}}(t)/dt \big|_{z^{\mathrm{OPT}}(t) = z^{\mathrm{index}}} = 0$ and $z^{\mathrm{index}} = z^{\mathrm{OPT}}$, if both $z^{\mathrm{index}}$ and $z^{\mathrm{OPT}}$ exist. The existence of $z^{\mathrm{index}}$ and $z^{\mathrm{OPT}}$ will be discussed later.
For a small $\delta > 0$, we define $R^{\delta,\phi}$ as the average reward rate during the time period in which $|Z^\phi(t) - z^\phi(t)| \le \delta$ under policy $\phi$ with $Z^\phi(0) = z^\phi(0)$, and $R^m/K = \sup_\phi \limsup_{t \to +\infty} |R(X^\phi(t))/K| < +\infty$ is an upper bound on the absolute value of the reward rate divided by $K$. Then,
$$\frac{g^{LR}}{K} - \frac{g^{OR}(\mathrm{index})}{K} \le \lim_{\delta \to 0} \lim_{t \to +\infty} \frac{1}{t} \int_0^t \biggl[ \frac{R^m}{K}\, \mathbb{P}\bigl\{ |Z^{\mathrm{OPT}}(u) - z^{\mathrm{OPT}}(u)| > \delta \bigr\} + \frac{R^m}{K}\, \mathbb{P}\bigl\{ |Z^{\mathrm{index}}(u) - z^{\mathrm{index}}(u)| > \delta \bigr\} + \frac{R^{\delta,\mathrm{OPT}}}{K}\, \mathbb{P}\bigl\{ |Z^{\mathrm{OPT}}(u) - z^{\mathrm{OPT}}(u)| \le \delta \bigr\} - \frac{R^{\delta,\mathrm{index}}}{K}\, \mathbb{P}\bigl\{ |Z^{\mathrm{index}}(u) - z^{\mathrm{index}}(u)| \le \delta \bigr\} \biggr] du. \qquad (4.20)$$
The server farm is decomposed into $K$ server groups, with the number of servers in the $i$th group denoted by $K_i$, $i = 1, 2, \ldots, K$. Then, $K = \sum_{i=1}^{K} K_i$. Following the proof of [95, Proposition], for any $K_i = K^0_i n$, $K^0_i = 1, 2, \ldots$, $i = 1, 2, \ldots, K$, $n = 1, 2, \ldots$, $\delta > 0$, and $\phi$ set to be either $\mathrm{index}$ or $\mathrm{OPT}$,
$$\lim_{n \to +\infty} \lim_{t \to +\infty} \frac{1}{t} \int_0^t \mathbb{P}\bigl\{ |Z^\phi(u) - z^\phi(u)| > \delta \bigr\}\, du = 0. \qquad (4.21)$$
We provide a justification of (4.21) in Appendix B.4, following [114, Chapter 7]. Then, as $n \to +\infty$, the existence of an equilibrium point of $\lim_{t \to +\infty} \int_0^t Z^\phi(u)\, du / t$ leads to the existence of $z^\phi = \lim_{t \to +\infty} \int_0^t z^\phi(u)\, du / t$ (using the Lipschitz continuity of the right-hand side of Equation (4.19) as a function of $z^\phi(t)$). We obtain
$$\lim_{n \to +\infty} \lim_{\delta \to 0} \left( \frac{R^{\delta,\mathrm{OPT}}}{K} - \frac{R^{\delta,\mathrm{index}}}{K} \right) = 0,$$
and
$$\lim_{n \to +\infty} \left( \frac{g^{LR}}{K} - \frac{g^{OR}(\mathrm{index})}{K} \right) = 0. \qquad (4.22)$$
Finally, $g^{OR}(\mathrm{index})/K - g^{OR}/K \to 0$ as $n \to +\infty$; that is, MAIP (Whittle's index policy) approaches the optimal solution in terms of energy efficiency as the number of servers in each server group tends to infinity at the appropriate rate.
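The sandwich argument above can be summarized in one line:

```latex
% index is feasible for the original problem, and relaxation only helps:
\[
  \frac{g^{OR}(\mathrm{index})}{K} \;\le\; \frac{g^{OR}}{K} \;\le\; \frac{g^{LR}}{K},
\]
% so the limit in (4.22) squeezes the middle term:
\[
  \lim_{n\to+\infty}\left(\frac{g^{LR}}{K}-\frac{g^{OR}(\mathrm{index})}{K}\right)=0
  \quad\Longrightarrow\quad
  \frac{g^{OR}}{K}-\frac{g^{OR}(\mathrm{index})}{K}\;\to\;0 .
\]
```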
4.5 Numerical results
In this section, we provide extensive numerical results obtained by simulation to
evaluate the performance of the MAIP policy. All results are presented in the form of
an observed mean from multiple independent runs of the corresponding experiment.
The confidence intervals at the 95% level based on the Student’s t-distribution are
maintained within ±5% of the observed mean. For convenience of describing the
results, given two numerical quantities $x > 0$ and $y > 0$, we define the relative difference of $x$ to $y$ as $(x - y)/y$.
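The definition just given is used for all comparisons in this section; a one-line helper makes it unambiguous:

```python
def relative_difference(x, y):
    """Relative difference of x to y, (x - y) / y, for x, y > 0."""
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    return (x - y) / y

# e.g. an energy efficiency of 1.45 against a baseline of 1.0 is a
# relative difference of 45%.
```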
In all experiments, we have a system of servers that are divided into three server groups. Servers in each server group $i$, $i = 1, 2, 3$, have the same buffer size, service rate, energy consumption rate and idle energy consumption rate, denoted by $B_i$, $\mu_i$, $e_i$ and $e^0_i$, respectively. We consider this to be a realistic setting, since in practice a server farm is likely to comprise multiple servers of the same type purchased at the same time. Unless otherwise specified, we assume that job sizes are exponentially distributed.
Recall that, as defined in Section 4.2, the job throughput is the average job
departure rate (jobs per second), the power consumption is the average energy
consumption rate (Watt), and the energy efficiency is the ratio of job throughput to
power consumption (jobs per Watt second). Also, we have normalized the average
job size to one (Byte) in Section 4.2.
4.5.1 Effect of Idle Power
Recall that MAIP is designed to take into account the effect of idle power. To
demonstrate the effect of idle power on job assignment, here we consider a baseline
policy, called Most energy-efficient available server first Neglecting Idle Power
(MNIP). As its name suggests, MNIP is a straightforward variant of MAIP that
neglects idle power and hence treats $e^0_j = 0$ for all $j \in \mathcal{K}$ in the process of selecting servers for job assignment. We compare MAIP with MNIP in terms of energy efficiency, job throughput and power consumption under various system parameters.
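The selection step shared by MAIP and MNIP can be sketched as a ranked dispatcher over binary availability states. The rank values below are placeholders: MAIP's actual ranking accounts for idle power (and MNIP computes the same ranking with the idle power set to zero), with the precise definitions given earlier in the chapter.

```python
# Sketch of a ranked dispatcher over binary server states, in the spirit of
# MAIP/MNIP: each server j carries one bit, available[j] (buffer not full),
# and a new job goes to the available server with the highest precomputed
# rank. The rank values here are placeholders -- MAIP's actual ranking
# accounts for idle power, while MNIP sets the idle power to zero.

def assign(available, rank):
    """Index of the best-ranked available server, or None if all are full."""
    candidates = [j for j, a in enumerate(available) if a]
    return max(candidates, key=lambda j: rank[j]) if candidates else None

rank = [3.0, 1.0, 2.0]              # placeholder ranking, computed offline
# With server 0 full, a new job goes to server 2, the best available one.
```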
For the set of experiments in Fig. 4.1, each server group has 15 servers, where we set $B_i = 10$ and $e^0_i/e_i = 0.4i - 0.3$ for $i = 1, 2, 3$, and randomly generate $\mu_i$ and $e_i$ as $\mu_1 = 6.86$, $e_1 = 6.86$, $\mu_2 = 3.64$, $e_2 = 3.72$, $\mu_3 = 2.87$, $e_3 = 3.15$. The normalized offered traffic $\rho$ is varied from 0.01 to 0.9.
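The idle-power settings implied by the stated ratios can be computed directly from the generated values:

```python
# Idle energy consumption rates for the Fig. 4.1 experiments, computed from
# the stated ratios e0_i / e_i = 0.4 i - 0.3 and the generated e_i values.
mu = {1: 6.86, 2: 3.64, 3: 2.87}
e = {1: 6.86, 2: 3.72, 3: 3.15}
ratio = {i: 0.4 * i - 0.3 for i in (1, 2, 3)}   # 0.1, 0.5, 0.9
e0 = {i: ratio[i] * e[i] for i in (1, 2, 3)}    # idle energy consumption rates
```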
[Figure 4.1: three panels plotting the relative difference (%) against $\rho$.]
Fig. 4.1 Performance comparison with respect to normalized system load $\rho$. (a) Relative difference of $L^{MAIP}/E^{MAIP}$ to $L^{MNIP}/E^{MNIP}$. (b) Job throughput. (c) Relative difference of $E^{MAIP}$ to $E^{MNIP}$.
The results in Fig. 4.1 are presented in the form of the relative difference of
MAIP to MNIP in terms of each corresponding performance measure. We observe
in Fig. 4.1b that, in all cases, both policies have almost the same performance in job
throughput. We also observe in Fig. 4.1a and Fig. 4.1c that, in the cases where $\rho \to 0$ or $\rho \to 1$, the two policies are close to each other in terms of both energy efficiency
and power consumption. This is because in such trivial and extreme cases the system
is at all times either almost empty or almost fully occupied. However, in the realistic
[Figure 4.2: three panels plotting energy efficiency, job throughput and power consumption of MAIP and MNIP against the number of servers.]
Fig. 4.2 Performance comparison with respect to number of servers $K$. (a) Energy efficiency. (b) Job throughput. (c) Power consumption.
cases where $\rho$ is not too large and not too small, MAIP significantly outperforms MNIP, with a gain of over 45% in energy efficiency at $\rho = 0.4$ and $\rho = 0.5$.
In Fig. 4.2, we use the same settings as in Fig. 4.1, except that we fix the normalized offered traffic $\rho$ at 0.6 and vary the number of servers $K$ from 3 to 690.
Note that here we increase K by increasing the number of servers in each of the
three server groups. We observe in Fig. 4.2b that, under such a medium traffic load,
the service capacity is sufficiently large, so that with both MAIP and MNIP almost
all jobs can be admitted and hence the job throughput is almost identical to the
[Figure 4.3: three panels plotting the cumulative distribution of the relative difference (%) for $\beta = 0.5$, $\beta = 1$ and $\beta = 1.5$.]
Fig. 4.3 Cumulative distribution of relative difference of $L^{MAIP}/E^{MAIP}$ to $L^{MNIP}/E^{MNIP}$. (a) $\rho = 0.4$. (b) $\rho = 0.6$. (c) $\rho = 0.8$.
arrival rate for all values of K. As a result, the job throughput increases almost
linearly with respect to the number of servers K. We observe in Fig. 4.2c that, for
both policies, the power consumption also increases almost linearly with respect
to the number of servers K. However, it is clear that the power consumption of
MAIP increases at a significantly smaller rate than that of MNIP, which results in a
substantial improvement of the energy efficiency by nearly 36% in all cases as seen
in Fig. 4.2a.
For the set of experiments in Fig. 4.3, we again consider a system where each server group has 15 servers, and we set $B_i = 10$ for $i = 1, 2, 3$. We introduce a parameter $\beta$, where different values of $\beta$ lead to different levels of server heterogeneity. In particular, we consider three values, $\beta = 0.5, 1, 1.5$, and set $e^0_i/e_i = (0.4i - 0.3)^\beta$ for $i = 1, 2, 3$. The service rates $\mu_i$ are randomly generated from the range $[1, 10]$ and arranged in non-increasing order, i.e., $\mu_1 \ge \mu_2 \ge \mu_3$. For the energy consumption rates $e_i$, we first choose two real numbers $a_1$ and $a_2$ randomly from $[0.5, 1]$. Then, with $\mu_1/e_1 = 200$, we set $\mu_i/e_i = a^\beta_{i-1}\, \mu_{i-1}/e_{i-1}$ for $i = 2, 3$.
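The generation procedure above can be sketched as follows; this is our reading of the setup, and the helper and variable names are not from the thesis:

```python
import random

# Sketch of the random settings generation for Fig. 4.3 (our reading of the
# text): service rates drawn from [1, 10] and sorted non-increasingly, a_1
# and a_2 drawn from [0.5, 1], mu_1/e_1 fixed at 200, and
# mu_i/e_i = a_{i-1}^beta * mu_{i-1}/e_{i-1} for i = 2, 3.

def generate_settings(beta, rng=None):
    rng = rng or random.Random()
    mu = sorted((rng.uniform(1.0, 10.0) for _ in range(3)), reverse=True)
    a = [rng.uniform(0.5, 1.0) for _ in range(2)]
    eff = [200.0]                                # eff[i] = mu_{i+1} / e_{i+1}
    for i in (1, 2):
        eff.append((a[i - 1] ** beta) * eff[i - 1])
    e = [mu[i] / eff[i] for i in range(3)]
    return mu, e, eff
```

Since $a_{i-1} \le 1$, the per-server energy efficiency $\mu_i/e_i$ is non-increasing in $i$, and larger $\beta$ widens the gap between groups.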
The results in Fig. 4.3 are obtained from 1000 experiments and are plotted in the form of the cumulative distribution of the relative difference of MAIP to MNIP in terms of energy efficiency. We observe in Fig. 4.3 that MAIP significantly outperforms MNIP by up to 60%. It can also be observed from Fig. 4.3a and Fig. 4.3b that MAIP outperforms MNIP by more than 10% in nearly 100% of the experiments for the case $\beta = 0.5$. In addition, we observe in Fig. 4.3 that, as the level of server heterogeneity (i.e., the value of $\beta$) becomes higher, the performance improvement of MAIP over MNIP in general becomes larger, although the gain diminishes as the normalized offered traffic $\rho$ approaches 0.8, similar to what is observed in Fig. 4.1a.
4.5.2 Effect of Jockeying Cost
Recall that MAIP is designed as a non-jockeying policy, which is more appropriate
than jockeying policies for job assignment in a large-scale server farm. As discussed
in Section 4.1, jockeying policies suit a small server farm where the cost associated
with jockeying is negligible. In large-scale systems, the cost associated with jockey-
ing can be significant and may have a snowball effect on the system performance.
Here, we demonstrate the benefits of MAIP in a server farm where jockeying costs
are high, by comparing it with a jockeying policy known as Most Energy Efficient
Server First (MEESF) proposed in [22].
The settings of servers in each of the three server groups are based on the
benchmark results of Dell PowerEdge rack servers R610 (August 2010), R620
(May 2012) and R630 (April 2015) [115]. Specifically, we normalize µ3 and e3 to
one and then set $\mu_1/\mu_3 = 3.5$, $e_1/e_3 = 1.2$, $e^0_1/e_1 = 0.2$, $\mu_2/\mu_3 = 1.4$, $e_2/e_3 = 1.1$, $e^0_2/e_2 = 0.2$ and $e^0_3/e_3 = 0.3$. We also set $B_i = 10$ for $i = 1, 2, 3$ and $\rho = 0.6$. The
number of servers K is varied from 3 to 270, where we increase K by increasing the
number of servers in each of the three server groups.
Let us assume that each jockeying action incurs a (constant) delay $D$. That is, when a job is reassigned from server $i$ to server $j$, it is suspended for a period $D$ before being resumed on server $j$. Clearly, when $D > 0$, this is equivalent to increasing the size of the job and hence its service requirement. Accordingly, for a given system, a non-zero cost per jockeying action effectively increases the traffic load. We consider three values of $D$: $D = 0$ corresponds to zero jockeying cost, $D = 0.0005$ to a relatively small cost per jockeying action, and $D = 0.01$ to a large cost per jockeying action. The results are
presented in Fig. 4.4 for the energy efficiency and in Fig. 4.5 for the job throughput.
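The inflation of the service requirement described above can be sketched as follows; the linear accounting and the unit server speed are simplifying assumptions of this sketch, not the thesis's model:

```python
# Sketch of the jockeying-cost accounting described above: a job that is
# reassigned n_jockeys times, each reassignment suspending it for a constant
# delay D, occupies the system as if its size were inflated. Linearity in
# the number of reassignments and a unit server speed are simplifying
# assumptions of this sketch.

def effective_size(size, n_jockeys, delay, speed=1.0):
    """Service-requirement equivalent of a job that jockeys n_jockeys times."""
    return size + n_jockeys * delay * speed

# A unit-size job reassigned 4 times with D = 0.01 behaves like a job of
# size 1.04 on a unit-speed server.
```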
For the case where D = 0, we have a similar observation in Fig. 4.5a to that in
Fig. 4.2b. That is, under a medium traffic load, the service capacity is sufficiently
large, so that both MAIP and MEESF yield a job throughput that is almost identical
to the arrival rate for all values of $K$. We observe in Fig. 4.4a that, in this case, MEESF consistently outperforms MAIP in terms of energy efficiency, though by a very small margin.
For the case where D = 0.0005, we observe in Fig. 4.5b that, since the cost per
jockeying action is relatively small, the service capacity turns out to be sufficiently
large so that the job throughput of MEESF is not affected. We also observe in
Fig. 4.4b that, when the number of servers K is small, the energy efficiency of
[Figure 4.4: three panels plotting the energy efficiency of MAIP and MEESF against the number of servers.]
Fig. 4.4 Performance comparison in terms of the energy efficiency with respect to the number of servers $K$. (a) $D = 0$. (b) $D = 0.0005$. (c) $D = 0.01$.
MEESF is still better than that of MAIP. However, when the number of servers $K$ is large, the energy efficiency of MEESF is clearly degraded. This is because, as $K$ increases, the average number of jockeying actions required per job increases. With a non-zero cost per jockeying action, this can substantially increase the power consumption, since we are forced to use more of the less energy-efficient servers to meet the increased traffic load.
The effect is more pronounced when $D$ is increased to 0.01. In this case, as shown in Fig. 4.4c and Fig. 4.5c, the cost associated with jockeying is so high that both the
[Figure 4.5: three panels plotting the job throughput of MAIP and MEESF against the number of servers.]
Fig. 4.5 Performance comparison in terms of the job throughput with respect to the number of servers $K$. (a) $D = 0$. (b) $D = 0.0005$. (c) $D = 0.01$.
job throughput and the energy efficiency of MEESF are significantly degraded, due
to the substantially increased traffic load.
4.5.3 Sensitivity to the Shape of Job-Size Distributions
The workload characterizations of many computer science applications, such as Web
file sizes, IP flow durations, and the lifetimes of supercomputing jobs, are known
to exhibit heavy-tailed distributions [27, 86]. Here, we examine whether the performance of MAIP is sensitive to the job-size distribution. To this end, in addition
[Figure 4.6: three panels plotting the cumulative distribution of the relative difference (%) for the deterministic, Pareto-1 and Pareto-2 distributions.]
Fig. 4.6 Cumulative distribution of relative difference of $L^{MAIP,F}/E^{MAIP,F}$ to $L^{MAIP,\mathrm{Exponential}}/E^{MAIP,\mathrm{Exponential}}$. (a) $\rho = 0.4$. (b) $\rho = 0.6$. (c) $\rho = 0.8$.
to the exponential distribution, we further consider three different distributions, i.e.,
deterministic, Pareto with the shape parameter set to 2.001 (Pareto-1 for short), and
Pareto with the shape parameter set to 1.98 (Pareto-2 for short). In all cases, we set
the mean to be one.
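The mean-one normalization can be sketched as follows. For a Pareto law with shape $\alpha > 1$ and scale $x_m$, the mean is $\alpha x_m / (\alpha - 1)$, so fixing the mean to one gives $x_m = (\alpha - 1)/\alpha$; this reading of the setup is our assumption, and the sampler names are not from the thesis:

```python
import random

# Sampling mean-one job sizes for the distributions considered in Fig. 4.6.
# Python's random.paretovariate(alpha) samples a Pareto law with unit scale
# (mean alpha / (alpha - 1)), so multiplying by x_m = (alpha - 1) / alpha
# fixes the mean to one.

def pareto_scale(alpha, mean=1.0):
    return mean * (alpha - 1.0) / alpha

def sample_job_size(dist, rng=random):
    if dist == "deterministic":
        return 1.0
    if dist == "exponential":
        return rng.expovariate(1.0)
    if dist == "pareto-1":                      # shape parameter 2.001
        return pareto_scale(2.001) * rng.paretovariate(2.001)
    if dist == "pareto-2":                      # shape parameter 1.98
        return pareto_scale(1.98) * rng.paretovariate(1.98)
    raise ValueError(dist)
```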
We use the same experiment settings as in Fig. 4.3 with $\beta = 1$. In each experiment,
we obtain the energy efficiency of MAIP, and compute the relative difference of
the one using each corresponding distribution to the one using the exponential
distribution. Fig. 4.6 plots the cumulative distribution of the relative difference results
obtained from the 1000 experiments for each particular value of the normalized
offered traffic $\rho$. We observe in Fig. 4.6 that all relative difference results are between $-5\%$ and 0. Given that the confidence intervals of these simulation results are maintained within $\pm 5\%$ of the observed mean at the 95% confidence level, the energy efficiency of MAIP appears to be largely insensitive to the job-size distribution.
4.6 Conclusion
We have studied the stochastic job assignment problem in a server farm comprising
multiple processor sharing servers with different service rates, energy consumption
rates and buffer sizes. Our aim has been to maximize the energy efficiency of the
entire system, defined as the ratio of the long-run average job departure rate to the
long-run average power consumption, by effectively assigning jobs/requests to these
servers. To this end, we have introduced the MAIP job assignment policy and have proved its equivalence to Whittle's index policy when job sizes are exponentially distributed. MAIP requires only binary state information for the servers, and can be implemented using a single binary variable for each server. This policy does not require any estimation or prediction of the average arrival rate. MAIP has been proven to approach optimality as the numbers of servers in the server groups tend to infinity when job sizes are exponentially distributed. This asymptotic property is well suited to a large-scale server farm that is likely to purchase and upgrade a large number of servers with the same style and attributes at the same time. The proof of asymptotic
optimality has been completed by applying the ideas of Weber and Weiss [95] to
our multi-queue system with inevitable uncontrollable states. Extensive numerical
results have illustrated the significant superiority of MAIP over MNIP (the baseline
policy) in a general situation where energy efficiency of each server differs from
its effective energy efficiency. MAIP has been shown numerically to give similar energy efficiency results for exponential and Pareto job-size distributions, which indicates that it is suitable for a server farm with highly varying job sizes.
Through a numerical example, we have shown that MAIP is more appropriate than MEESF for a server farm with non-zero jockeying cost, a result that is also relevant to real large-scale systems with significant job reassignment costs.
Chapter 5
Conclusions
In this thesis, we have analyzed energy-efficient job assignment policies in server
farms, aiming at maximization of the energy efficiency (the ratio of the long-run
average throughput to the long-run average power consumption). We have considered
a large-scale server farm consisting of heterogeneous servers, where an optimal job
assignment policy is computationally infeasible. We have sought and proposed
near-optimal policies with guaranteed performance in terms of energy efficiency.
Both theoretical and numerical analyses have been provided for these policies.
In Chapter 2, we have introduced the connection between the optimization
of expected e-revised cumulative reward and the optimization of the ratio of the
long-run average reward to the long-run average cost. Then, the corresponding
optimal solution can be approached by conventional value iteration or policy iteration
methodologies. In this way, an algorithm has been designed and implemented for
an optimal insensitive jockeying policy in terms of energy efficiency in Chapter 3.
This algorithm has been proved to have polynomial complexity but, according to our numerical results, it takes around five days to compute an optimal solution for a server farm with one thousand servers. Thus, the optimal insensitive jockeying policy is more appropriate as a benchmark in lab
experiments, and scalable near-optimal solutions are still necessary in modern server
farms with tens of thousands of servers or other computing components.
We have then studied two classes of job assignment: with and without jockeying. In Chapter 3, for the jockeying case, we have proposed a new approach that gives rise to an insensitive job assignment policy for the popular server farm model comprising a parallel system of finite-buffer PS queues with heterogeneous server speeds and energy consumption rates. The insensitivity has been achieved by constructing a logically combined queue based on the concept of the symmetric queue [75]. MEESF is a generalized version of the SSF policy proposed in [21], and has been proved, under a certain condition, to provide the highest instantaneous ratio of total service rate to total power consumption.
Unlike the straightforward MEESF approach, which greedily chooses the most energy-efficient servers for job assignment, one important feature of the more robust $E^*$ policy is to aggregate an optimal number of the most energy-efficient servers into a virtual server. The aggregate service rate in $E^*$ is controlled by a parameter $\hat{K}$. $E^*$ is designed to give preference to this virtual server and to utilize its service capacity in such a way that both the job throughput and the energy efficiency of the system can be improved. We have provided a rigorous analysis of the $E^*$ policy, showing that $E^*$ always has a higher job throughput than that of MEESF and that there exist realistic and sufficient conditions under which $E^*$ is guaranteed to outperform MEESF in terms of the energy efficiency of the system. Also, to seek the optimal value of $\hat{K}$ in $E^*$, we have proved that, under a certain condition, if the value of $\hat{K}$ lies within a given set determined by the job arrival rate, then the $E^*$ policy with this $\hat{K}$ value achieves energy efficiency higher than or equal to that of MEESF.
We have further proposed a rule of thumb (RM) to form the virtual server by
simply matching its aggregate service rate to the job arrival rate. Experiments based
on random settings have confirmed the effectiveness of the resulting RM policy.
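The rule of thumb above can be sketched as follows; ranking servers by $\mu_j/e_j$ is an assumption of this sketch, and the stopping rule simply covers the arrival rate with the smallest prefix of that ranking:

```python
# Sketch of the rule of thumb (RM) described above: aggregate the most
# energy-efficient servers until their total service rate first covers the
# job arrival rate. Using mu_j / e_j as the per-server energy efficiency is
# an assumption of this sketch.

def rm_virtual_server(mu, e, arrival_rate):
    """Indices of the servers forming the virtual server under RM."""
    order = sorted(range(len(mu)), key=lambda j: mu[j] / e[j], reverse=True)
    chosen, total = [], 0.0
    for j in order:
        if total >= arrival_rate:
            break
        chosen.append(j)
        total += mu[j]
    return chosen

# e.g. mu = [5, 3, 2], e = [1, 1, 1] and arrival rate 6: servers 0 and 1
# already provide an aggregate service rate of 8 >= 6.
```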
Noting that the fundamentally important model of parallel PS queues has broader
applications in communication systems, our proposed solution for insensitive energy-
efficient job assignment has potentially wider applicability to green communications
and networking.
In Chapter 4, for the non-jockeying case, we have studied the stochastic job
assignment problem in a server farm comprising multiple processor sharing servers
with different service rates, energy consumption rates and buffer sizes. Our aim has
been to maximize the energy efficiency of the entire system, defined as the ratio of the
long-run average job departure rate to the long-run average power consumption, by
effectively assigning jobs/requests to these servers. To this end, we have introduced the MAIP job assignment policy and have proved its equivalence to Whittle's index policy when job sizes are exponentially distributed. MAIP requires only binary state information for the servers, and can be implemented using a single binary variable for each server. This policy does not require any estimation or prediction of the average arrival rate. MAIP has been proven to approach optimality as the numbers of servers
in the server groups tend to infinity when job sizes are exponentially distributed. This asymptotic property is well suited to a large-scale server farm that is likely to purchase and upgrade a large number of servers with the same style and attributes at the same time. The proof of asymptotic optimality has been completed by
applying the ideas of Weber and Weiss [95] to our multi-queue system with inevitable
uncontrollable states. Extensive numerical results have illustrated the significant
superiority of MAIP over MNIP (the baseline policy) in a general situation where
energy efficiency of each server differs from its effective energy efficiency. MAIP
has been shown numerically to give similar energy efficiency results for exponential and Pareto job-size distributions, which indicates that it is suitable for a server farm with highly varying job sizes. Through a numerical example, we have shown that MAIP is more appropriate than MEESF for a server farm with non-zero jockeying cost, a result that is also relevant to real large-scale systems with significant job reassignment costs.
References
[1] M. Arregoces and M. Portolani, Data center fundamentals. Cisco Press,
2003.
[2] U.S. Environmental Protection Agency, Washington, DC, USA, “EPA report on server and data center energy efficiency,” Tech. Rep., 2007.
[3] The Climate Group, “SMART 2020: Enabling the low carbon economy in the information age,” Tech. Rep., 2008.
[4] A. Beloglazov, R. Buyya, Y. C. Lee, A. Zomaya et al., “A taxonomy and survey
of energy-efficient data centers and cloud computing systems,” Advances in
Computers, vol. 82, no. 2, pp. 47–111, 2011.
[5] Emerson Network Power, “State of the data center 2011,” 2011.
[Online]. Available: http://www.emersonnetworkpower.com/en-US/Solutions/
infographics/Pages/2011DataCenterState.aspx
[6] Cisco, “The zettabyte era: Trends and analysis,” May 2015. [Online].
Available: http://www.cisco.com/c/en/us/solutions/collateral/service-provider/
visual-networking-index-vni/VNI_Hyperconnectivity_WP.pdf
[7] Natural Resources Defense Council, “America’s data centers consuming
massive and growing amounts of electricity,” Aug. 2014. [Online]. Available:
http://www.nrdc.org/media/2014/140826.asp
[8] D. Kliazovich, P. Bouvry, F. Granelli, and N. L. S. Fonseca, “Energy con-
sumption optimization in cloud data centers,” Cloud Services, Networking
and Management, 2015.
[9] I. Mitrani, “Managing performance and power consumption in a server farm,”
Ann. Oper. Res., vol. 202, no. 1, pp. 121–134, Jan. 2013.
[10] F. Yao, A. Demers, and S. Shenker, “A scheduling model for reduced CPU
energy,” in Proc. IEEE FOCS, Washington, D.C., USA, Jul. 1995, pp. 374–
382.
[11] A. Wierman, L. L. Andrew, and A. Tang, “Power-aware speed scaling in pro-
cessor sharing systems: Optimality and robustness,” Performance Evaluation,
vol. 69, no. 12, pp. 601–622, Dec. 2012.
[12] N. Bansal, K. Pruhs, and C. Stein, “Speed scaling for weighted flow time,”
SIAM J. Comput., vol. 39, no. 4, pp. 1294–1308, Oct. 2009.
[13] N. Bansal, H.-L. Chan, and K. Pruhs, “Speed scaling with an arbitrary power
function,” ACM Transactions on Algorithms (TALG), vol. 9, no. 2, p. 18, Mar.
2013.
[14] L. L. H. Andrew, M. Lin, and A. Wierman, “Optimality, fairness, and ro-
bustness in speed scaling designs,” in Proc. ACM SIGMETRICS, Columbia
University, New York, USA, Jun. 2010, pp. 37–48.
[15] M. Andrews, S. Antonakopoulos, and L. Zhang, “Energy-aware scheduling
algorithms for network stability,” in Proc. IEEE INFOCOM, Shanghai, China,
Apr. 2011, pp. 1359–1367.
[16] S. Albers, F. Müller, and S. Schmelzer, “Speed scaling on parallel processors,”
Algorithmica, vol. 68, no. 2, pp. 404–425, Feb. 2014.
[17] A. Gandhi and M. Harchol-Balter, “How data center size impacts the effec-
tiveness of dynamic power management,” in Proc. 2011 49th Annual Allerton
Conference on Communication, Control, and Computing (Allerton). Monti-
cello, IL, USA: IEEE, Sep. 2011, pp. 1164–1169.
[18] M. Lin, A. Wierman, L. L. H. Andrew, and E. Thereska, “Dynamic right-
sizing for power-proportional data centers,” IEEE/ACM Trans. Netw., vol. 21,
no. 5, pp. 1378–1391, Oct. 2013.
[19] E. Hyytia, R. Righter, and S. Aalto, “Task assignment in a heterogeneous
server farm with switching delays and general energy-aware cost structure,”
Performance Evaluation, vol. 75-76, pp. 17–35, 2014.
[20] A. Gandhi, M. Harchol-Balter, R. Das, and C. Lefurgy, “Optimal power
allocation in server farms,” in ACM SIGMETRICS 2009, Seattle, USA, Jun.
2009, pp. 157–168.
[21] Z. Rosberg, Y. Peng, J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman,
“Insensitive job assignment with throughput and energy criteria for processor-
sharing server farms,” IEEE/ACM Trans. Netw., vol. 22, no. 4, pp. 1257–1270,
Aug. 2014.
[22] J. Fu, J. Guo, E. W. M. Wong, and M. Zukerman, “Energy-efficient heuristics
for insensitive job assignment in processor-sharing server farms,” IEEE J. Sel.
Areas Commun., vol. 33, no. 12, pp. 2878–2891, Dec. 2015.
[23] W. Q. M. Guo, A. Wadhawan, L. Huang, and J. T. Dudziak, “Server
farm management,” Jan. 2014, US Patent 8,626,897. [Online]. Available:
http://www.google.com/patents/US8626897
[24] V. Gupta, M. Harchol-Balter, K. Sigman, and W. Whitt, “Analysis of join-
the-shortest-queue routing for web server farms,” Perform. Eval., vol. 64, no.
9-12, pp. 1062–1081, Oct. 2007.
[25] E. Altman, U. Ayesta, and B. J. Prabhu, “Load balancing in processor sharing
systems,” Telecommun. Syst., vol. 47, no. 1-2, pp. 35–48, Jun. 2011.
[26] K. Sigman, “Processor sharing queues,” 2013. [Online]. Available:
http://www.columbia.edu/~ks20/4404-Sigman/4404-Notes-PS.pdf
[27] M. E. Crovella and A. Bestavros, “Self-similarity in world wide web traffic:
evidence and possible causes,” IEEE/ACM Trans. Netw., vol. 5, no. 6, pp.
835–846, Dec. 1997.
[28] S. Gunawardena and W. Zhuang, “Service response time of elastic data traffic
in cognitive radio networks,” IEEE J. Sel. Areas Commun., vol. 31, no. 3, pp.
559–570, Mar. 2013.
[29] F. Liu, K. Zheng, W. Xiang, and H. Zhao, “Design and performance analysis
of an energy-efficient uplink carrier aggregation scheme,” IEEE J. Sel. Areas
Commun., vol. 32, no. 2, pp. 197–207, Jan. 2014.
[30] J. Ousterhout, P. Agrawal, D. Erickson, C. Kozyrakis, J. Leverich, D. Maz-
ières, S. Mitra, A. Narayanan, G. Parulkar, M. Rosenblum, S. M. Rumble,
E. Stratmann, and R. Stutsman, “The case for RAMClouds: Scalable high-
performance storage entirely in DRAM,” SIGOPS Operating Systems Review,
vol. 43, no. 4, pp. 92–105, Dec. 2009.
[31] A. M. Caulfield, L. M. Grupp, and S. Swanson, “Gordon: Using flash memory
to build fast, power-efficient clusters for data-intensive applications,” ACM
SIGPLAN Notices, vol. 44, no. 3, pp. 217–228, Mar. 2009.
[32] L. A. Barroso and U. Hölzle, “The case for energy-proportional computing,”
Computer, vol. 40, no. 12, pp. 33–37, Dec. 2007.
[33] P. Whittle, “Restless bandits: Activity allocation in a changing world,” J. Appl.
Probab., vol. 25, pp. 287–298, 1988.
[34] C. H. Papadimitriou and J. N. Tsitsiklis, “The complexity of optimal queuing
network control,” Math. Oper. Res., vol. 24, no. 2, pp. 293–305, May 1999.
[35] J. Niño-Mora, “Dynamic allocation indices for restless projects and queue-
ing admission control: a polyhedral approach,” Mathematical programming,
vol. 93, no. 3, pp. 361–413, 2002.
[36] A. Gandhi, “Dynamic server provisioning for data center power management,”
Ph.D. dissertation, School of Computer Science, Carnegie Mellon University,
Jun. 2013.
[37] D. Wong and M. Annavaram, “Knightshift: scaling the energy proportionality
wall through server-level heterogeneity,” in Proc. MICRO. Vancouver,
Canada: IEEE, Dec. 2012, pp. 119–130.
[38] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. H. Andrew, “Greening
geographical load balancing,” in Proc. ACM SIGMETRICS. Portland, Oregon:
ACM, Jun. 2011, pp. 233–244.
[39] J. Li, Z. Li, K. Ren, and X. Liu, “Towards optimal electric demand manage-
ment for internet data centers,” IEEE Trans. Smart Grid, vol. 3, no. 1, pp.
183–192, Mar. 2012.
[40] Y. Yao, L. Huang, A. Sharma, L. Golubchik, and M. Neely, “Data centers
power reduction: A two time scale approach for delay tolerant workloads,” in
Proc. IEEE INFOCOM. Orlando, USA: IEEE, Mar. 2012, pp. 1431–1439.
[41] J. Cao, K. Li, and I. Stojmenovic, “Optimal power allocation and load distri-
bution for multiple heterogeneous multicore server processors across clouds
and data centers,” IEEE Trans. Comput., vol. 63, no. 1, pp. 45–58, Jan. 2014.
[42] Z. Liu, M. Lin, A. Wierman, S. H. Low, and L. L. H. Andrew, “Geographical
load balancing with renewables,” ACM SIGMETRICS PER, vol. 39, no. 3, pp.
62–66, Dec. 2011.
[43] P. Wang, Y. Qi, X. Liu, Y. Chen, and X. Zhong, “Power management in
heterogeneous multi-tier web clusters,” in Proc. ICPP. San Diego, USA:
IEEE, Sep. 2010, pp. 385–394.
[44] Q. Zhang, M. F. Zhani, S. Zhang, Q. Zhu, R. Boutaba, and J. L. Hellerstein,
“Dynamic energy-aware capacity provisioning for cloud computing environ-
ments,” in Proc. ICAC. San Jose, California, USA: ACM, Sep. 2012, pp.
145–154.
[45] T. Lu, M. Chen, and L. L. H. Andrew, “Simple and effective dynamic provi-
sioning for power-proportional data centers,” IEEE Trans. Parallel Distrib.
Syst., vol. 24, no. 6, pp. 1161–1171, Apr. 2013.
[46] K. M. Tarplee, R. Friese, A. A. Maciejewski, and H. J. Siegel, “Efficient and
scalable computation of the energy and makespan pareto front for heteroge-
neous computing systems,” in Proc. FedCSIS. Krakow, Poland: IEEE, Sep.
2013, pp. 401–408.
[47] T. Vondra and J. Sedivy, “Maximizing utilization in private iaas clouds with
heterogeneous load through time series forecasting,” International Journal on
Advances in Systems and Measurements, vol. 6, no. 1 and 2, pp. 149–165,
2013.
[48] H. Khazaei, J. Misic, and V. B. Misic, “Performance analysis of cloud com-
puting centers using M/G/m/m+r queuing systems,” IEEE Trans. Parallel
Distrib. Syst., vol. 23, no. 5, pp. 936–943, Apr. 2012.
[49] S. M. Ross, Applied probability models with optimization applications. New
York: Dover Publications, 1992.
[50] F. A. Haight, “Two queues in parallel,” Biom., vol. 45, no. 3-4, pp. 401–410,
1958.
[51] J. F. C. Kingman, “Two similar queues in parallel,” Ann. Math. Stat., vol. 32,
no. 4, pp. 1314–1323, Dec. 1961.
[52] W. Winston, “Optimality of the shortest line discipline,” J. Appl. Probab.,
vol. 14, no. 1, pp. 181–189, Mar. 1977.
[53] R. R. Weber, “On the optimal assignment of customers to parallel servers,” J.
Appl. Probab., vol. 15, no. 2, pp. 406–413, Jun. 1978.
[54] A. Ephremides, P. Varaiya, and J. Walrand, “A simple dynamic routing prob-
lem,” IEEE Trans. Autom. Control, vol. 25, no. 4, pp. 690–693, 1980.
[55] C. Knessl, B. Matkowsky, Z. Schuss, and C. Tier, “Two parallel M/G/1 queues
where arrivals join the system with the smaller buffer content,” IEEE Trans.
Comput., vol. 35, no. 11, pp. 1153–1158, Nov. 1987.
[56] A. Hordijk and G. Koole, “On the assignment of customers to parallel queues,”
Probability in the Engineering and Informational Sciences, vol. 6, no. 04, pp.
495–511, Oct. 1992.
[57] F. Bonomi, “On job assignment for a parallel system of processor sharing
queues,” IEEE Trans. Comput., vol. 39, no. 7, pp. 858–869, 1990.
[58] W. Whitt, “Deciding which queue to join: Some counterexamples,” Oper.
Res., vol. 34, no. 1, pp. 55–62, Jan./Feb. 1986.
[59] V. Gupta, “Stochastic models and analysis for resource management in server
farms,” Ph.D. dissertation, School of Computer Science, Carnegie Mellon
University, 2011.
[60] Y. Zhao and W. K. Grassmann, “Queueing analysis of a jockeying model,”
Oper. Res., vol. 43, no. 3, pp. 520–529, May/Jun. 1995.
[61] I. J. B. F. Adan, J. Wessels, and W. H. M. Zijm, “Matrix-geometric analysis of
the shortest queue problem with threshold jockeying,” Oper. Res. Lett., vol. 13,
no. 2, pp. 107–112, Mar. 1993.
[62] Y. Sakuma, “Asymptotic behavior for MAP/PH/c queue with shortest queue
discipline and jockeying,” Oper. Res. Lett., vol. 38, no. 1, pp. 7–10, Jan. 2010.
[63] E. Hyytiä, R. Righter, and S. Aalto, “Energy-aware job assignment in server
farms with setup delays under LCFS and PS,” in Proc. ITC 26, Sep. 2014, pp.
1–9.
[64] K. Li, X. Tang, and K. Li, “Energy-efficient stochastic task scheduling on het-
erogeneous computing systems,” IEEE Trans. Parallel Distrib. Syst., vol. 25,
no. 11, pp. 2867–2876, Nov. 2014.
[65] D. P. Bertsekas, Dynamic programming and optimal control. Athena Scientific,
Belmont, MA, 1995.
[66] S. A. Lippman, “Applying a new device in the optimization of exponential
queuing systems,” Oper. Res., vol. 23, no. 4, pp. 687–710, Aug. 1975.
[67] J. Wijngaard and S. Stidham, “Forward recursion for Markov decision pro-
cesses with skip-free-to-the-right transitions, part I: theory and algorithm,”
Math. Oper. Res., vol. 11, no. 2, pp. 295–308, May 1986.
[68] S. Stidham and R. R. Weber, “Monotonic and insensitive optimal policies
for control of queues with undiscounted costs,” Oper. Res., vol. 37, no. 4, pp.
611–625, Jul./Aug. 1989.
[69] J. M. George and J. M. Harrison, “Dynamic control of a queue with adjustable
service rate,” Oper. Res., vol. 49, no. 5, pp. 720–731, Sep.-Oct. 2001.
[70] F. Yao, A. Demers, and S. Shenker, “A scheduling model for reduced CPU
energy,” in Proc. IEEE FOCS, Milwaukee, WI, USA, Oct. 1995, pp. 374–382.
[71] N. Bansal, K. Pruhs, and C. Stein, “Speed scaling for weighted flow time,” in
Proc. ACM-SIAM SODA, New Orleans, LA, USA, Jan. 2007, pp. 805–813.
[72] S. Albers and H. Fujiwara, “Energy-efficient algorithms for flow time mini-
mization,” ACM TALG, vol. 3, no. 4, p. 49, Nov. 2007.
[73] N. Bansal, H.-L. Chan, and K. Pruhs, “Speed scaling with an arbitrary power
function,” in Proc. ACM-SIAM SODA, New York, NY, USA, Jan. 2009, pp.
693–701.
[74] A. Wierman, L. L. H. Andrew, and A. Tang, “Power-aware speed scaling in
processor sharing systems,” in Proc. IEEE INFOCOM, Rio de Janeiro, Brazil,
Apr. 2009, pp. 2007–2015.
[75] F. Kelly, “Networks of queues,” Adv. Appl. Probab., vol. 8, no. 2, pp. 416–432,
Jun. 1976.
[76] P. G. Taylor, “Insensitivity in stochastic models,” in Queueing Networks, R. J.
Boucherie and N. M. van Dijk, Eds. New York, NY: Springer, 2011, ch. 3,
pp. 121–140.
[77] A. D. Barbour, “Networks of queues and the method of stages,” Adv. Appl.
Probab., vol. 8, no. 3, pp. 584–591, Sep. 1976.
[78] J. Fu, “Implementation for an optimal insensitive jockeying policy,” 2016.
[Online]. Available: http://www.ee.cityu.edu.hk/~zukerman/Codes/JingFu/
JSAC-2015-code.rar
[79] L. Wang, F. Zhang, J. A. Aroca, A. V. Vasilakos, K. Zheng, C. Hou, D. Li, and
Z. Liu, “GreenDCN: A general framework for achieving energy efficiency in
data center networks,” IEEE J. Sel. Areas Commun., vol. 32, no. 1, pp. 4–15,
Jan. 2014.
[80] Y. Tian, C. Lin, and M. Yao, “Modeling and analyzing power management
policies in server farms using stochastic Petri nets,” in Proc. e-Energy 2012.
Madrid, Spain: IEEE, May 2012, pp. 1–9.
[81] Y. Yao, L. Huang, A. B. Sharma, L. Golubchik, and M. J. Neely, “Power cost
reduction in distributed data centers: A two-time-scale approach for delay
tolerant workloads,” IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 1, pp.
200–211, Jan. 2014.
[82] D. Niyato, S. Chaisiri, and L. B. Sung, “Optimal power management for
server farm to support green computing,” in Proc. IEEE/ACM CCGRID 2009.
Shanghai: IEEE Computer Society, May 2009, pp. 84–91.
[83] S. Li, S. Wang, T. Abdelzaher, M. Kihl, and A. Robertsson, “Temperature
aware power allocation: An optimization framework and case studies,” Sus-
tainable Computing: Informatics and Systems, vol. 2, no. 3, pp. 117–127, Sep.
2012.
[84] M. Pore, Z. Abbasi, S. K. S. Gupta, and G. Varsamopoulos, “Techniques to
achieve energy proportionality in data centers: A survey,” in Handbook on
Data Centers. Springer, Mar. 2015, pp. 109–162.
[85] E. Gelenbe and R. Lent, “Energy-qos trade-offs in mobile service selection,”
Future Internet, vol. 5, no. 2, pp. 128–139, Apr. 2013.
[86] M. Harchol-Balter, Performance Modeling and Design of Computer Systems:
Queueing Theory in Action. Cambridge University Press, 2013.
[87] T. Bonald and A. Proutiere, “Insensitivity in processor-sharing networks,”
Performance Evaluation, vol. 49, no. 1, pp. 193–209, Sep. 2002.
[88] A. Singh, J. Ong, A. Agarwal, G. Anderson, A. Armistead, R. Bannon,
S. Boving, G. Desai, B. Felderman, P. Germano, A. Kanagala, J. Provost,
J. Simmons, E. Tanda, J. Wanderer, U. Hölzle, S. Stuart, and A. Vahdat,
“Jupiter rising: A decade of clos topologies and centralized control in Google’s
datacenter network,” in Proc. ACM SIGCOMM, London, UK, Aug. 2015, pp.
183–197.
[89] L. A. Barroso, J. Dean, and U. Hölzle, “Web search for a planet: The Google
cluster architecture,” IEEE Micro, vol. 23, no. 2, pp. 22–28, Apr. 2003.
[90] C. Calero, Handbook of research on web information systems quality. IGI
Global, 2008.
[91] O. T. Akgun, D. G. Down, and R. Righter, “Energy-aware scheduling on
heterogeneous processors,” IEEE Trans. Autom. Control, vol. 59, no. 3, pp.
599–613, Feb. 2014.
[92] J. Gittins, K. Glazebrook, and R. R. Weber, Multi-armed bandit allocation
indices: 2nd edition. Wiley, Mar. 2011.
[93] J. C. Gittins and D. M. Jones, “A dynamic allocation index for the sequential
design of experiments,” in Progress in Statistics, J. Gani, Ed. Amsterdam,
NL: North-Holland, 1974, pp. 241–266.
[94] J. C. Gittins, “Bandit processes and dynamic allocation indices,” Journal of
the Royal Statistical Society. Series B (Methodological), pp. 148–177, 1979.
[95] R. R. Weber and G. Weiss, “On an index policy for restless bandits,” J. Appl.
Probab., no. 3, pp. 637–648, Sep. 1990.
[96] D. Bertsimas and J. Niño-Mora, “Restless bandits, linear programming re-
laxations, and a primal-dual index heuristic,” Oper. Res., vol. 48, no. 1, pp.
80–90, Feb. 2000.
[97] J. Niño-Mora, “Restless bandits, partial conservation laws and indexability,”
Advances in Applied Probability, vol. 33, no. 1, pp. 76–98, 2001.
[98] D. Bertsimas and J. Niño-Mora, “Conservation laws, extended polymatroids
and multiarmed bandit problems; a polyhedral approach to indexable systems,”
Mathematics of Operations Research, vol. 21, no. 2, pp. 257–306, May 1996.
[99] J. Niño-Mora, “Dynamic priority allocation via restless bandit marginal pro-
ductivity indices,” TOP, vol. 15, no. 2, pp. 161–198, Sep. 2007.
[100] I. M. Verloop, “Asymptotically optimal priority policies for indexable and
non-indexable restless bandits,” to appear in Ann. Appl. Probab., 2015.
[101] M. Larranaga, U. Ayesta, and I. M. Verloop, “Asymptotically optimal index
policies for an abandonment queue with convex holding cost,” Queueing
Systems, pp. 1–71, May 2015.
[102] A. L. Stolyar, “Maxweight scheduling in a generalized switch: State space
collapse and workload minimization in heavy traffic,” Ann. Appl. Probab.,
vol. 14, no. 1, pp. 1–53, Feb. 2004.
[103] A. L. Stolyar, “Maximizing queueing network utility subject to stability:
Greedy primal-dual algorithm,” Queueing Systems, vol. 50, no. 4, pp. 401–
457, Aug. 2005.
[104] A. Mandelbaum and A. L. Stolyar, “Scheduling flexible servers with convex
delay costs: Heavy-traffic optimality of the generalized cµ-rule,” Oper. Res.,
vol. 52, no. 6, pp. 836–855, Dec. 2004.
[105] A. L. Stolyar, “Optimal routing in output-queued flexible server systems,”
Probability in the Engineering and Informational Sciences, vol. 19, no. 02,
pp. 141–189, Apr. 2005.
[106] Y. Nazarathy and G. Weiss, “Near optimal control of queueing networks over
a finite time horizon,” Ann. Oper. Res., vol. 170, no. 1, pp. 233–249, Sep.
2009.
[107] U. Ayesta, M. Erausquin, M. Jonckheere, and I. M. Verloop, “Scheduling in a
random environment: stability and asymptotic optimality,” IEEE/ACM Trans.
Netw., vol. 21, no. 1, pp. 258–271, Feb. 2013.
[108] S. Borst, “User-level performance of channel-aware scheduling algorithms in
wireless data networks,” IEEE/ACM Trans. Netw., vol. 13, no. 3, pp. 636–647,
Jun. 2005.
[109] S. Aalto and P. Lassila, “Flow-level stability and performance of channel-
aware priority-based schedulers,” in Proc. EURO-NF Conference on Next
Generation Internet (NGI). Paris, France: IEEE, Jun. 2010, pp. 1–8.
[110] I. Taboada, F. Liberal, and P. Jacko, “An opportunistic and non-anticipating
size-aware scheduling proposal for mean holding cost minimization in time-
varying channels,” Performance Evaluation, vol. 79, pp. 90–103, Sep. 2014.
[111] R. Atar and M. Shifrin, “An asymptotic optimality result for the multiclass
queue with finite buffers in heavy traffic,” Stoch. Syst., vol. 4, no. 2, pp.
556–603, 2014.
[112] K. D. Glazebrook, D. J. Hodge, and C. Kirkbride, “General notions of index-
ability for queueing control and asset management,” The Annals of Applied
Probability, vol. 21, no. 3, pp. 876–907, Jun. 2011.
[113] K. D. Glazebrook, D. J. Hodge, C. Kirkbride, and R. J. Minty, “Stochastic
scheduling: A short history of index policies and new approaches to index
generation for dynamic resource allocation,” Journal of Scheduling, vol. 17,
no. 5, pp. 407–425, Oct. 2014.
[114] M. I. Freidlin and A. D. Wentzell, Random perturbations of dynamical
systems. Springer Science & Business Media, 2012, vol. 260.
[115] Standard Performance Evaluation Corporation, tested by Dell Inc. [Online].
Available: https://www.spec.org/power_ssj2008/results/
[116] F. Kelly, Reversibility and stochastic networks. Cambridge University Press,
1979.
Appendix A
Theoretical results for Jockeying
Job-Assignments
The system model and notation used in this appendix have been defined in Chapter 3.
A.1 An Example of the Mapping $Q_{\phi}$
In this example, we consider a server farm with $K = 3$ and $B_j = 4$, $j = 1, 2, 3$. The
multi-queue system is shown in Fig. A.1. We define $Q_{E^*}$, $\hat{K} = 2$, as the mapping that
yields the logically combined queue shown in Fig. A.2, where the slots in the buffers
of the multi-queue system have been mapped one-to-one to the slots in the buffer of
the single queue.
We then demonstrate how an arrival or departure event is handled in the logically
combined queue when there are 9 jobs (being served) in the system, i.e., state
$n = 9$. If a new job arrives, then, as described in Section 3.3.1, a position (slot) is
selected for the new job with probability $\gamma(l, 9)$, $l = 1, 2, \ldots, B_K$, as demonstrated
in Fig. A.3; in the view of the logically combined queue, all the jobs in and after this
position must then be moved backward by one slot. Jockeying
125
Theoretical results for Jockeying Job-Assignments
Fig. A.1 An example of a multi-queue system.
Fig. A.2 An example of the corresponding logically combined queue.
is required when these jobs are moved backward. In Fig. A.4, we demonstrate the
correspondence between jockeying in the logically combined queue and jockeying
in the multi-queue system.
In a similar way, if the job in a given slot is completed, then, in the view of the
logically combined queue, all the jobs after this slot are moved forward by one slot.
In Figs. A.5 and A.6, we present an example of a departure event and the correspon-
dence between the two systems with respect to jockeying upon this departure event,
respectively.
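The mapping and shift operations described above can be sketched in code. The following Python toy (an illustration only; the round-robin slot ordering chosen here is a hypothetical choice, one of the many feasible mappings, and not necessarily $Q_{E^*}$) represents the combined queue as a list of (server, slot) pairs and reports which shifts correspond to jockeying between servers:

```python
# Toy model of the logically combined queue of Appendix A.1.
# K = 3 servers, each with buffer size 4.  The round-robin slot
# ordering below is a hypothetical choice: the available mappings
# are not unique.
K, B = 3, [4, 4, 4]

# One-to-one map: combined position p = 0..11  ->  (server j, slot s).
slots = [(j, s) for s in range(max(B)) for j in range(K) if s < B[j]]

def arrive(n, position):
    """A job arrives at `position` (0-based) while n jobs are present:
    jobs in and after this position move backward by one slot."""
    assert 0 <= position <= n < len(slots)
    moved = [(p, p + 1) for p in range(position, n)]  # old -> new position
    return n + 1, moved

def depart(n, position):
    """The job at `position` completes: all jobs after it move forward."""
    assert 0 <= position < n <= len(slots)
    moved = [(p, p - 1) for p in range(position + 1, n)]
    return n - 1, moved

def jockeying_moves(moved):
    """Shifts whose source and target slots lie on different servers;
    only these correspond to jockeying in the multi-queue system."""
    return [(slots[a], slots[b]) for a, b in moved if slots[a][0] != slots[b][0]]
```

For instance, with $n = 9$ jobs, an arrival in position 4 shifts the jobs in positions 4 to 8 backward by one slot, and `jockeying_moves` lists the induced server-to-server reassignments, mirroring Figs. A.3 and A.4.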
A.2 Pseudo-Code of the Algorithm for an Insensitive
Optimal Policy
Here, we present the pseudo-code of the algorithm for an optimal policy, called
Algorithm 1. The theoretical analysis is given in Sections 3.4.1 and 3.4.2.
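The outer structure of Algorithm 1 is a bisection search on the efficiency ratio $\xi$ that stops once the optimal average cost $W_\xi$ is within a tolerance $\sigma$ of zero. The Python sketch below illustrates only this outer loop; `cost` is a placeholder for the inner per-state computation of Algorithm 1 (the quantities $\Psi$, $\psi$ and (3.13) are not reproduced here):

```python
def bisect_efficiency(cost, xi_hi, sigma=1e-9, max_iter=200):
    """Outer loop of Algorithm 1 (sketch): bisection on the ratio xi.
    `cost(xi)` plays the role of W_xi = g(mu) - xi * g(e): it is
    decreasing in xi and crosses zero at the optimal ratio, which
    lies in [0, xi_hi]."""
    xi_lo, xi = 0.0, xi_hi / 2.0
    for _ in range(max_iter):
        w = cost(xi)
        if abs(w) <= sigma / 2.0:      # |W_xi| <= sigma/2: stop
            break
        if w > sigma / 2.0:
            xi_lo = xi                 # optimal ratio lies above xi
        else:
            xi_hi = xi                 # optimal ratio lies below xi
        xi = (xi_lo + xi_hi) / 2.0
    return xi
```

Here `cost` must be decreasing in $\xi$; the returned $\xi$ approximates the root of $W_\xi = 0$, i.e., the optimal ratio.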
Fig. A.3 An example of an arrival event. (a) Multi-queue system. (b) Logically combined queue.
Fig. A.4 An example of jockeying upon an arrival event. (a) Multi-queue system. (b) Logically combined queue.
A.3 Proof of Proposition 3.1
Here, we provide a proof that I-J-OPT is optimal among all jockeying policies in the
case where the job sizes are exponentially distributed. We begin with background
information on the symmetric queue model, which has also been given in Chapter 3.
When the service requirements of jobs are independently and identically distributed,
the symmetric queue [116, Chapter 3.3], which is insensitive to the shape of the job-size
distribution, is briefly constructed as follows. If there are $n$ jobs being processed in the
queue, they are located at the first $n$ positions of the queue. When a departure
occurs at position $l$, the jobs in positions $l+1, l+2, \ldots, n$ move to positions $l, l+1, \ldots, n-1$,
respectively. A proportion $\gamma(l, n)$ of the total service effort is directed
to the job in position $l$ when there are $n$ jobs being processed in the queue. When
a new job arrives, it is assigned to position $l$ with probability $\gamma(l, n+1)$. Then,
jobs previously in positions $l, l+1, \ldots, n$ move to positions $l+1, l+2, \ldots, n+1$,
Fig. A.5 An example of a departure event. (a) Multi-queue system. (b) Logically combined queue.
Fig. A.6 An example of jockeying upon a departure event. (a) Multi-queue system. (b) Logically combined queue.
respectively. For our multi-queue server farm model, we can achieve insensitivity
by mapping the multi-queue system into a symmetric queue, as described in [21];
the available mappings are not unique, because many configurations of the multi-queue
server farm model can be mapped to the same symmetric queue configuration.
Let $\mathcal{F}_J$ be the set of all stationary jockeying policies. Let $\mathcal{F}_s$ be the subset of $\mathcal{F}_J$
consisting of policies that are insensitive to the shape of the job-size distribution, and let
$\mathcal{F}_{sd}$ be the subset of $\mathcal{F}_s$ consisting of deterministic policies. We define $|\mathbf{n}|$, $\mathbf{n} \in \mathcal{N}$
($\mathcal{N}$ is the set of all states defined in Chapter 3), as the total number of jobs in
the multi-queue system in state $\mathbf{n}$, i.e., $|\mathbf{n}| = \sum_{j=1}^{K} n_j$. Let $\pi^{\phi}_{m,\mathcal{L}}$ be the steady-state
probability that, under policy $\phi$, there are $m$ jobs in the multi-queue system and the
set of active servers is $\mathcal{L}$. The probability of state $\mathbf{n}$ under policy $\phi$ is $\pi^{\phi}_{\mathbf{n}}$, as defined
Algorithm 1 Algorithm for an Insensitive Optimal Policy.
Require: service rates $\mu = (\mu_1, \mu_2, \ldots, \mu_K)$, power consumptions $e = (e_1, e_2, \ldots, e_K)$, buffer sizes $B = (B_1, B_2, \ldots, B_K)$, and arrival rate $\lambda$.
Ensure: An optimal policy that maximizes the energy efficiency among all insensitive jockeying policies, namely, the sets of busy servers for each state, $\mathcal{T}^{\mathrm{OPT}}(n)$, $n = 1, 2, \ldots, B_K$.
1: $\xi_1 \leftarrow 0$
2: $\xi_2 \leftarrow$ the value of $E/T$ under any feasible policy
3: $\xi \leftarrow (\xi_1 + \xi_2)/2$
4: $\phi_0 \leftarrow$ any feasible policy for initialization
5: $W_\xi \leftarrow \mathrm{OptimalCostPerCycle}(\xi, \phi_0, \phi^*)$
6: while $|W_\xi| > \sigma/2$ do
7: if $W_\xi > \sigma/2$ then
8: $\xi_1 \leftarrow \xi$
9: else
10: $\xi_2 \leftarrow \xi$
11: end if
12: $\xi \leftarrow (\xi_1 + \xi_2)/2$
13: $\Psi_\xi(0) \leftarrow 0$
14: for $n = 1$ to $B_K$, increment by 1, do
15: $j_1, j_2, \ldots, j_K$ represent the servers ordered descendingly in terms of the value of the right-hand side of (3.13).
16: $l^* \leftarrow \arg\max_{l \in \mathcal{L}_\xi(n)} \psi_{\{j_1, j_2, \ldots, j_l\},\, \xi}(n)$.
17: $\mathcal{T}^{\mathrm{OPT}}(n) \leftarrow \{j_1, j_2, \ldots, j_{l^*}\}$.
18: $\Psi_\xi(n) \leftarrow \psi_{\mathcal{T}^{\mathrm{OPT}}(n),\, \xi}(n)$.
19: end for
20: $\phi^* \leftarrow$ the policy comprising $\mathcal{T}^{\mathrm{OPT}}(n)$, $n = 1, 2, \ldots, B_K$.
21: $W_\xi \leftarrow g_{\phi^*}(\mu) - \xi\, g_{\phi^*}(e)$.
22: end while
in Section 2.1. Accordingly, we can write
\[ \pi^{\phi}_{m,\mathcal{L}} = \sum_{\substack{|\mathbf{n}| = m,\ \mathcal{L}(\mathbf{n}) = \mathcal{L},\ \mathbf{n} \in \mathcal{N}}} \pi^{\phi}_{\mathbf{n}}. \tag{A.1} \]
Lemma A.1. Assume that job sizes are exponentially distributed. For any policy
$\phi \in \mathcal{F}_J$, we define a policy $\phi'$ derived from $\phi$ as follows. For a given state $\mathbf{n} \in \mathcal{N}$
under policy $\phi$, we reassign a job from server $j_1$ to server $j_2$, where $n_{j_1} \geqslant 2$ and
$1 \leqslant n_{j_2} \leqslant b-1$. We perform this reassignment once and only once whenever the
process reaches $\mathbf{n}$ and the conditions on $n_{j_1}$ and $n_{j_2}$ hold. For other states $\mathbf{n}' \neq \mathbf{n}$,
$\mathbf{n}' \in \mathcal{N}$, the decisions made in $\mathbf{n}'$ are the same as those under policy $\phi$. In this way,
policy $\phi'$ is derived from $\phi$ only by the reassignment at state $\mathbf{n}$. Then, $\pi^{\phi'}_{m,\mathcal{L}} = \pi^{\phi}_{m,\mathcal{L}}$.
Proof. Because of the memoryless property, the remaining work of each job follows the
same exponential distribution and does not change after the reassignment. The total
departure rate and the power consumption are also maintained after the reassignment,
since the set of active servers $\mathcal{L}$ does not change. After the reassignment in state
$\mathbf{n}$, the state becomes state $\mathbf{n}'$. We refer to the new sojourn time in state $\mathbf{n}'$ as the
period which starts from the moment $t_1$ at which the reassignment occurs to the moment $t_2$
at which the first arrival or departure occurs after $t_1$. Because the arrival and departure
rates remain the same, the length of the new sojourn time in state $\mathbf{n}'$ is statistically
equal to the length of the original period from $t_1$ to the end of the sojourn in state $\mathbf{n}$
without reassignment. When this sojourn ends at $t_2$, the policy is again $\phi$, because
we consider here jockeying policies. The modification from $\mathbf{n}$ to $\mathbf{n}'$ does not change
the steady-state probability $\pi^{\phi}_{m,\mathcal{L}}$; therefore, $\pi^{\phi'}_{m,\mathcal{L}} = \pi^{\phi}_{m,\mathcal{L}}$.
Lemma A.2. There exists a deterministic policy $\phi \in \mathcal{F}_{sd}$ that is optimal in $\mathcal{F}_s$.
Proof. We consider the following formulation for state $\mathbf{n}$:
\[ V^{\xi}_{\phi^*}(\mathbf{n}) = \min_{p^+(\mathbf{n},\mathbf{n}'),\, p^-(\mathbf{n},\mathbf{n}')} \Biggl\{ \mathscr{C}(\mathbf{n}) + \frac{\lambda}{\lambda+\mu(\mathbf{n})} \sum_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|+1}} p^+(\mathbf{n},\mathbf{n}')\, V^{\xi}_{\phi^*}(\mathbf{n}') + \frac{\mu(\mathbf{n})}{\lambda+\mu(\mathbf{n})} \sum_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|-1}} p^-(\mathbf{n},\mathbf{n}')\, V^{\xi}_{\phi^*}(\mathbf{n}') \Biggr\}, \tag{A.2} \]
where $p^+(\mathbf{n},\mathbf{n}')$ and $p^-(\mathbf{n},\mathbf{n}')$ are the transition probabilities from state $\mathbf{n}$ to $\mathbf{n}'$. They satisfy
\[ \sum_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|+1}} p^+(\mathbf{n},\mathbf{n}') = 1, \quad \text{and} \quad \sum_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|-1}} p^-(\mathbf{n},\mathbf{n}') = 1. \]
For such a policy $\phi^*$, if (A.2) holds for all $\mathbf{n} \in \mathcal{N}$, then $\phi^*$ is optimal among the
policies in $\mathcal{F}_s$. In a similar way, we define the policy $\phi^d \in \mathcal{F}_{sd}$ by
\[ V^{\xi}_{\phi^d}(\mathbf{n}) = \mathscr{C}(\mathbf{n}) + \frac{\lambda}{\lambda+\mu(\mathbf{n})} \min_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|+1}} V^{\xi}_{\phi^*}(\mathbf{n}') + \frac{\mu(\mathbf{n})}{\lambda+\mu(\mathbf{n})} \min_{\substack{\mathbf{n}'\in\mathcal{N} \\ |\mathbf{n}'|=|\mathbf{n}|-1}} V^{\xi}_{\phi^*}(\mathbf{n}'). \tag{A.3} \]
According to (A.2) and (A.3), if (A.3) holds true for all $\mathbf{n} \in \mathcal{N}$ under policy $\phi^d$, then
(A.2) also holds true for all $\mathbf{n} \in \mathcal{N}$ under $\phi^d$. That is, $\phi^d$ is an optimal solution
among all policies in $\mathcal{F}_s$. This proves the lemma.
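The step from (A.2) to (A.3) rests on the fact that a linear objective over probability vectors $p^{\pm}(\mathbf{n},\cdot)$ attains its minimum at a vertex, i.e., by routing deterministically to a single successor state. A numeric illustration (the successor values below are arbitrary stand-ins for $V^{\xi}_{\phi^*}(\mathbf{n}')$):

```python
import random

def expected_value(values, p):
    """Linear objective sum_i p_i * V_i for a routing distribution p."""
    return sum(pi * v for pi, v in zip(p, values))

def best_randomized(values, trials=2000, seed=1):
    """Search over random probability vectors; the deterministic
    vertex (all mass on the argmin) is never beaten."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        w = [rng.random() for _ in values]
        s = sum(w)
        best = min(best, expected_value(values, [x / s for x in w]))
    return best
```

For successor values $V = (3.7, 1.2, 2.5)$, no randomized routing does better than $\min_i V_i = 1.2$, which is exactly the deterministic choice made in (A.3).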
Let $S(\mathcal{L}, m) = \{\mathbf{n}^0, \mathbf{n}^1, \ldots, \mathbf{n}^k\}$ be the set of all states in $\mathcal{N}$ for which $\mathcal{L}$ is the set
of active servers and $|\mathbf{n}^i| = m$, $i = 0, \ldots, k$. These states
are always reachable under non-jockeying policies, but may not be reachable under
jockeying policies. We refer to a set $S(\mathcal{L}, m)$ that contains states reachable under
policy $\phi \in \mathcal{F}_J$ as a reachable $S(\mathcal{L}, m)$ under policy $\phi$. Henceforth, we only consider
jockeying policies.
We define a state $\mathbf{n} \in \mathcal{N}$ that is reachable under some policy $\phi \in \mathcal{F}'$, $\mathcal{F}' \subset \mathcal{F}_J$, as
a possible state for $\mathcal{F}'$. Recall that our multi-queue server farm model is mapped
to a single symmetric queue in the same way for all policies in $\mathcal{F}_s$ (as discussed
above) and that no hole is allowed in the single symmetric queue. Thus, for a given
set of active servers $\mathcal{L}$ and a given number of jobs $m$ in the system, under any policy
$\phi^s \in \mathcal{F}_s$, the positions of the jobs in the multi-queue system are also decided. Thus,
only one possible state for $\mathcal{F}_s$ exists in $S(\mathcal{L}, m)$ for given $\mathcal{L}$ and $m$. For a given
$m \in [0, B_K]$, under any policy $\phi^s \in \mathcal{F}_s$, there always exists at least one reachable
$S(\mathcal{L}, m)$, since two jobs cannot arrive or be completed simultaneously. Under any
policy $\phi \in \mathcal{F}_J$, when the process steps into any state $\mathbf{n} \in S(\mathcal{L}, m)$, we reassign a
job from server $j_1$ to server $j_2$, where $n_{j_1} \geqslant 2$ and $1 \leqslant n_{j_2} \leqslant b-1$. We repeat this
reassignment until the positions of the jobs are the same as those in the possible state
for $\mathcal{F}_s$ in $S(\mathcal{L}, m)$. The indices $j_1$ and $j_2$ are variables which can differ at
each reassignment. We refer to $\phi' \in \mathcal{F}_J$ as the new policy after the reassignments.
Based on Lemma A.1, each of these reassignments does not change the steady-state
probability $\pi^{\phi}_{m,\mathcal{L}}$, and we obtain $\pi^{\phi}_{m,\mathcal{L}} = \pi^{\phi'}_{m,\mathcal{L}}$.
For a given set of active servers and number of jobs in the system, the positions
of the jobs in the multi-queue system in a reachable state under policy $\phi'$ are the same
as those in a possible state for $\mathcal{F}_s$. The reachable states under policy $\phi'$ are therefore
the possible states for $\mathcal{F}_s$.
With $S(\mathcal{L}', m+1) = \{\mathbf{n}'^0, \mathbf{n}'^1, \ldots, \mathbf{n}'^{k'}\}$ and $S(\mathcal{L}'', m-1) = \{\mathbf{n}''^0, \mathbf{n}''^1, \ldots, \mathbf{n}''^{k''}\}$, the
routing probabilities under $\phi'$ are obtained by
\[ q^+(\mathbf{n}, \mathbf{n}') = \sum_{j=0}^{k'} \sum_{i=0}^{k} \frac{\pi_{\mathbf{n}^i}}{\sum_{l=0}^{k} \pi_{\mathbf{n}^l}}\, p^+(\mathbf{n}^i, \mathbf{n}'^j), \quad \text{and} \quad q^-(\mathbf{n}, \mathbf{n}'') = \sum_{j=0}^{k''} \sum_{i=0}^{k} \frac{\pi_{\mathbf{n}^i}}{\sum_{l=0}^{k} \pi_{\mathbf{n}^l}}\, p^-(\mathbf{n}^i, \mathbf{n}''^j), \tag{A.4} \]
where $q^+(\mathbf{n}, \mathbf{n}')$ and $q^-(\mathbf{n}, \mathbf{n}'')$ are the routing probabilities of policy $\phi'$, and $\mathbf{n}$, $\mathbf{n}'$
and $\mathbf{n}''$ are the reachable states in $S(\mathcal{L}, m)$, $S(\mathcal{L}', m+1)$ and $S(\mathcal{L}'', m-1)$ under
policy $\phi'$, respectively. For any state $\mathbf{n}$ and any state $\bar{\mathbf{n}}$ unreachable under policy $\phi'$,
$q^+(\mathbf{n}, \bar{\mathbf{n}}) = 0$ for $|\bar{\mathbf{n}}| = |\mathbf{n}|+1$, and $q^-(\mathbf{n}, \bar{\mathbf{n}}) = 0$ for $|\bar{\mathbf{n}}| = |\mathbf{n}|-1$.
Since the reachable states under $\phi'$ are the possible states for $\mathcal{F}_s$, and for any given
number of jobs $m$ in the system there exists at least one reachable $S(\mathcal{L}, m)$ under
policy $\phi'$, there must be a policy $\phi^s \in \mathcal{F}_s$ that has the same Markov chain as $\phi'$.
Then, $\pi^{\phi^s}_{\mathbf{n}} = \pi^{\phi'}_{\mathbf{n}}$ and $\pi^{\phi^s}_{m,\mathcal{L}} = \pi^{\phi'}_{m,\mathcal{L}}$. That is, $\pi^{\phi}_{m,\mathcal{L}} = \pi^{\phi^s}_{m,\mathcal{L}}$. Since $\mathcal{L}$ decides the current
service rate and power consumption of the system, $\phi$ and $\phi^s$ also have the same
expected throughput and power consumption.
The proof of Proposition 3.1 is given next.
Proof. Use $\phi^* \in \mathcal{F}_{sd}$ to denote I-J-OPT. Assume that there exists a policy $\phi^0 \in \mathcal{F}_J$, $\phi^0 \neq \phi^*$, satisfying
\[ \frac{E_{\phi^0}}{T_{\phi^0}} < \frac{E_{\phi^*}}{T_{\phi^*}}. \tag{A.5} \]
Let $\phi^1$ be the insensitive jockeying policy associated with $\phi^0$, where $\phi^1$ and $\phi^0$ have the
same expected power consumption and throughput. Thus, the policy $\phi^1 \in \mathcal{F}_s$ satisfies
\[ \frac{E_{\phi^0}}{T_{\phi^0}} = \frac{E_{\phi^1}}{T_{\phi^1}} < \frac{E_{\phi^*}}{T_{\phi^*}}. \tag{A.6} \]
Since $\phi^*$ is optimal in $\mathcal{F}_{sd}$, it is also optimal in $\mathcal{F}_s$ (Lemma A.2), and this contradicts
(A.6). The proposition is now proved.
A.4 Proof of Lemma 3.2
Some of the notation used here has been defined in Chapter 3.
Proof. We can rewrite $\Phi^{\mathrm{SSF}}_A$ as
\[ \Phi^{\mathrm{SSF}}_A = \frac{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{\mathrm{SSF}}_A} \Phi^{\mathrm{SSF}}(\mathbf{n})\, \pi^{\mathrm{SSF}}_{\mathbf{n}}}{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{\mathrm{SSF}}_A} \pi^{\mathrm{SSF}}_{\mathbf{n}}} = \frac{\displaystyle\sum_{i=1}^{\hat{K}-1} \sum_{k=1}^{B_i} \Phi_i \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}}{\displaystyle\sum_{i=1}^{\hat{K}-1} \sum_{k=1}^{B_i} \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}}, \tag{A.7} \]
where $\Phi_i = \sum_{j=1}^{i} e_j / r_i$. Since $B_{j+1} \geqslant B_j \geqslant 1$ for $j = 1, 2, \ldots, \hat{K}-2$, (A.7) is equivalent to
\[ \Phi^{\mathrm{SSF}}_A = \frac{\displaystyle\sum_{m=0}^{\hat{K}-2} \sum_{k=B_m+1}^{B_{m+1}} \sum_{i=m+1}^{\hat{K}-1} \Phi_i \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}}{\displaystyle\sum_{m=0}^{\hat{K}-2} \sum_{k=B_m+1}^{B_{m+1}} \sum_{i=m+1}^{\hat{K}-1} \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}}, \tag{A.8} \]
where we have set $B_0 = 0$.
Let
\[ a_i = \begin{cases} \displaystyle\prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j - 1} \left(\frac{r_i}{\lambda}\right)^{k}, & i = m+1, m+2, \ldots, \hat{K}-1, \\ 0, & \text{otherwise}. \end{cases} \]
Since $r_j / \lambda \leqslant 1$ and $B_j \geqslant 1$ for $j = 1, 2, \ldots, \hat{K}$, with $k = B_m+1, B_m+2, \ldots, B_{m+1}$ and
$m = 0, 1, \ldots, \hat{K}-2$, we obtain $a_{i+1} \geqslant a_i$ for $i = 1, 2, \ldots, \hat{K}-1$.
Since $\Phi_{i+1} \geqslant \Phi_i$ and $a_{i+1} \geqslant a_i$, $i = 1, 2, \ldots, \hat{K}-1$, we obtain
\[ \frac{\displaystyle\sum_{i=m+1}^{\hat{K}-1} \Phi_i \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}}{\displaystyle\sum_{i=m+1}^{\hat{K}-1} \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)^{B_j}} = \frac{\displaystyle\sum_{i=1}^{\hat{K}-1} \Phi_i \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right) a_i}{\displaystyle\sum_{i=1}^{\hat{K}-1} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right) a_i} \geqslant \frac{\displaystyle\sum_{i=1}^{\hat{K}-1} \Phi_i \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)}{\displaystyle\sum_{i=1}^{\hat{K}-1} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)}. \tag{A.9} \]
Or, equivalently,
\[ \Phi^{\mathrm{SSF}}_A \geqslant \frac{\displaystyle\sum_{m=0}^{\hat{K}-2} \sum_{k=B_m+1}^{B_{m+1}} \sum_{i=1}^{\hat{K}-1} \Phi_i \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)}{\displaystyle\sum_{m=0}^{\hat{K}-2} \sum_{k=B_m+1}^{B_{m+1}} \sum_{i=1}^{\hat{K}-1} \left(\frac{r_i}{\lambda}\right)^{k} \prod_{j=i+1}^{\hat{K}} \left(\frac{r_j}{\lambda}\right)} = \Phi^{E^*(\hat{K})}_A. \]
This proves the lemma.
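The inequality in (A.9) is a weighted-mean comparison: reweighting a nondecreasing sequence $\Phi_i$ by nondecreasing extra weights $a_i$ can only increase its weighted average. A numeric sanity check of this step (with arbitrary nondecreasing sequences standing in for $\Phi_i$ and $a_i$):

```python
def weighted_avg(values, weights):
    """Weighted average sum(v * w) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def reweighting_raises_mean(phi, w, a):
    """Checks the step behind (A.9): for nondecreasing phi and a, and
    positive base weights w, averaging phi with weights w*a gives at
    least the average of phi under w alone."""
    assert all(x <= y for x, y in zip(phi, phi[1:]))  # phi nondecreasing
    assert all(x <= y for x, y in zip(a, a[1:]))      # a nondecreasing
    return weighted_avg(phi, [wi * ai for wi, ai in zip(w, a)]) \
        >= weighted_avg(phi, w) - 1e-12
```

This is a Chebyshev-type correlation inequality: comonotone $\Phi_i$ and $a_i$ have nonnegative covariance under the base weights, which is exactly what the inequality in (A.9) uses.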
A.5 Proof of Lemma 3.3
Proof. Since $\Phi^{\phi}_B = \sum_{j=1}^{\hat{K}} e_j / r_{\hat{K}}$ for $\phi = \mathrm{SSF}, E^*(\hat{K})$, (3.71) is proved. Equation
(3.72) can be written as
\[ \frac{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{\mathrm{SSF}}_C} q^{\mathrm{SSF}}_{\mathbf{n}}\, \Phi^{\mathrm{SSF}}(\mathbf{n})}{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{\mathrm{SSF}}_C} q^{\mathrm{SSF}}_{\mathbf{n}}} = \frac{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{E^*(\hat{K})}_C} q^{E^*(\hat{K})}_{\mathbf{n}}\, \Phi^{E^*(\hat{K})}(\mathbf{n})}{\displaystyle\sum_{\mathbf{n} \in \mathcal{N}^{E^*(\hat{K})}_C} q^{E^*(\hat{K})}_{\mathbf{n}}}, \tag{A.10} \]
where $q^{\phi}_{\mathbf{n}}$ is defined in (3.65). Since $\mathcal{N}^{\mathrm{SSF}}_C = \mathcal{N}^{E^*(\hat{K})}_C$ with $q^{\mathrm{SSF}}_{\mathbf{n}} = q^{E^*(\hat{K})}_{\mathbf{n}}$ and
$\Phi^{\mathrm{SSF}}(\mathbf{n}) = \Phi^{E^*(\hat{K})}(\mathbf{n})$ for all $\mathbf{n} \in \mathcal{N}^{\mathrm{SSF}}_C$ ($\mathbf{n} \in \mathcal{N}^{E^*(\hat{K})}_C$), (A.10) holds true. This
proves the lemma.
A.6 Proof of Lemma 3.4
Proof. According to the definition of $q^{\phi}_n$ in (3.65), we obtain
\[ \pi^{\phi}_{B_{\hat{K}}} = \frac{1}{\sum_{n=0}^{B} q^{\phi}_n}, \tag{A.11} \]
since $\sum_{n=0}^{B} \pi^{\phi}_n = 1$. Based on the definition (3.65), we also obtain
\[ q^{\mathrm{SSF}}_n \begin{cases} = q^{E^*(\hat{K})}_n, & B_{\hat{K}-1}+1 \leqslant n \leqslant B, \\ \leqslant q^{E^*(\hat{K})}_n, & 0 \leqslant n \leqslant B_{\hat{K}-1}, \end{cases} \tag{A.12} \]
where $q^{\mathrm{SSF}}_{B_{\hat{K}}} = q^{E^*(\hat{K})}_{B_{\hat{K}}} = 1$. Thus, $\sum_{n=0}^{B} q^{\mathrm{SSF}}_n \leqslant \sum_{n=0}^{B} q^{E^*(\hat{K})}_n$. Together with (A.11), this
implies $\pi^{\mathrm{SSF}}_{B_{\hat{K}}} \geqslant \pi^{E^*(\hat{K})}_{B_{\hat{K}}}$. Then, by (A.12) and (3.65), we get $\pi^{\mathrm{SSF}}_{B_{\hat{K}}} q^{\mathrm{SSF}}_n \geqslant \pi^{E^*(\hat{K})}_{B_{\hat{K}}} q^{E^*(\hat{K})}_n$
for $B_{\hat{K}}+1 \leqslant n \leqslant B$. This also reads as
\[ p^{\mathrm{SSF}}(r_j) = \sum_{n=B_{j-1}+1}^{B_j} \pi^{\mathrm{SSF}}_n \geqslant \sum_{n=B_{j-1}+1}^{B_j} \pi^{E^*(\hat{K})}_n = p^{E^*(\hat{K})}(r_j), \quad r_j \in \mathcal{R}_C,\ j = 1, 2, \ldots, K, \]
and this proves the lemma.
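(A.11) is simply the normalization of a distribution specified, via (3.65), up to the constant $q_{B_{\hat{K}}} = 1$, and (A.12) then transfers the pointwise comparison of the $q$-sequences to the $\pi$-sequences. A small numeric sketch (the $q$-sequences below are arbitrary stand-ins, not computed from (3.65)):

```python
def normalize(q, ref):
    """Given weights q with q[ref] == 1, the stationary probabilities
    are pi_n = q_n / sum(q), so pi[ref] = 1 / sum(q) -- this is (A.11)."""
    assert q[ref] == 1.0
    total = sum(q)
    return [x / total for x in q], 1.0 / total
```

If $q^{\mathrm{SSF}}_n \leqslant q^{E^*(\hat{K})}_n$ pointwise, with equality on the tail, then the normalized $\pi^{\mathrm{SSF}}$ dominates $\pi^{E^*(\hat{K})}$ at the reference state and on the tail, which is the comparison used in the proof.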
A.7 Proof of Proposition 3.7
For clarity of presentation, we define
\[ P^{\phi}_{\theta} = \sum_{\mathbf{n} \in \mathcal{N}^{\phi}_{\theta}} \pi^{\phi}_{\mathbf{n}}. \tag{A.13} \]
Lemma A.3. Consider the system defined in Section 3.2. If $B_j \geqslant 2$ for $j = 1, 2, \ldots, \hat{K}-1$,
and $\lambda \geqslant \frac{\sqrt{5}+1}{2}\, r_{\hat{K}-1}$, then
\[ \frac{\lambda}{r_j} \leqslant \frac{\lambda}{r_j} \cdot \frac{\displaystyle\sum_{i=0}^{B_{j+1}-1} \left(\frac{\lambda}{r_{j+1}}\right)^{i}}{\displaystyle\sum_{i=0}^{B_j-1} \left(\frac{r_j}{\lambda}\right)^{i}}, \quad j = 1, 2, \ldots, \hat{K}-1. \tag{A.14} \]
Proof. We rewrite (A.14) as
\[ \sum_{i=0}^{B_j-1} \left(\frac{r_j}{\lambda}\right)^{i} \leqslant \sum_{i=0}^{B_{j+1}-1} \left(\frac{\lambda}{r_{j+1}}\right)^{i}. \tag{A.15} \]
For the case where $B_{j+1} = 2$, $B_j$ is sufficiently large, and $r_j = r_{j+1} = \frac{\sqrt{5}-1}{2}\,\lambda$
(i.e., $\lambda = \frac{\sqrt{5}+1}{2}\, r_{\hat{K}-1}$), (A.15) specializes to
\[ \frac{1}{1 - \frac{\sqrt{5}-1}{2}} \leqslant 1 + \frac{\sqrt{5}+1}{2}, \tag{A.16} \]
which clearly holds true (with equality). Moreover, (A.16) is a sufficient condition for (A.15). This
proves the lemma.
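The inequality (A.15) can also be spot-checked numerically. In the sketch below, both $r_j$ and $r_{j+1}$ are restricted to $(0, \frac{\sqrt{5}-1}{2}\lambda]$ and both buffer sizes to values at least 2 (assumed parameter ranges for the illustration; a grid scan is of course not a proof):

```python
PHI = (5 ** 0.5 - 1) / 2  # (sqrt(5) - 1) / 2, about 0.618

def lhs(r_j, lam, b_j):
    """Left side of (A.15): sum_{i=0}^{B_j - 1} (r_j / lam)^i."""
    return sum((r_j / lam) ** i for i in range(b_j))

def rhs(r_next, lam, b_next):
    """Right side of (A.15): sum_{i=0}^{B_{j+1} - 1} (lam / r_{j+1})^i."""
    return sum((lam / r_next) ** i for i in range(b_next))

def scan_A15(lam=1.0, steps=25):
    """Brute-force scan: lhs <= rhs for all r_j, r_{j+1} in (0, PHI*lam]
    and buffer sizes >= 2 on the grid below."""
    for a in range(1, steps + 1):
        for b in range(1, steps + 1):
            r_j, r_next = PHI * lam * a / steps, PHI * lam * b / steps
            for b_j in (2, 5, 50):
                for b_next in (2, 5, 50):
                    if lhs(r_j, lam, b_j) > rhs(r_next, lam, b_next) + 1e-9:
                        return False
    return True
```

The extreme case of the proof, $r_j = r_{j+1} = \mathrm{PHI}\cdot\lambda$ with a large $B_j$ and $B_{j+1} = 2$, sits on the boundary of the scan, where both sides approach $\frac{3+\sqrt{5}}{2}$.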
Lemma A.4. Consider the system defined in Section 3.2. If $B_j \geqslant 2$, $j = 1, 2, \ldots, K$,
$\lambda \geqslant \frac{\sqrt{5}+1}{2}\, r_{\hat{K}-1}$, and $P^{\mathrm{SSF}}_A \leqslant P^{E^*(\hat{K})}_A$, then
\[ \Phi^{\mathrm{SSF}}_A P^{\mathrm{SSF}}_A - \Phi^{E^*(\hat{K})}_A P^{E^*(\hat{K})}_A \geqslant \frac{\sum_{j=1}^{\hat{K}-1} e_j}{r_{\hat{K}-1}} \left( P^{\mathrm{SSF}}_A - P^{E^*(\hat{K})}_A \right). \tag{A.17} \]
Proof. Based on definitions (3.64) and (3.67), for $j = 1, 2, \ldots, \hat{K}-2$, we have
\[ p^{E^*(\hat{K})}(r_{j+1}) = \pi^{E^*(\hat{K})}_{j+1} = \pi^{E^*(\hat{K})}_{j}\, \frac{\lambda}{r_j} = p^{E^*(\hat{K})}(r_j)\, \frac{\lambda}{r_j}, \tag{A.18} \]
and
\[ p^{\mathrm{SSF}}(r_{j+1}) = \sum_{i=B_j+1}^{B_{j+1}} \pi^{\mathrm{SSF}}_i = \frac{\lambda}{r_j} \sum_{i=0}^{B_{j+1}-1} \left(\frac{\lambda}{r_{j+1}}\right)^{i} \pi^{\mathrm{SSF}}_{B_j} = \frac{\lambda}{r_j} \cdot \frac{\displaystyle\sum_{i=0}^{B_{j+1}-1} \left(\frac{\lambda}{r_{j+1}}\right)^{i}}{\displaystyle\sum_{i=0}^{B_j-1} \left(\frac{r_j}{\lambda}\right)^{i}}\, p^{\mathrm{SSF}}(r_j). \tag{A.19} \]
According to Lemma A.3, it holds that
\[ \frac{p^{E^*(\hat{K})}(r_{j+1})}{p^{E^*(\hat{K})}(r_j)} \leqslant \frac{p^{\mathrm{SSF}}(r_{j+1})}{p^{\mathrm{SSF}}(r_j)}. \tag{A.20} \]
Thus, if $p^{\mathrm{SSF}}(r_{j+1}) \leqslant p^{E^*(\hat{K})}(r_{j+1})$, then $p^{\mathrm{SSF}}(r_j) \leqslant p^{E^*(\hat{K})}(r_j)$; while if $p^{\mathrm{SSF}}(r_j) >
p^{E^*(\hat{K})}(r_j)$, then $p^{\mathrm{SSF}}(r_{j+1}) > p^{E^*(\hat{K})}(r_{j+1})$. Therefore, there exists $j^* \in \{1, 2, \ldots, \hat{K}-2\}$ satisfying
\[ p^{\mathrm{SSF}}(r_j) \begin{cases} > p^{E^*(\hat{K})}(r_j), & j^*+1 \leqslant j \leqslant \hat{K}-1, \\ \leqslant p^{E^*(\hat{K})}(r_j), & 1 \leqslant j \leqslant j^*. \end{cases} \tag{A.21} \]
137
Theoretical results for Jockeying Job-Assignments
That is
FSSFA PSSF
A �FE⇤(bK)A PE⇤(bK)
A =bK�1
Âj=1
F j
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
=j⇤
Âj=1
F j
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
+bK�1
Âj= j⇤+1
F j
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
= FbK�1j⇤+1
bK�1
Âj= j⇤+1
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
�F j⇤1
j⇤
Âj=1
⇣
pE⇤(bK)(r j)� pSSF(r j)⌘
(A.22)
where we recall F j = Â ji=1 ei/r j, and define
F ji =
jÂ
j0=iF j0 |pSSF(r j0)� pE⇤(bK)(r j0)|
jÂ
j0=i|pSSF(r j0)� pE⇤(bK)(r j0)|
, i = 1,2, . . . , j.
Then, according to (A.22), we obtain
FSSFA PSSF
A �FE⇤(bK)A PE⇤(bK)
A �F j⇤1
bK�1
Âj=1
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
. (A.23)
SincebK�1Âj=1
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
0, (A.23) implies
FSSFA PSSF
A �FE⇤(bK)A PE⇤(bK)
A �Fbk�1
bK�1
Âj=1
⇣
pSSF(r j)� pE⇤(bK)(r j)⌘
. (A.24)
This proves the lemma.
Proof of Proposition 3.7.
Proof. Since $F^{SSF}_B = F^{E^*(\hat K)}_B$ and $F^{SSF}_C = F^{E^*(\hat K)}_C$ by Lemma 3.3, the condition (3.73) of our feasible region can be written as
$$F^{SSF}_A P^{SSF}_A - F^{E^*(\hat K)}_A P^{E^*(\hat K)}_A + F^{SSF}_B\left(P^{SSF}_B - P^{E^*(\hat K)}_B\right) + F^{SSF}_C\left(P^{SSF}_C - P^{E^*(\hat K)}_C\right) \ge 0.$$
Based on Lemma A.4,
$$F^{SSF}_A P^{SSF}_A - F^{E^*(\hat K)}_A P^{E^*(\hat K)}_A + F^{SSF}_B\left(P^{SSF}_B - P^{E^*(\hat K)}_B\right) + F^{SSF}_C\left(P^{SSF}_C - P^{E^*(\hat K)}_C\right) \\
\ge F_{\hat K-1}\left(P^{SSF}_A - P^{E^*(\hat K)}_A\right) + F^{SSF}_B\left(P^{SSF}_B - P^{E^*(\hat K)}_B\right) + F^{SSF}_C\left(P^{SSF}_C - P^{E^*(\hat K)}_C\right) \\
= \left(F^{SSF}_B - F_{\hat K-1}\right)\left(P^{E^*(\hat K)}_A - P^{SSF}_A\right) + \left(F^{SSF}_C - F^{SSF}_B\right)\left(P^{SSF}_C - P^{E^*(\hat K)}_C\right) \\
\ge \left(F^{SSF}_B - F_{\hat K-1}\right)\left(P^{E^*(\hat K)}_A - P^{SSF}_A\right) \ge 0. \qquad (A.25)$$
The identity is obtained based on $P^{\phi}_A + P^{\phi}_B + P^{\phi}_C = 1$ for $\phi = SSF, E^*(\hat K)$. Since $P^{SSF}_C \ge P^{E^*(\hat K)}_C$ by Lemma 3.4, we obtain the second inequality. Note that $P^{SSF}_A < P^{E^*(\hat K)}_A$ (equality is not achieved) requires $\hat K < K$. Thus, (3.73) holds true, which is equivalent to $\{2,3,\ldots,K-1\} \subseteq \mathbb{F}_{\hat K}$.
A.8 Lemmas for Proposition 3.6
Lemma A.5. For a general system, if $\frac{\sqrt{2}-1}{2}\mu_3 \ge \mu_1$ and $e_3/\mu_3 \ge \sqrt{2}\, e_2/\mu_2$, then
$$\mu_2 e_3 - e_2\mu_3 \ge 2(\mu_1 e_2 - e_1\mu_2). \qquad (A.26)$$
Proof. For clarity of presentation, let $a_1 = e_1/\mu_1$, $a_2 = e_2/\mu_2$ and $a_3 = e_3/\mu_3$. Inequality (A.26) is equivalent to
$$a_3\mu_2\mu_3 - a_2\mu_2\mu_3 \ge 2(a_2\mu_1\mu_2 - a_1\mu_1\mu_2), \qquad (A.27)$$
or
$$\frac{a_3-a_2}{a_2-a_1} \ge \frac{2\mu_1}{\mu_3}. \qquad (A.28)$$
Using $a_3 \ge \sqrt{2}\,a_2$, we obtain the inequality
$$\frac{a_3-a_2}{a_2-a_1} \ge \frac{(\sqrt{2}-1)\,a_2}{a_2-a_1} \ge \sqrt{2}-1 = \frac{2-\sqrt{2}}{\sqrt{2}}. \qquad (A.29)$$
A sufficient condition for (A.28) is then that
$$\frac{2-\sqrt{2}}{\sqrt{2}} \ge \frac{2\mu_1}{\mu_3}, \qquad (A.30)$$
or $\frac{\mu_1}{\mu_3} \le \frac{\sqrt{2}-1}{2}$. As the latter holds true, the lemma is proven.
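Inequality (A.26) can also be sanity-checked by random sampling under the hypotheses of Lemma A.5. The sketch below is illustrative only; the sampling ranges are arbitrary choices, constrained so that $\frac{\sqrt 2-1}{2}\mu_3 \ge \mu_1$ and $e_3/\mu_3 \ge \sqrt 2\, e_2/\mu_2$ hold.

```python
import math
import random

random.seed(1)
for _ in range(10000):
    mu3 = random.uniform(1.0, 10.0)
    # Hypothesis of Lemma A.5: mu1 <= (sqrt(2) - 1)/2 * mu3
    mu1 = random.uniform(1e-3, (math.sqrt(2) - 1) / 2 * mu3)
    mu2 = random.uniform(1e-3, 10.0)
    e1 = random.uniform(1e-3, 5.0)
    e2 = random.uniform(1e-3, 5.0)
    # Hypothesis: e3/mu3 >= sqrt(2) * e2/mu2
    e3 = math.sqrt(2) * (e2 / mu2) * mu3 * random.uniform(1.0, 3.0)
    assert mu2 * e3 - e2 * mu3 >= 2 * (mu1 * e2 - e1 * mu2) - 1e-9   # (A.26)
```

No counterexample is found, consistent with the lemma.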
Lemma A.6. For an extended policy with $k \in [1,B_1]$, if $\lambda \ge \mu_1+\mu_2+\mu_3$, then, writing $G_j(m) = \sum_{i=1}^{m}\left(\frac{\lambda}{r_j}\right)^i = \frac{\frac{\lambda}{r_j}\left(\left(\frac{\lambda}{r_j}\right)^{m}-1\right)}{\frac{\lambda}{r_j}-1}$ and $H(k) = \frac{\left(\frac{\lambda}{\mu_1}\right)^{k}-1}{\frac{\lambda}{\mu_1}-1}$, with $r_1 = \mu_1$,
$$\left(\frac{\lambda}{\mu_1}\right)^{k}\left(\frac{\lambda}{r_2}\right)^{B_2}\left(\frac{\lambda}{r_1}\right)^{B_1} G_3(B_3)\left[G_2(B_1+B_2-k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right] \\
+ \frac{\lambda}{\mu_1}\left(\frac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right] \ge 0. \qquad (A.31)$$
Proof. Since $G_2(B_1+B_2-k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2) = G_2(B_1-k)$, the inequality (A.31) is the same as
$$\left(\frac{\lambda}{\mu_1}\right)^{k}\left(\frac{\lambda}{r_2}\right)^{B_2}\left(\frac{\lambda}{r_1}\right)^{B_1} G_3(B_3)\, G_2(B_1-k) + \frac{\lambda}{\mu_1}\left(\frac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right] \ge 0,$$
or, dividing by the common positive factor $\frac{\lambda}{\mu_1}\left(\frac{\lambda}{r_2}\right)^{B_2} G_3(B_3)$ and using $r_1 = \mu_1$,
$$\left(\frac{\lambda}{r_1}\right)^{B_1+k-1} G_2(B_1-k) + \left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1) \ge 0,$$
or, expanding the geometric sums,
$$\sum_{i=1}^{k-1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1+i-1} - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\right] + \sum_{i=k}^{B_1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1+k-1}\left(\frac{\lambda}{r_2}\right)^{i-k} - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\right] \ge 0,$$
or
$$\sum_{i=1}^{k-1}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1-k} - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\right] + \sum_{i=k}^{B_1}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\left(\frac{\lambda}{r_2}\right)^{i-k}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1-i} - \left(\frac{\lambda}{r_2}\right)^{B_1-i}\right] \ge 0. \qquad (A.32)$$
Since $\lambda \ge \mu_1+\mu_2+\mu_3 > r_2 > \mu_1$, we have $\frac{\lambda}{\mu_1} \ge \frac{\lambda}{r_2} > 1$, so every bracket in (A.32) is nonnegative. As the inequality (A.32) holds, Lemma A.6 is proven.
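The expanded-sum form (A.32) lends itself to a direct numeric check. The following sketch (an illustration under the hypotheses of Lemma A.6, with arbitrary sampling ranges) evaluates the left-hand side of (A.32) for random parameters and confirms it is nonnegative.

```python
import random

random.seed(2)
for _ in range(500):
    mu1 = random.uniform(0.5, 3.0)
    mu2 = random.uniform(0.5, 3.0)
    mu3 = random.uniform(0.5, 3.0)
    r2 = mu1 + mu2
    lam = (mu1 + mu2 + mu3) * random.uniform(1.0, 3.0)  # lambda >= mu1 + mu2 + mu3
    B1 = random.randint(2, 6)
    k = random.randint(1, B1)
    a, b = lam / mu1, lam / r2                          # note a >= b > 1
    lhs = sum(a ** (i + k - 1) * (a ** (B1 - k) - b ** (B1 - k))
              for i in range(1, k)) \
        + sum(a ** (i + k - 1) * b ** (i - k) * (a ** (B1 - i) - b ** (B1 - i))
              for i in range(k, B1 + 1))
    assert lhs >= -1e-9                                 # (A.32)
```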
Lemma A.7. For a general system and a given integer $k \in [1,B_1]$, if $\lambda \ge \mu_1+\mu_2+\mu_3$, $\mu_3 \ge \frac{1}{B_3-1}(\mu_1+\mu_2)$ and $B_3 \ge 2$, then
$$\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k}\frac{\frac{\lambda}{r_3}\left(\left(\frac{\lambda}{r_3}\right)^{B_3}-1\right)}{\frac{\lambda}{r_3}-1} \ge \frac{\frac{\lambda}{r_2}\left(\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k}-1\right)}{\frac{\lambda}{r_2}-1}. \qquad (A.33)$$
Proof. The inequality (A.33) is equivalent to
$$\frac{\lambda}{r_3}\left(\left(\frac{\lambda}{r_3}\right)^{B_3}-1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k}\left(\frac{\lambda}{r_2}-1\right) \ge \frac{\lambda}{r_2}\left(\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k}-1\right)\left(\frac{\lambda}{r_3}-1\right),$$
or
$$\frac{\lambda}{r_3}\left(\left(\frac{\lambda}{r_3}\right)^{B_3}-1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k+1} - \left(\frac{\lambda}{r_3}-1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k+1} - \frac{\lambda}{r_3}\left(\left(\frac{\lambda}{r_3}\right)^{B_3}-1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k} + \left(\frac{\lambda}{r_3}-1\right)\frac{\lambda}{r_2} \ge 0,$$
or
$$\left(\left(\frac{\lambda}{r_3}\right)^{B_3+1} - 2\frac{\lambda}{r_3} + 1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k+1} - \frac{\lambda}{r_3}\left(\left(\frac{\lambda}{r_3}\right)^{B_3}-1\right)\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k} + \left(\frac{\lambda}{r_3}-1\right)\frac{\lambda}{r_2} \ge 0,$$
or
$$\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k}\left[\left(\frac{\lambda}{r_2}-1\right)\left(\frac{\lambda}{r_3}\right)^{B_3+1} - \left(\frac{2\lambda}{r_2}-1\right)\frac{\lambda}{r_3} + \frac{\lambda}{r_2}\right] + \left(\frac{\lambda}{r_3}-1\right)\frac{\lambda}{r_2} \ge 0.$$
Setting
$$f(\lambda) = \left(\frac{\lambda}{r_2}-1\right)\left(\frac{\lambda}{r_3}\right)^{B_3+1} - \left(\frac{2\lambda}{r_2}-1\right)\frac{\lambda}{r_3} + \frac{\lambda}{r_2},$$
it suffices to show that $f(\lambda) \ge 0$ for $\lambda \ge r_3$. We compute the derivatives; it is a simple matter to check that
$$\begin{cases} f(\lambda) = \dfrac{1}{r_2 r_3^{B_3+1}}\lambda^{B_3+2} - \dfrac{1}{r_3^{B_3+1}}\lambda^{B_3+1} - \dfrac{2}{r_2 r_3}\lambda^2 + \left(\dfrac{1}{r_3}+\dfrac{1}{r_2}\right)\lambda, \\[2mm]
f^{(1)}(\lambda) = \dfrac{B_3+2}{r_2 r_3^{B_3+1}}\lambda^{B_3+1} - \dfrac{B_3+1}{r_3^{B_3+1}}\lambda^{B_3} - \dfrac{4}{r_2 r_3}\lambda + \dfrac{1}{r_3} + \dfrac{1}{r_2}, \\[2mm]
f^{(2)}(\lambda) = \dfrac{(B_3+2)(B_3+1)}{r_2 r_3^{B_3+1}}\lambda^{B_3} - \dfrac{(B_3+1)B_3}{r_3^{B_3+1}}\lambda^{B_3-1} - \dfrac{4}{r_2 r_3}, \\[2mm]
f^{(3)}(\lambda) = \dfrac{(B_3+2)(B_3+1)B_3}{r_2 r_3^{B_3+1}}\lambda^{B_3-1} - \dfrac{(B_3+1)B_3(B_3-1)}{r_3^{B_3+1}}\lambda^{B_3-2} = \dfrac{(B_3+1)B_3\lambda^{B_3-2}}{r_3^{B_3+1}}\left((B_3+2)\dfrac{\lambda}{r_2} - B_3 + 1\right). \end{cases}$$
For $\lambda \ge r_3 > r_2$, $f^{(3)}(\lambda)$ is straightforwardly positive. Since $\mu_3 \ge \frac{1}{B_3-1}r_2$, we obtain $\frac{r_3}{r_2} \ge \frac{B_3}{B_3-1}$; hence
$$f^{(2)}(r_3) = \frac{(B_3+2)(B_3+1)}{r_2 r_3} - \frac{(B_3+1)B_3}{r_3^2} - \frac{4}{r_2 r_3} = \frac{1}{r_3^2}\left(\frac{r_3}{r_2}\bigl((B_3+1)(B_3+2)-4\bigr) - (B_3+1)B_3\right) > 0,$$
$$f^{(1)}(r_3) = \frac{B_3-1}{r_2} - \frac{B_3}{r_3} \ge 0, \qquad\text{and}\qquad f(r_3) = 0,$$
so that $f(\lambda) \ge 0$ is true for all $\lambda \ge r_3$. This establishes Lemma A.7.
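The sign of $f$ on $[r_3,\infty)$ can also be checked directly. The sketch below evaluates $f(\lambda)$ for random parameters satisfying the hypotheses of Lemma A.7; the sampling ranges are illustrative only.

```python
import random

random.seed(3)
for _ in range(500):
    B3 = random.randint(2, 8)
    mu1 = random.uniform(0.2, 2.0)
    mu2 = random.uniform(0.2, 2.0)
    # Hypothesis: mu3 >= (mu1 + mu2)/(B3 - 1)
    mu3 = (mu1 + mu2) / (B3 - 1) * random.uniform(1.0, 4.0)
    r2, r3 = mu1 + mu2, mu1 + mu2 + mu3
    lam = r3 * random.uniform(1.0, 3.0)                 # lambda >= r3
    f = (lam / r2 - 1) * (lam / r3) ** (B3 + 1) \
        - (2 * lam / r2 - 1) * (lam / r3) + lam / r2
    assert f >= -1e-9
```

Note that $f(r_3) = 0$ exactly, so the tolerance only absorbs floating-point error at the boundary.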
Lemma A.8. For a general system and a given integer $k \in [1,B_1-1]$, if $\lambda \ge \mu_1+\mu_2+\mu_3$ and $\mu_1 \le \mu_2$, then, writing $G_j(m) = \sum_{i=1}^{m}\left(\frac{\lambda}{r_j}\right)^i = \frac{\frac{\lambda}{r_j}\left(\left(\frac{\lambda}{r_j}\right)^{m}-1\right)}{\frac{\lambda}{r_j}-1}$ and $H(k) = \frac{\left(\frac{\lambda}{\mu_1}\right)^{k}-1}{\frac{\lambda}{\mu_1}-1}$, with $r_1 = \mu_1$,
$$\frac{\lambda}{\mu_1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k)\, G_2(B_2) - \left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\, G_2(B_1+B_2-k)\right] \\
+ \frac{\lambda}{\mu_1}\left(\frac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right] \\
+ 2\left(\frac{\lambda}{\mu_1}\right)^{k}\left(\frac{\lambda}{r_2}\right)^{B_2}\left(\frac{\lambda}{r_1}\right)^{B_1} G_3(B_3)\left[G_2(B_1+B_2-k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right] \ge 0. \qquad (A.34)$$
Proof. Using the identity $G_2(B_1+B_2-k) - \left(\frac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2) = G_2(B_1-k)$ and, by Lemma A.7, the bound $\left(\frac{\lambda}{r_2}\right)^{B_1+B_2-k} G_3(B_3) \ge G_2(B_1+B_2-k)$ applied to the (negatively weighted) term $-\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\, G_2(B_1+B_2-k)$, a sufficient condition for (A.34) is
$$\frac{\lambda}{\mu_1}\left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k)\, G_2(B_2) + \frac{\lambda}{\mu_1}\left(\frac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\left[2\left(\frac{\lambda}{r_1}\right)^{B_1+k-1} G_2(B_1-k) + \left(\frac{\lambda}{\mu_1}\right)^{B_1} H(k) - 2\left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right] \ge 0.$$
Since the first term is nonnegative, it suffices that the expression in the square brackets is nonnegative. This expression is equal to
$$\sum_{i=1}^{k-1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1+i-1} - 2\left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\right] + \left(\frac{\lambda}{\mu_1}\right)^{B_1+k-1} - 2\left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{2k-1} \\
+ 2\sum_{i=k+1}^{B_1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1+k-1}\left(\frac{\lambda}{r_2}\right)^{i-k} - \left(\frac{\lambda}{r_2}\right)^{B_1-k}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\right], \qquad (A.35)$$
which we can rewrite as
$$\sum_{i=1}^{k-1}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1-k} - 2\left(\frac{\lambda}{r_2}\right)^{B_1-k}\right] + \left(\frac{\lambda}{\mu_1}\right)^{2k-1}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1-k} - 2\left(\frac{\lambda}{r_2}\right)^{B_1-k}\right] \\
+ 2\sum_{i=k+1}^{B_1}\left(\frac{\lambda}{\mu_1}\right)^{i+k-1}\left(\frac{\lambda}{r_2}\right)^{i-k}\left[\left(\frac{\lambda}{\mu_1}\right)^{B_1-i} - \left(\frac{\lambda}{r_2}\right)^{B_1-i}\right]. \qquad (A.36)$$
Since $r_1 = \mu_1$, $\lambda \ge \mu_1+\mu_2+\mu_3$ and $\mu_2 \ge \mu_1$, we have $\frac{\lambda}{\mu_1} \ge \frac{\lambda}{r_2} > 1$ and, for $k \le B_1-1$,
$$\left(\frac{\lambda}{\mu_1}\right)^{B_1-k} = \left(\frac{r_2}{\mu_1}\right)^{B_1-k}\left(\frac{\lambda}{r_2}\right)^{B_1-k} \ge 2\left(\frac{\lambda}{r_2}\right)^{B_1-k},$$
because $r_2/\mu_1 = 1+\mu_2/\mu_1 \ge 2$. Hence every bracket in (A.36) is nonnegative and (A.36) is no less than zero. Thus (A.34) is true and Lemma A.8 is proven.
Lemma A.9. For a general system, if $\frac{\sqrt{2}-1}{2}\mu_3 \ge \mu_1$ and $e_3/\mu_3 \ge \sqrt{2}\, e_2/\mu_2$, then
$$\mu_1 e_3 + \mu_2 e_3 + \mu_1 e_4 + \mu_2 e_4 - e_1\mu_3 - e_1\mu_4 - e_2\mu_3 - e_2\mu_4 \\
\ge \mu_1 e_2 + \mu_1 e_3 + \mu_1 e_4 - e_1\mu_2 - e_1\mu_3 - e_1\mu_4. \qquad (A.37)$$
Proof. Inequality (A.37) is equivalent to
$$\mu_2 e_3 + \mu_2 e_4 - e_2\mu_3 - e_2\mu_4 \ge \mu_1 e_2 - e_1\mu_2. \qquad (A.38)$$
According to Lemma A.5, we have $\mu_2 e_3 - e_2\mu_3 \ge \mu_1 e_2 - e_1\mu_2$, so that a sufficient condition for (A.38) is
$$\mu_2 e_4 - e_2\mu_4 \ge 0. \qquad (A.39)$$
As this last inequality holds true, Lemma A.9 is proven.
A.9 Proof of Proposition 3.6
Proof. For clarity of presentation, let $\mathcal{E} = \mathcal{E}^{E^*_k}$, $\mathcal{L} = \mathcal{L}^{E^*_k}$, $\mathcal{E}' = \mathcal{E}^{SSF}$ and $\mathcal{L}' = \mathcal{L}^{SSF}$. Note that $\hat K = 2$ is set throughout this proof. Also, let $\pi'_n$ and $\pi_n$ be the steady-state probabilities of state $n$ under SSF and $E^*_k$, respectively. For compactness, write
$$G_j(m) = \frac{\frac{\lambda}{r_j}\left(\left(\frac{\lambda}{r_j}\right)^{m}-1\right)}{\frac{\lambda}{r_j}-1} = \sum_{i=1}^{m}\left(\frac{\lambda}{r_j}\right)^i \qquad\text{and}\qquad H(k) = \frac{\left(\frac{\lambda}{\mu_1}\right)^{k}-1}{\frac{\lambda}{\mu_1}-1},$$
with $r_1 = \mu_1$ and $r_i = \sum_{j=1}^{i}\mu_j$, as defined before. The reciprocal values of the energy efficiency for SSF and $E^*_k$ are given by
$$\frac{\mathcal{E}'}{\mathcal{L}'} = \frac{\pi'_0}{\mathcal{L}'}\Bigl(G_1(B_1)\,e_1 + \left(\tfrac{\lambda}{\mu_1}\right)^{B_1} G_2(B_2)\,(e_1+e_2) + \left(\tfrac{\lambda}{\mu_1}\right)^{B_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\,(e_1+e_2+e_3) \\
+ \left(\tfrac{\lambda}{\mu_1}\right)^{B_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,(e_1+e_2+e_3+e_4)\Bigr), \qquad (A.40)$$
and
$$\frac{\mathcal{E}}{\mathcal{L}} = \frac{\pi_0}{\mathcal{L}}\Bigl(G_1(k)\,e_1 + \left(\tfrac{\lambda}{\mu_1}\right)^{k} G_2(B_1+B_2-k)\,(e_1+e_2) + \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k} G_3(B_3)\,(e_1+e_2+e_3) \\
+ \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,(e_1+e_2+e_3+e_4)\Bigr). \qquad (A.41)$$
In addition, according to the definitions of the policies SSF and $E^*_k$ for our server farms,
$$\frac{\mathcal{L}'}{\pi'_0} = G_1(B_1)\,\mu_1 + \left(\tfrac{\lambda}{\mu_1}\right)^{B_1} G_2(B_2)\,(\mu_1+\mu_2) + \left(\tfrac{\lambda}{\mu_1}\right)^{B_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\,r_3 + \left(\tfrac{\lambda}{\mu_1}\right)^{B_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,r_4, \qquad (A.42)$$
and
$$\frac{\mathcal{L}}{\pi_0} = G_1(k)\,\mu_1 + \left(\tfrac{\lambda}{\mu_1}\right)^{k} G_2(B_1+B_2-k)\,(\mu_1+\mu_2) + \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k} G_3(B_3)\,r_3 + \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,r_4. \qquad (A.43)$$
We want to show that $\mathcal{E}'/\mathcal{L}' - \mathcal{E}/\mathcal{L} \ge 0$, which is equivalent to
$$\frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'}\cdot\frac{\mathcal{L}}{\pi_0} - \frac{\mathcal{E}/\mathcal{L}}{\pi_0/\mathcal{L}}\cdot\frac{\mathcal{L}'}{\pi'_0} \ge 0. \qquad (A.44)$$
Substituting the factors in (A.44) by (A.40)–(A.43), we obtain a long expression whose sign agrees with that of $\mathcal{E}'/\mathcal{L}' - \mathcal{E}/\mathcal{L}$; it is the sum of four terms $S_1,\ldots,S_4$, given by
$$S_1 = G_1(k)\,\mu_1\cdot\frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'} - G_1(k)\,e_1\cdot\frac{\mathcal{L}'}{\pi'_0}, \qquad (A.45)$$
$$S_2 = \left(\tfrac{\lambda}{\mu_1}\right)^{k} G_2(B_1+B_2-k)\,r_2\cdot\frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'} - \left(\tfrac{\lambda}{\mu_1}\right)^{k} G_2(B_1+B_2-k)\,(e_1+e_2)\,\frac{\mathcal{L}'}{\pi'_0}, \qquad (A.46)$$
$$S_3 = \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k} G_3(B_3)\,r_3\cdot\frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'} - \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k} G_3(B_3)\,(e_1+e_2+e_3)\,\frac{\mathcal{L}'}{\pi'_0}, \qquad (A.47)$$
and
$$S_4 = \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,r_4\cdot\frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'} - \left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_1+B_2-k}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\,(e_1+e_2+e_3+e_4)\,\frac{\mathcal{L}'}{\pi'_0}. \qquad (A.48)$$
Recall that
$$S_1+S_2+S_3+S_4 = \frac{\mathcal{E}'/\mathcal{L}'}{\pi'_0/\mathcal{L}'}\cdot\frac{\mathcal{L}}{\pi_0} - \frac{\mathcal{E}/\mathcal{L}}{\pi_0/\mathcal{L}}\cdot\frac{\mathcal{L}'}{\pi'_0},$$
and that we want to show $S_1+S_2+S_3+S_4 \ge 0$. But the sum $S_1+S_2+S_3+S_4$ is also equal to $x_1+x_2+x_3+x_4+x_5$, where
$$x_1 = (\mu_1 e_2 - e_1\mu_2)\,\frac{\lambda}{\mu_1}\left[\left(\tfrac{\lambda}{\mu_1}\right)^{B_1} H(k)\, G_2(B_2) - \left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\, G_2(B_1+B_2-k)\right], \qquad (A.49)$$
$$x_2 = (\mu_1 e_2 + \mu_1 e_3 - e_1\mu_2 - e_1\mu_3)\,\frac{\lambda}{\mu_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2} G_3(B_3)\left[\left(\tfrac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right], \qquad (A.50)$$
$$x_3 = (\mu_1 e_3 + \mu_2 e_3 - e_1\mu_3 - e_2\mu_3)\left(\tfrac{\lambda}{\mu_1}\right)^{k}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_1}\right)^{B_1} G_3(B_3)\left[G_2(B_1+B_2-k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right], \qquad (A.51)$$
$$x_4 = (\mu_1 e_2 + \mu_1 e_3 + \mu_1 e_4 - e_1\mu_2 - e_1\mu_3 - e_1\mu_4)\,\frac{\lambda}{\mu_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\left[\left(\tfrac{\lambda}{\mu_1}\right)^{B_1} H(k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right], \qquad (A.52)$$
and
$$x_5 = (\mu_1 e_3 + \mu_2 e_3 + \mu_1 e_4 + \mu_2 e_4 - e_1\mu_3 - e_1\mu_4 - e_2\mu_3 - e_2\mu_4)\left(\tfrac{\lambda}{\mu_1}\right)^{B_1+k}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\left[G_2(B_1+B_2-k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right]. \qquad (A.53)$$
We shall prove $S_1+S_2+S_3+S_4 \ge 0$ by showing $x_1+x_2+x_3+x_4+x_5 \ge 0$. A sufficient condition for this inequality is that $x_1+x_2+x_3 \ge 0$ and $x_4+x_5 \ge 0$.

First, we want to show that $x_1+x_2+x_3 \ge 0$; we rewrite $x_1+x_2+x_3$ as
$$\left[(\mu_2 e_3 - e_2\mu_3)\,y_3 + (\mu_1 e_2 - e_1\mu_2)(y_1+y_2)\right] + (\mu_1 e_3 - e_1\mu_3)(y_3+y_2),$$
where
$$\begin{cases} y_1 = x_1/(\mu_1 e_2 - e_1\mu_2), \\ y_2 = x_2/(\mu_1 e_2 + \mu_1 e_3 - e_1\mu_2 - e_1\mu_3), \\ y_3 = x_3/(\mu_1 e_3 + \mu_2 e_3 - e_1\mu_3 - e_2\mu_3). \end{cases}$$
But, by Lemma A.5, we have
$$\mu_2 e_3 - e_2\mu_3 \ge 2(\mu_1 e_2 - e_1\mu_2),$$
since $\frac{\sqrt{2}-1}{2}\mu_3 \ge \mu_1$ and $\frac{e_3}{\mu_3} \ge \sqrt{2}\,\frac{e_2}{\mu_2}$, and, as $y_3 \ge 0$, a sufficient condition for $x_1+x_2+x_3 \ge 0$ is therefore
$$2y_3 + y_1 + y_2 \ge 0 \qquad\text{and}\qquad y_3 + y_2 \ge 0.$$
These inequalities were established as Lemma A.8 and Lemma A.6, respectively, and therefore $x_1+x_2+x_3 \ge 0$.

To show $x_4+x_5 \ge 0$, we proceed as follows. Based on Lemma A.9, we obtain
$$\mu_1 e_3 + \mu_2 e_3 + \mu_1 e_4 + \mu_2 e_4 - e_1\mu_3 - e_1\mu_4 - e_2\mu_3 - e_2\mu_4 \ge \mu_1 e_2 + \mu_1 e_3 + \mu_1 e_4 - e_1\mu_2 - e_1\mu_3 - e_1\mu_4.$$
Thus, a sufficient condition for $x_4+x_5 \ge 0$ is $y_4+y_5 \ge 0$, where $y_4$ and $y_5$ are defined by
$$y_4 = \left(\tfrac{\lambda}{\mu_1}\right)^{B_1+k}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\left[G_2(B_1+B_2-k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right],$$
and
$$y_5 = \frac{\lambda}{\mu_1}\left(\tfrac{\lambda}{r_2}\right)^{B_2}\left(\tfrac{\lambda}{r_3}\right)^{B_3} G_4(B_4)\left[\left(\tfrac{\lambda}{r_1}\right)^{B_1} H(k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1)\right].$$
Note that $y_4+y_5 \ge 0$ is the same as
$$\left(\tfrac{\lambda}{\mu_1}\right)^{B_1+k-1}\left[G_2(B_1+B_2-k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2)\right] + \left(\tfrac{\lambda}{r_1}\right)^{B_1} H(k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1) \ge 0,$$
or, using $G_2(B_1+B_2-k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k} G_2(B_2) = G_2(B_1-k)$,
$$\left(\tfrac{\lambda}{\mu_1}\right)^{B_1+k-1} G_2(B_1-k) + \left(\tfrac{\lambda}{r_1}\right)^{B_1} H(k) - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\left(\tfrac{\lambda}{\mu_1}\right)^{k-1} G_1(B_1) \ge 0,$$
or
$$\sum_{i=1}^{k-1}\left(\tfrac{\lambda}{\mu_1}\right)^{i+k-1}\left[\left(\tfrac{\lambda}{\mu_1}\right)^{B_1-k} - \left(\tfrac{\lambda}{r_2}\right)^{B_1-k}\right] + \sum_{i=k}^{B_1}\left(\tfrac{\lambda}{\mu_1}\right)^{i+k-1}\left(\tfrac{\lambda}{r_2}\right)^{i-k}\left[\left(\tfrac{\lambda}{\mu_1}\right)^{B_1-i} - \left(\tfrac{\lambda}{r_2}\right)^{B_1-i}\right] \ge 0. \qquad (A.54)$$
But (A.54) is clearly true, since $\lambda/\mu_1 \ge \lambda/r_2$; then $x_4+x_5 \ge 0$. In summary, we have $x_1+x_2+x_3+x_4+x_5 \ge 0$, as desired.

All the lemmas used in the above proof are given in Appendix A.8.
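The comparison established in this proof can be illustrated numerically. The sketch below computes the reciprocal energy efficiency $\mathcal{E}/\mathcal{L}$ of $E^*_k$ for a four-server example via the geometric-sum pattern of (A.40)–(A.43) ($k = B_1$ recovering SSF) and checks that no $E^*_k$ is worse than SSF. All parameter values are hypothetical, chosen only to satisfy the hypotheses used in Lemmas A.5–A.9 (e.g., $\frac{\sqrt 2-1}{2}\mu_3 \ge \mu_1$, $e_3/\mu_3 \ge \sqrt 2\, e_2/\mu_2$, $\mu_2 \ge \mu_1$, $e_4/\mu_4 \ge e_2/\mu_2$).

```python
mu = [1.0, 1.5, 6.0, 8.0]                 # service rates mu_1..mu_4 (hypothetical)
e = [0.2, 0.45, 3.0, 4.0]                 # power consumptions e_1..e_4 (hypothetical)
B = [3, 3, 3, 3]                          # buffer-size increments B_1..B_4
r = [sum(mu[:j + 1]) for j in range(4)]   # r_i = mu_1 + ... + mu_i
lam = r[2]                                # boundary of the condition lambda >= mu1+mu2+mu3

def G(x, m):
    """Geometric sum x + x^2 + ... + x^m."""
    return sum(x ** i for i in range(1, m + 1))

def recip_efficiency(k):
    """Reciprocal energy efficiency E/L of E*_k per (A.40)-(A.43); k = B_1 gives SSF."""
    w = [G(lam / mu[0], k),
         (lam / mu[0]) ** k * G(lam / r[1], B[0] + B[1] - k),
         (lam / mu[0]) ** k * (lam / r[1]) ** (B[0] + B[1] - k) * G(lam / r[2], B[2]),
         (lam / mu[0]) ** k * (lam / r[1]) ** (B[0] + B[1] - k)
         * (lam / r[2]) ** B[2] * G(lam / r[3], B[3])]
    power = [e[0], e[0] + e[1], sum(e[:3]), sum(e)]
    rate = [mu[0], r[1], r[2], r[3]]
    energy = sum(wi * pi for wi, pi in zip(w, power))
    throughput = sum(wi * ri for wi, ri in zip(w, rate))
    return energy / throughput

ssf = recip_efficiency(B[0])              # SSF corresponds to k = B_1
for k in range(1, B[0] + 1):
    assert recip_efficiency(k) <= ssf + 1e-12
```

For these parameters the reciprocal efficiency of every $E^*_k$ is at most that of SSF, matching the direction $\mathcal{E}'/\mathcal{L}' - \mathcal{E}/\mathcal{L} \ge 0$ proved above.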
A.10 Proof of Proposition 3.8
For clarity of presentation, we define a new system associated with the system defined in Section 3.2. We refer to the system defined in Section 3.2 as the original system and to the new system as the surrogate system. Given an original system characterized by given values of $\hat K$ for $E^*$, $\lambda$, $\mu_j$ and $e_j$, $j = 1,\ldots,K$, we construct a surrogate system whose average arrival rate is also $\lambda$. This surrogate system is constructed as follows. It comprises four servers, whose service rates and power consumptions are denoted by $\nu_1$, $\nu_2$, $\nu_3$, $\nu_4$ and $\epsilon_1$, $\epsilon_2$, $\epsilon_3$, $\epsilon_4$, respectively. These service rates and power consumptions are set to satisfy the following equations.
$$\begin{cases} \displaystyle\sum_{i=1}^{B_{\hat K-1}}\left(\frac{\nu_1}{\lambda}\right)^i = \sum_{i=1}^{\hat K-2}\left(\sum_{j=1}^{B_i}\left(\frac{r_i}{\lambda}\right)^j\right)\cdot\left(\prod_{j=i+1}^{\hat K-1}\left(\frac{r_j}{\lambda}\right)^{B_j}\right) + \sum_{j=1}^{B_{\hat K-1}}\left(\frac{r_{\hat K-1}}{\lambda}\right)^j, \\[2mm] \dfrac{\nu_1+\nu_2}{\lambda} = \dfrac{\sum_{i=1}^{\hat K}\mu_i}{\lambda}, \\[2mm] \dfrac{\nu_3}{\lambda} = \dfrac{\mu_{\hat K+1}}{\lambda}, \\[2mm] \dfrac{\nu_4}{\lambda} = \dfrac{\mu_{\hat K+2}}{\lambda}, \end{cases} \qquad (A.55)$$
and
$$\begin{cases} \epsilon_1 = F^{SSF}_A\,\nu_1, \\ \epsilon_1 + \epsilon_2 = \sum_{j=1}^{\hat K} e_j, \\ \epsilon_3 = e_{\hat K+1}, \\ \epsilon_4 = e_{\hat K+2}, \end{cases} \qquad (A.56)$$
where $\mu_j$, $\lambda$ and $B_j$ are the parameters of the original system, $r_i = \sum_{j=1}^{i}\mu_j$, and $\hat K$ is the parameter of $E^*$ in the original system. The buffer size of the first server in the surrogate system is $B_{\hat K-1}$, and the buffer sizes of the second, third and fourth servers in the surrogate system are equal to $B_{\hat K}$, $B_{\hat K+1}$ and $B_{\hat K+2}$, respectively. For clarity, we denote the policy $E^*$ and the policy $E^*_k$ with parameter $\hat K$ as $E^*(\hat K)$ and $E^*_k(\hat K)$, respectively. We will analyze the original system under SSF and $E^*$ by considering the corresponding surrogate system under SSF and $E^*_{\hat K}(2)$.
For a given $\hat K$, consider the original system and the corresponding surrogate system. We divide the states of both systems into three periods denoted by A, B and C. Recall that $\theta$, $\theta \in \{A,B,C\}$, is the time period during which the total service rate of the system is in the set $R_{\theta}$ defined in (3.69).
We again recall the definition of $\pi^{\phi}_n$ in (3.64), with $\sum_{n\in\mathcal{N}}\pi^{\phi}_n = 1$, $\phi \in F$. To distinguish the two systems, we denote $\pi^{\phi}_n$ in the original system and in the surrogate system as $\pi^{\phi}_n$ and $\pi^{SS,\phi}_n$, respectively. In a similar way, we denote $P^{\phi}_{\theta}$ and $F^{\phi}_{\theta}$, which are defined in (A.13) and (3.70), as $P^{\phi}_{\theta}$, $P^{SS,\phi}_{\theta}$, $F^{\phi}_{\theta}$ and $F^{SS,\phi}_{\theta}$ for the original and the surrogate system, respectively.
Lemma A.10. Consider the system defined in Section 3.2 and its associated surrogate system, of which the parameters are defined by (A.55) and (A.56). If $K = \hat K+2$, then
$$P^{SSF}_{\theta} = P^{SS,SSF}_{\theta}, \quad \theta \in \{A,B,C\}. \qquad (A.57)$$
Proof. For $\theta \in \{A,B,C\}$ and $\phi \in \{SSF, E^*(\hat K), E^*_{\hat K}(2)\}$, we define
$$q^{\phi}_{\theta} = \sum_{\mu \in R_{\theta}} \frac{\pi^{\phi}(\mu)}{\pi^{\phi}_{B_{\hat K}}}, \qquad q^{SS,\phi}_{\theta} = \sum_{\mu \in R_{\theta}} \frac{\pi^{SS,\phi}(\mu)}{\pi^{SS,\phi}_{B_{\hat K}}}. \qquad (A.58)$$
Definition (A.55) is equivalent to $q^{SSF}_{\theta} = q^{SS,SSF}_{\theta}$, and we obtain
$$P^{SSF}_{\theta} = \frac{q^{SSF}_{\theta}}{\sum_{\theta'\in\{A,B,C\}} q^{SSF}_{\theta'}} = \frac{q^{SS,SSF}_{\theta}}{\sum_{\theta'\in\{A,B,C\}} q^{SS,SSF}_{\theta'}} = P^{SS,SSF}_{\theta}, \quad \theta \in \{A,B,C\}.$$
This proves the lemma.
Lemma A.11. Consider the system defined in Section 3.2 and its associated surrogate system, of which the parameters are defined by (A.55) and (A.56). If Condition 3.2 holds true for $m = \hat K$, then
$$\frac{\epsilon_3}{\nu_3} \ge \sqrt{2}\,\frac{\epsilon_2}{\nu_2}. \qquad (A.59)$$
Proof. Let
$$\frac{\epsilon_2}{\nu_2} = F\bigl(\nu_1, F^{SSF}_A\bigr) = \frac{\sum_{j=1}^{\hat K} e_j - F^{SSF}_A\,\nu_1}{\sum_{j=1}^{\hat K}\mu_j - \nu_1}. \qquad (A.60)$$
Then, since $F^{SSF}_A \ge \frac{e_1}{\mu_1}$ and $\nu_1 \le r_{\hat K-1}$,
$$F\bigl(\nu_1, F^{SSF}_A\bigr) \le F\Bigl(\nu_1, \frac{e_1}{\mu_1}\Bigr) \le F\Bigl(r_{\hat K-1}, \frac{e_1}{\mu_1}\Bigr), \qquad (A.61)$$
where $r_{\hat K-1} = \sum_{j=1}^{\hat K-1}\mu_j$. Note that, in (A.61), equality is achieved when $\frac{e_j}{\mu_j} = \frac{e_1}{\mu_1}$ for $j = 1,2,\ldots,\hat K-1$ and $\mu_1 \ge \mu_j$ for $j = 2,3,\ldots,\hat K-1$. We rewrite (A.61) as
$$\frac{\epsilon_2}{\nu_2} = F\bigl(\nu_1, F^{SSF}_A\bigr) \le F\Bigl(r_{\hat K-1}, \frac{e_1}{\mu_1}\Bigr) \le \frac{e_{\hat K} + \sum_{j=2}^{\hat K-1} e_j}{\mu_{\hat K}} \le \frac{1}{\sqrt{2}}\cdot\frac{e_{\hat K+1}}{\mu_{\hat K+1}} = \frac{\epsilon_3}{\sqrt{2}\,\nu_3},$$
where the last inequality is based on Condition 3.2. This proves the lemma.
Lemma A.12. Consider the system defined in Section 3.2 and its associated surrogate system. If $K = \hat K+2$, $\lambda \ge r_{\hat K+1}$, and Condition 3.2 holds true for $m = \hat K$, then
$$\sum_{n=0}^{B_{\hat K}}\pi^{E^*(\hat K)}_n - \sum_{n=0}^{B_{\hat K}}\pi^{SS,E^*_{\hat K}(2)}_n \ge \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i \ge 0. \qquad (A.62)$$
Proof. We have
$$\sum_{n=0}^{B_{\hat K}}\pi^{E^*(\hat K)}_n - \sum_{n=0}^{B_{\hat K}}\pi^{SS,E^*_{\hat K}(2)}_n \\
= \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K}\sum_{i=1}^{\hat K-1}\prod_{j=i}^{\hat K-1}\frac{r_j}{\lambda} + \sum_{i=0}^{B_{\hat K}-\hat K}\left(\frac{r_{\hat K}}{\lambda}\right)^i - \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i - \sum_{i=0}^{B_{\hat K}-\hat K-1}\left(\frac{r_{\hat K}}{\lambda}\right)^i \\
= \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K} + \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\left[\frac{r_{\hat K}}{\lambda}\sum_{i=1}^{\hat K-1}\prod_{j=i}^{\hat K-1}\frac{r_j}{\lambda} - \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i\right] \\
\ge \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\left[\frac{r_{\hat K}}{\lambda} - \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i\right]. \qquad (A.63)$$
Since $r_{\hat K} \ge (1+\sqrt{3})\,r_{\hat K-1} \ge (1+\sqrt{3})\,\nu_1$, we obtain
$$\frac{r_{\hat K}}{\lambda} - \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i \ge (\sqrt{3}+1)\frac{r_{\hat K-1}}{\lambda} - \frac{\frac{r_{\hat K-1}}{\lambda}}{1-\frac{r_{\hat K-1}}{\lambda}} = \frac{(\sqrt{3}+1)\left(1-\frac{r_{\hat K-1}}{\lambda}\right)-1}{1-\frac{r_{\hat K-1}}{\lambda}}\cdot\frac{r_{\hat K-1}}{\lambda}.$$
Together with $\lambda \ge \mu_{\hat K+1} + r_{\hat K} \ge (\sqrt{3}+2)\,r_{\hat K-1}$, we obtain $(\sqrt{3}+1)\left(1-\frac{r_{\hat K-1}}{\lambda}\right)-1 \ge 1$, and hence
$$\frac{r_{\hat K}}{\lambda} - \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i \ge \frac{\frac{r_{\hat K-1}}{\lambda}}{1-\frac{r_{\hat K-1}}{\lambda}} \ge \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i,$$
namely
$$\sum_{n=0}^{B_{\hat K}}\pi^{E^*(\hat K)}_n - \sum_{n=0}^{B_{\hat K}}\pi^{SS,E^*_{\hat K}(2)}_n \ge \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\left[\frac{r_{\hat K}}{\lambda} - \sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i\right] \ge \left(\frac{r_{\hat K}}{\lambda}\right)^{B_{\hat K}-\hat K-1}\sum_{i=1}^{\hat K}\left(\frac{\nu_1}{\lambda}\right)^i.$$
This proves the lemma.
Lemma A.13. Consider the system defined in Section 3.2 and its associated surrogate system. If $K = \hat K+2$, $\lambda \ge r_{\hat K+1}$, and Condition 3.2 holds true for $m = \hat K$, then
$$\left(P^{SS,E^*_{\hat K}(2)}_A - P^{E^*(\hat K)}_A\right)\left(F^{SS,SSF}_A - F^{SS,SSF}_B\right) + \left(P^{SS,E^*_{\hat K}(2)}_C - P^{E^*(\hat K)}_C\right)\left(F^{SS,SSF}_C - F^{SS,SSF}_B\right) \ge 0. \qquad (A.64)$$
Proof. Let
$$Q^{\phi}_{\theta} = \frac{P^{\phi}_{\theta}}{\pi^{\phi}_{B_{\hat K}}} \qquad (A.65)$$
and
$$Q^{SS,\phi}_{\theta} = \frac{P^{SS,\phi}_{\theta}}{\pi^{SS,\phi}_{B_{\hat K}}}, \qquad (A.66)$$
for $\theta = A,B,C$ and $\phi = SSF, E^*(\hat K), E^*_{\hat K}(2)$. Since $Q^{E^*(\hat K)}_C = Q^{SS,E^*_{\hat K}(2)}_C$, we get
$$\frac{P^{SS,E^*_{\hat K}(2)}_A - P^{E^*(\hat K)}_A}{P^{SS,E^*_{\hat K}(2)}_C - P^{E^*(\hat K)}_C} = \frac{Q^{SS,E^*_{\hat K}(2)}_A\Bigl(Q^{E^*(\hat K)}_A + Q^{E^*(\hat K)}_B + Q^{E^*(\hat K)}_C\Bigr) - Q^{E^*(\hat K)}_A\Bigl(Q^{SS,E^*_{\hat K}(2)}_A + Q^{SS,E^*_{\hat K}(2)}_B + Q^{SS,E^*_{\hat K}(2)}_C\Bigr)}{Q^{SS,E^*_{\hat K}(2)}_C\Bigl(Q^{E^*(\hat K)}_A + Q^{E^*(\hat K)}_B + Q^{E^*(\hat K)}_C\Bigr) - Q^{E^*(\hat K)}_C\Bigl(Q^{SS,E^*_{\hat K}(2)}_A + Q^{SS,E^*_{\hat K}(2)}_B + Q^{SS,E^*_{\hat K}(2)}_C\Bigr)} \\
= \frac{Q^{SS,E^*_{\hat K}(2)}_A Q^{E^*(\hat K)}_B + Q^{SS,E^*_{\hat K}(2)}_A Q^{E^*(\hat K)}_C - Q^{E^*(\hat K)}_A Q^{SS,E^*_{\hat K}(2)}_B - Q^{E^*(\hat K)}_A Q^{SS,E^*_{\hat K}(2)}_C}{Q^{E^*(\hat K)}_C\Bigl(Q^{E^*(\hat K)}_A + Q^{E^*(\hat K)}_B - Q^{SS,E^*_{\hat K}(2)}_A - Q^{SS,E^*_{\hat K}(2)}_B\Bigr)}. \qquad (A.67)$$
According to definitions (A.65) and (A.66), we can rewrite (A.62) as
$$Q^{E^*(\hat K)}_A + Q^{E^*(\hat K)}_B - Q^{SS,E^*_{\hat K}(2)}_A - Q^{SS,E^*_{\hat K}(2)}_B \ge Q^{SS,E^*_{\hat K}(2)}_A.$$
Then, based on Lemma A.12 and (A.67), we obtain
$$\frac{P^{SS,E^*_{\hat K}(2)}_A - P^{E^*(\hat K)}_A}{P^{SS,E^*_{\hat K}(2)}_C - P^{E^*(\hat K)}_C} \le \frac{Q^{SS,E^*_{\hat K}(2)}_A Q^{E^*(\hat K)}_B + Q^{SS,E^*_{\hat K}(2)}_A Q^{E^*(\hat K)}_C - Q^{E^*(\hat K)}_A Q^{SS,E^*_{\hat K}(2)}_B - Q^{E^*(\hat K)}_A Q^{SS,E^*_{\hat K}(2)}_C}{Q^{E^*(\hat K)}_C\, Q^{SS,E^*_{\hat K}(2)}_A} \\
= \frac{Q^{E^*(\hat K)}_B}{Q^{E^*(\hat K)}_C} + 1 - \frac{Q^{E^*(\hat K)}_A}{Q^{SS,E^*_{\hat K}(2)}_A}\left(\frac{Q^{SS,E^*_{\hat K}(2)}_B}{Q^{SS,E^*_{\hat K}(2)}_C} + 1\right). \qquad (A.68)$$
Based on definitions (A.65) and (A.66), we can obtain
$$\frac{Q^{E^*(\hat K)}_A}{Q^{SS,E^*_{\hat K}(2)}_A} \ge 0, \qquad \frac{Q^{SS,E^*_{\hat K}(2)}_B}{Q^{SS,E^*_{\hat K}(2)}_C} \ge 0 \qquad\text{and}\qquad \frac{Q^{E^*(\hat K)}_B}{Q^{E^*(\hat K)}_C} \le \frac{1}{B_{\hat K+1}+B_{\hat K+2}}. \qquad (A.69)$$
Then, according to (A.68), we obtain
$$\frac{P^{SS,E^*_{\hat K}(2)}_A - P^{E^*(\hat K)}_A}{P^{SS,E^*_{\hat K}(2)}_C - P^{E^*(\hat K)}_C} \le \frac{1}{B_{\hat K+1}+B_{\hat K+2}} + 1 \le \frac{5}{4}. \qquad (A.70)$$
According to Lemma A.12 and (A.70), a sufficient condition for (A.64) is
$$\frac{F^{SS,E^*_{\hat K}(2)}_C - F^{SS,E^*_{\hat K}(2)}_B}{F^{SS,E^*_{\hat K}(2)}_B - F^{SS,E^*_{\hat K}(2)}_A} \ge \frac{F^{SS,E^*_{\hat K}(2)}_C - F^{SS,E^*_{\hat K}(2)}_B}{F^{SS,E^*_{\hat K}(2)}_B} \ge \frac{5}{4},$$
or
$$F^{SS,E^*_{\hat K}(2)}_C \ge \frac{\sum_{j=1}^{\hat K+1} e_j}{\sum_{j=1}^{\hat K+1} \mu_j} \ge \frac{9}{4}\,F^{SS,E^*_{\hat K}(2)}_B = \frac{9}{4}\cdot\frac{\sum_{j=1}^{\hat K} e_j}{\sum_{j=1}^{\hat K} \mu_j},$$
which is given in Condition 3.2. The lemma is proven.
Lemma A.14. Consider the system defined in Section 3.2 and its associated surrogate system. If $K = \hat K+2$ and Conditions 3.1–3.2 hold true for $m = \hat K$, then
$$\sum_{\theta\in\{A,B,C\}} F^{E^*(\hat K)}_{\theta}\, P^{E^*(\hat K)}_{\theta} \le \sum_{\theta\in\{A,B,C\}} F^{SS,E^*_{\hat K}(2)}_{\theta}\, P^{SS,E^*_{\hat K}(2)}_{\theta}. \qquad (A.71)$$
Proof. Since $F^{E^*(\hat K)}_A \le F^{SSF}_A$ (Lemma 3.2), $F^{E^*(\hat K)}_B = F^{SSF}_B$ and $F^{E^*(\hat K)}_C = F^{SSF}_C$ (Lemma 3.3), we obtain
$$\sum_{\theta\in\{A,B,C\}} F^{E^*(\hat K)}_{\theta}\, P^{E^*(\hat K)}_{\theta} \le \sum_{\theta\in\{A,B,C\}} F^{SSF}_{\theta}\, P^{E^*(\hat K)}_{\theta}. \qquad (A.72)$$
By the definition (A.56), $F^{SSF}_{\theta} = F^{SS,SSF}_{\theta}$ for $\theta = A,B,C$; hence
$$\sum_{\theta\in\{A,B,C\}} F^{E^*(\hat K)}_{\theta}\, P^{E^*(\hat K)}_{\theta} \le \sum_{\theta\in\{A,B,C\}} F^{SSF}_{\theta}\, P^{E^*(\hat K)}_{\theta} = \sum_{\theta\in\{A,B,C\}} F^{SS,SSF}_{\theta}\, P^{E^*(\hat K)}_{\theta}. \qquad (A.73)$$
With Lemma A.13, we now obtain
$$\sum_{\theta\in\{A,B,C\}} F^{SS,E^*_{\hat K}(2)}_{\theta}\, P^{SS,E^*_{\hat K}(2)}_{\theta} = \sum_{\theta\in\{A,B,C\}} P^{E^*(\hat K)}_{\theta}\, F^{SS,SSF}_{\theta} + \left(P^{SS,E^*_{\hat K}(2)}_A - P^{E^*(\hat K)}_A\right)\left(F^{SS,SSF}_A - F^{SS,SSF}_B\right) \\
+ \left(P^{SS,E^*_{\hat K}(2)}_C - P^{E^*(\hat K)}_C\right)\left(F^{SS,SSF}_C - F^{SS,SSF}_B\right) \ge \sum_{\theta\in\{A,B,C\}} P^{E^*(\hat K)}_{\theta}\, F^{SS,SSF}_{\theta} \ge \sum_{\theta\in\{A,B,C\}} P^{E^*(\hat K)}_{\theta}\, F^{E^*(\hat K)}_{\theta}. \qquad (A.74)$$
This proves the lemma.
Lemma A.15. Consider the system defined in Section 3.2, with $K = \hat K+2$. We have
$$\mathscr{F}_{\hat K} \cap \mathscr{F}'_{\hat K} \subseteq \mathbb{F}_{\hat K},$$
where $\mathscr{F}'_{\hat K} \subseteq \{1,2,\ldots,K\}$ has the property that $m \in \mathscr{F}'_{\hat K}$ if and only if Conditions 3.1 and 3.2 hold true.

Proof. We consider the original system as defined in Section 3.2 and its associated surrogate system. When the equations in (A.55) hold true and $K = \hat K+2$, Lemma A.10 implies
$$P^{SSF}_A = \sum_{n=0}^{B_{\hat K-1}}\pi^{SSF}_n = \sum_{n=0}^{B_{\hat K-1}}\pi^{SS,SSF}_n = P^{SS,SSF}_A,$$
$$P^{SSF}_B = \sum_{n=B_{\hat K-1}+1}^{B_{\hat K}}\pi^{SSF}_n = \sum_{n=B_{\hat K-1}+1}^{B_{\hat K}}\pi^{SS,SSF}_n = P^{SS,SSF}_B,$$
and
$$P^{SSF}_C = \sum_{n=B_{\hat K}+1}^{B}\pi^{SSF}_n = \sum_{n=B_{\hat K}+1}^{B}\pi^{SS,SSF}_n = P^{SS,SSF}_C.$$
Using definitions (A.55) and (A.56), we obtain
$$\frac{\epsilon_4}{\nu_4} = \frac{e_{\hat K+2}}{\mu_{\hat K+2}} \ge \frac{e_{\hat K+1}}{\mu_{\hat K+1}} = \frac{\epsilon_3}{\nu_3} \qquad (A.75)$$
and
$$\frac{\epsilon_1+\epsilon_2}{\nu_1+\nu_2} = \frac{\sum_{j=1}^{\hat K} e_j}{\sum_{j=1}^{\hat K}\mu_j} \ge F^{SSF}_A = \frac{\epsilon_1}{\nu_1}. \qquad (A.76)$$
We can rewrite (A.76) as
$$\frac{\epsilon_2}{\nu_2} \ge \frac{\epsilon_1}{\nu_1}, \qquad (A.77)$$
since $\epsilon_1$, $\epsilon_2$, $\nu_1$ and $\nu_2$ are all positive.

By Conditions 3.1–3.2 for $m = \hat K$ and Lemma A.11, we obtain
$$\frac{\epsilon_3}{\nu_3} \ge \sqrt{2}\,\frac{\epsilon_2}{\nu_2}. \qquad (A.78)$$
Therefore, under Conditions 3.1–3.2, the surrogate system defined by (A.55) and (A.56) is a special case of the system analyzed in Proposition 3.6.

Moreover, based on (A.10) and (A.56), we obtain
$$\sum_{\theta\in\{A,B,C\}} F^{SSF}_{\theta}\, P^{SSF}_{\theta} = \sum_{\theta\in\{A,B,C\}} F^{SS,SSF}_{\theta}\, P^{SS,SSF}_{\theta}. \qquad (A.79)$$
According to Lemma A.14, if $K = \hat K+2$, $\lambda \ge \sum_{j=1}^{\hat K+1}\mu_j$ and Conditions 3.1–3.2 hold true for $m = \hat K$, then
$$\sum_{\theta\in\{A,B,C\}} F^{E^*(\hat K)}_{\theta}\, P^{E^*(\hat K)}_{\theta} \le \sum_{\theta\in\{A,B,C\}} F^{SS,E^*_{\hat K}(2)}_{\theta}\, P^{SS,E^*_{\hat K}(2)}_{\theta}. \qquad (A.80)$$
Since the surrogate system satisfies the conditions of Proposition 3.6, we obtain
$$\sum_{\theta\in\{A,B,C\}} F^{SSF}_{\theta}\, P^{SSF}_{\theta} = \sum_{\theta\in\{A,B,C\}} F^{SS,SSF}_{\theta}\, P^{SS,SSF}_{\theta} \ge \sum_{\theta\in\{A,B,C\}} F^{SS,E^*_{\hat K}(2)}_{\theta}\, P^{SS,E^*_{\hat K}(2)}_{\theta} \ge \sum_{\theta\in\{A,B,C\}} F^{E^*(\hat K)}_{\theta}\, P^{E^*(\hat K)}_{\theta}, \qquad (A.81)$$
which is equivalent to $\frac{\mathcal{E}^{SSF}}{\mathcal{L}^{SSF}} \ge \frac{\mathcal{E}^{E^*(\hat K)}}{\mathcal{L}^{E^*(\hat K)}}$ for the original system. This proves the lemma.
Proof of Proposition 3.8.
Proof. According to Lemma A.15 and Corollary 3.1, if $K \ge \hat K+2$, $\lambda \ge \sum_{j=1}^{\hat K+1}\mu_j$, and Conditions 3.1–3.2 hold true for $m = \hat K$, then $\frac{\mathcal{E}^{SSF}}{\mathcal{L}^{SSF}} \ge \frac{\mathcal{E}^{E^*(\hat K)}}{\mathcal{L}^{E^*(\hat K)}}$. This proves Proposition 3.8.
Appendix B
Theoretical Proofs for Non-Jockeying
Job-Assignments
The system model and notation used in this appendix have been defined in Chapter 4.
B.1 Proof for Proposition 4.1
Lemma B.1. For the system defined in Section 4.2 with exponentially distributed job sizes, let $a^{\nu,\gamma}_j(n_j)$ denote the action taken in state $n_j$ under the policy that optimizes (4.11) for a given $\nu$. The following statements are equivalent:
1. $a^{\nu,\gamma}_j(n_j) = 1$;
2. $\nu \le \lambda\left(V^{\nu}_j(n_j+1,R^{\gamma}_j) - V^{\nu}_j(n_j,R^{\gamma}_j)\right)$;
where $n_j = 1,2,\ldots,B_j-1$, $j = 1,2,\ldots,K$, and $\nu \in \mathbb{R}$.
Proof. According to (4.11) and (4.12), observe that, for a given Rgj , n j = 1,2, . . . ,B j�
1, j 2J , there exists a value n
⇤j (n j,R
gj) 2 R, satisfying
an ,gj (n j) =
8
>
<
>
:
1, n n
⇤j (n j,R
gj),
0, otherwise.
161
Theoretical Proofs for Non-Jockeying Job-Assignments
We rewrite (4.11) as
V n
j (n j,Rgj) = max
⇢
µ j� e⇤e j�gµ j
+V n(n j�1,R⇤j),
µ j� e⇤e j�gl +µ j
� n
l +µ j+
l
l +µ jV n(n j +1,R⇤j)+
µ j
l +µ jV n(n j�1,R⇤j)
�
,
(B.1)
where 1l+µ j
and 1µ j
are the average lengths of one sojourn in state n j with and without
new arrivals (tagged and untagged), respectively. We obtain
n
⇤j (n j,R
gj) = l
⇣
V n
j (n j +1,Rgj)�V n
j (n j�1,Rgj)⌘
�l (µ j� e⇤e j�g)
µ j(B.2)
Again, by (B.1), if an ,gj (n j) = 1, then
V n
j (n j,Rgj)�V n
j (n j�1,Rgj)
=l
µ j
⇣
V n
j (n j +1,Rgj)�V n
j (n j,Rgj)⌘
+µ j� e⇤e j�g
µ j� n
µ j. (B.3)
Recall that an ,gj (n j) = 1 when n n
⇤j (n j,R
gj), where the identity indicates no dif-
ference between an ,gj (n j) = 0 and an ,g
j (n j) = 1. We conclude that n
⇤j (n j,R
gj) =
l
⇣
V n
j (n j +1,Rgj)�V n
j (n j,Rgj)⌘
for each n j = 1,2, . . . ,B j � 1. This proves the
lemma.
Lemma B.2. For the system defined in Section 4.2 with exponentially distributed job
sizes, we have V n
j (n j +1,Rgj)�V n
j (n j,Rgj)�
µ j�e⇤e j�gµ j
for all n j = 1,2, . . . ,B j�1
and j = 1,2, . . . ,K,
Proof. For j = 1,2, . . . ,K and n j = B j�1, V n
j (B j,Rgj)�V n
j (B j�1,Rgj) =
µ j�e⇤e j�gµ j
.
According to (B.1), if an ,gj (n j) = 0, n j = 1,2, . . . ,B j � 1, then, V n
j (n j,Rgj)�
V n
j (n j � 1,Rgj) =
µ j�e⇤e j�gµ j
. Also, if an ,gj (n j) = 1, n j = 1,2, . . . ,B j � 1, then we
162
B.1 Proof for Proposition 4.1
obtain (B.3). Based on Lemma B.1, we find
V n
j (n j,Rgj)�V n
j (n j�1,Rgj)
�µ j� e⇤e j�g
µ j+
l
µ j
⇣
V n
j (n j +1,Rgj)�V n
j (n j,Rgj)⌘
� 1µ j
l
⇣
V n
j (n j +1,Rgj)�V n
j (n j,Rgj)⌘
=µ j� e⇤e j�g
µ j.
(B.4)
This proves the lemma.
Next we start the proof of Proposition 4.1.
Proof. For j = 1,2, . . . ,K, if an ,gj (n j) = 0 then, according to Lemmas B.1 and B.2,
we see that
n > l
⇣
V n(n j +1,Rgj)�V n(n j,R
gj)⌘
�l (µ j� e⇤e j�g)
µ j.
It remains to prove that if n >l (µ j�e⇤e j�g)
µ j, then an ,g
j (n j) = 0.
From the definition, V n
j (B j,Rgj)�V n
j (B j�1,Rgj) =
µ j�e⇤e j�gµ j
, and according to
Lemma B.1, n
⇤j (B j�1,Rg
j) =l (µ j�e⇤e j�g)
µ j.
Finally, we complete the proof by induction. Assume that n
⇤j (n,R
gj)=
l (µ j�e⇤e j�g)µ j
,
for all n � n j, n j = 2,3, . . . ,B j� 1. If n >l (µ j�e⇤e j�g)
µ j, then an ,g
j (n) = 0 for all
n� n j; that is, V n
j (n,Rgj)�V n
j (n�1,Rgj) =
µ j�e⇤e j�gµ j
, for all n� n j. Together with
Lemma B.1, n
⇤j (n j� 1,Rg
j) = l
⇣
V n
j (n j,Rgj)�V n
j (n j�1,Rgj)⌘
=l (µ j�e⇤e j�g)
µ j. In
other words, if n >l (µ j�e⇤e j�g)
µ j, then n > n
⇤j (n j�1,Rg
j); that is, an ,gj (n j�1) = 0.
This proves the lemma.
163
Theoretical Proofs for Non-Jockeying Job-Assignments
B.2 Proof for Proposition 4.2
Proof. We now dicuss the action made in state 0; that is, af jj (0). Let p j,n j be the
steady state distribution of state n j 2N j under policy f
⇤j over the process PH
j . We
consider the following problem.
max
(
�e⇤e0j , (�e⇤e0
j )p j,0 +(1�p j,0)(µ j� e⇤e j)�B j�1
Ân j=1
an
j (n j)p j,n jn�p j,0n
)
.
(B.5)
It follows that an
j (0) = 1 is equivalent to
n (1�p j,0)(µ j� e⇤e j + e⇤e0
j )
B j�1Â
n j=1an
j (n j)p j,n j +p j,0
. (B.6)
Based on Proposition 4.1, if n l (µ j�e⇤e j�g)µ j
where g is a given value, then
an ,gj (n j) = 1 for all n j = 1,2, . . . ,B j � 1; otherwise, an ,g
j (n j) = 0. According to
our definitions and Corollary 4.4.3, an
j (n j) = an ,g⇤j (n j), n j = 1,2, . . . ,B j�1, where
g⇤ > 0 is the average reward of process PHj under policy f
⇤j 2FH
j . Now we split
the establishment of (B.6) into two cases.
If n l (µ�e⇤e j�g⇤)µ j
, then, (B.6) is equivalent to
n
B j
Âi=1
⇣
l
µ j
⌘i(µ j� e⇤e j + e⇤e0
j )
B j�1Â
i=0
⇣
l
µ j
⌘i=
l
µ j
�
µ j� e⇤e j + e⇤e0j�
. (B.7)
According to the definition of our model in Section 4.2, (B.7) is valid when e
0j � 0
and n l (µ j�e⇤e j+e⇤e0j )
µ j.
If n >l (µ j�e⇤e j�g⇤)
µ j, then (B.6) is equivalent to
n l
µ j
�
µ j� e⇤e j + e⇤e0j�
. (B.8)
164
B.3 Proof for Proposition 4.3
As a consequence,
n
⇤j (0) =
l
µ j
�
µ j� e⇤e j + e⇤e0j�
. (B.9)
B.3 Proof for Proposition 4.3
Proof. By virtue of Proposition 4.2, Proposition 4.3 is proved for n j = 0. We assume
without loss of generality that n l (µ j�e⇤e j+e⇤e0j )
µ j, i.e., an ,g(0) = 1.
According to Corollary 4.4.3, we obtain
g⇤ = Ân j2N j
p j,n j
⇣
R j(n j)�an ,g⇤(n j)n⌘
. (B.10)
Together with Proposition 4.1, we rewrite n n
⇤j (n j,R
g⇤j ) for n j = 1,2, . . . ,B j�1 as
n l
µ j
�
µ j� e⇤e j� (1�p j,0)(µ j� e⇤e j)�p j,0(�e
0j )+n(1�p j,B j))
�
=l
µ j
�
p j,0(µ j� e⇤e j + e⇤e0j )+n(1�p j,B j)
�
, (B.11)
where
n l
µ j(µ j� e⇤e j + e⇤e0
j ). (B.12)
As was the case for n n
⇤(n j,Rg⇤j ), for n j = 1,2, . . . ,B j� 1, we rewrite n >
n
⇤(n j,Rg⇤j ) as
n >l
µ j(µ j� e⇤e j + e⇤e0
j ). (B.13)
Equations (B.12) and (B.13) together prove the proposition.
165
Theoretical Proofs for Non-Jockeying Job-Assignments
B.4 Consequences of the averaging principle
For K0i , i = 1,2, . . . , K with ÂK
i=1 K0i = K0, define random variables xt as follows.
Let t0,k, k = 1,2, . . . and ti,k, i = 1,2, . . . ,K0, k = 1,2, . . ., be the times of the kth
arrival and of the kth departure from server i, respectively. We assume without loss
of generality that K = K0. For the server farm, inter-arrival and inter-departure times
are positive with probability one and, also with probability one, two events will not
occur at the same time. Define a random vector x
x
x t = (x0,t ,x1,t , . . . ,xK0,t) as follows.
For j = 0,1, . . . ,K0,
x j,t =
8
>
>
>
>
>
>
>
>
>
>
<
>
>
>
>
>
>
>
>
>
>
:
t j,k� t j0,k0 , t j,k = mink00=1,2,...
{t j,k00 |t j,k00 > t},
t j0,k0 = maxj00=0,1,...,K0,k00=1,2,...
{t j00,k00 |t j00,k00 < t j,k},
t j0,k0 t < t j,k,
0, otherwise.
(B.14)
Here, the x
x
x t is almost surely continuous except for a finite number of discontinuities
of the first kind in any bounded interval of t > 0.
For x 2 RI , and i = 1,2, . . . , I, we define the action for the Whittle’s index policy
as
aindexi (x) =
8
>
>
<
>
>
:
1 i = min{i|x�i > 0, i = 1,2, . . . , I},
0 otherwise,(B.15)
where x�i = Âik=1 xk, and the action for the optimal solution of the relaxed problem
is
aOPTi (x) =
8
>
>
<
>
>
:
1 n
⇤i � n , xi > 0,
0 otherwise,(B.16)
for given state indices n
⇤i , i 2 ˜N {0,1}[ ˜N {0}, and n .
166
B.4 Consequences of the averaging principle
We define a function, Qf (i, i0,x,xxx ), where f 2 F, i, i0 2 ˜N {0,1} [ ˜N {0}. For
given x
x
x = (x0,x1, . . . ,xK0) 2 RK0+1 and x = (x1,x2, . . . ,xI) 2 RI , Qf (i, i0,x,xxx ) is
given, for i�1, i, i+1 2 ˜N {0,1}j [ ˜N {0,1}
j , j = 1,2, . . . , K by
Qf (i, i+1,x,xxx ) = af
i (x)1x0+ f 0
i,a(x,xxx ),
Qf (i, i�1,x,xxx ) =dx�i eÂ
j=dx�i�1e+1
1x j+ fi,a(x,xxx ),
Qf (i, i0,x,xxx ) = 0, otherwise,
(B.17)
where f is set to be either index or OPT , x�i =Âik=1 xk, and, with 0< a< 1, f 0
i,a(x,xxx )
and fi,a(x,xxx ) are appropriate functions to make Q(i, i0,x,xxx ) smooth in x for all given
x
x
x 2 RK0+1 and 0 < a < 1. Here a is a parameter controlling the Lipschitz constant.
Then, for any given 0 < a < 1, Qf (i, i0,x,xxx ) is bounded and satisfies a Lipschitz
condition over any bounded set of x 2 RI and x
x
x 2 RK0+1.
For 0 < a < 1 and e > 0, we define Xf ,et to be a solution of the differential
equation
Xf ,et = bf (Xe
t ,xxx t/e
)
= Âi02 ˜N {0,1}[ ˜N {0}
Qf (i0, i,Xe
t ,xxx t/e
)�Qf (i, i0,Xe
t ,xxx t/e
), (B.18)
where f is set to be either index or OPT . It follows that bf (Xe
t ,xxx t/e
) also satisfies a
Lipschitz condition on bounded sets in RI⇥RK0+1.
From the above definitions, for any x 2 RI , d > 0, there exists bf
(x) satisfying
limT!+•
P⇢
�
�
�
�
1T
Z t+T
tbf (x,xs)ds�bf
(x)�
�
�
�
> d
�
= 0, (B.19)
uniformly in t > 0. Let xf
t be the solution of xf
t = bf
(xt), xf
0 = Xf ,e0 .
167
Theoretical Proofs for Non-Jockeying Job-Assignments
Now we invoke [114, Chapter 7, Theorem 2.1]: if (B.19) holds, and $\mathbb{E}\bigl|b^f(x,\boldsymbol{\xi}_t)\bigr|^2 < +\infty$ for all $x \in \mathbb{R}^I$, then, for any $T > 0$ and $\delta > 0$,
\[
\lim_{\varepsilon \to 0} \mathbb{P}\left\{ \sup_{0 \leq t \leq T} \left| X^{f,\varepsilon}_t - \bar{x}^f(t) \right| > \delta \right\} = 0.
\tag{B.20}
\]
We scale the time line of the stochastic process by $\varepsilon > 0$. As $\varepsilon$ tends to zero, time is sped up, and in this way the stochastic process $\{X^{f,\varepsilon}_t\}$, driven by the random variable $\boldsymbol{\xi}_{t/\varepsilon}$, converges to the deterministic process $\{\bar{x}^f(t)\}$ defined by the differential equation $\dot{\bar{x}}^f_t = \bar{b}^f(\bar{x}_t)$.
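This averaging effect is easy to see in a toy example. The sketch below is my own construction, not the thesis model: the scalar ODE $\dot{X} = \xi_{t/\varepsilon} - X$ is driven by noise resampled i.i.d.\ from an exponential distribution with mean $1$ on the fast time scale $\varepsilon$, so the averaged ODE is $\dot{\bar{x}} = 1 - \bar{x}$, whose solution from $0$ is $1 - e^{-t}$. Shrinking $\varepsilon$ shrinks the sup-norm gap over a fixed horizon, in the spirit of (B.20).

```python
# Toy illustration (my construction, not the thesis model) of the
# averaging principle behind (B.19)-(B.20): the ODE
#     dX/dt = b(X, xi_{t/eps}) = xi_{t/eps} - X,
# driven by noise resampled i.i.d. from Exp(1) every eps time units,
# tracks the averaged ODE dx/dt = E[xi] - x = 1 - x as eps -> 0.
import math
import random

def sup_error(eps, T=5.0, dt=1e-3, seed=0):
    """Sup-norm gap on [0, T] between X^{eps} and the averaged solution."""
    rng = random.Random(seed)
    x = 0.0                         # X^{eps}_0 = xbar_0 = 0
    xi = rng.expovariate(1.0)       # current value of the fast noise
    next_jump = eps                 # noise is resampled every eps units
    err, t = 0.0, 0.0
    while t < T:
        if t >= next_jump:          # fast clock ticks: resample the noise
            xi = rng.expovariate(1.0)
            next_jump += eps
        x += (xi - x) * dt          # Euler step of dX/dt = xi - X
        t += dt
        xbar = 1.0 - math.exp(-t)   # averaged solution from x(0) = 0
        err = max(err, abs(x - xbar))
    return err

# The sup-norm gap shrinks as eps decreases.
print(sup_error(0.5), sup_error(0.005))
```

The fluctuation of $X^{\varepsilon}$ around the averaged trajectory is of order $\sqrt{\varepsilon}$ here, which is why the slow scale sees only the mean of the fast noise.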
Now we interpret the scaling by $\varepsilon$ in another way. Along similar lines, for a positive integer $n$ and a scaled system, where $K_i = nK^0_i$ replaces $K^0_i$, $i = 1,2,\ldots,K$, and $K = n\sum_{i=1}^{K} K^0_i$, we define $t^n_{j,k}$, $j = 0,1,\ldots,K$, $k = 1,2,\ldots$, and the random variables $\boldsymbol{\xi}^n_t$ by analogy with the unscaled system. Then the random variables $\xi^n_{j,t}$, $j = 1,2,\ldots,K$, and $\xi^n_{0,t}$ are exponentially distributed with rate $\lambda^0_i$, $i = \lfloor (j-1)/n \rfloor + 1$, and $n\lambda^0_0$, respectively, where $\lambda^0_i$, $i = 0,1,\ldots,K_0$, are the corresponding rates for the random variables $\xi_{i,t}$. We then define $Q^{f,n}(i,i',x,\boldsymbol{\xi}^n)$ for $x \in \mathbb{R}^I$ and $\boldsymbol{\xi}^n \in \mathbb{R}^{nK_0+1}$ as in Equation (B.17), with appropriate modifications for the change in dimension, where again the functions $f'^{\,n}_{i,\alpha}(x,\boldsymbol{\xi}^n)$ and $f^n_{i,\alpha}(x,\boldsymbol{\xi}^n)$ are appropriately defined to guarantee smoothness of $Q^{f,n}(i,i',x,\boldsymbol{\xi}^n)$ in $x$ for all given $\boldsymbol{\xi}^n \in \mathbb{R}^{nK_0+1}$ and $0 < \alpha < 1$. Here, again, $\alpha$ is a parameter controlling the Lipschitz constant, $f$ is set to be either index or OPT, and $\bar{x}_i = \sum_{k=1}^{i} x_k$.
In the same vein, for $x \in \mathbb{R}^I$ and $\boldsymbol{\xi}^n \in \mathbb{R}^{nK_0+1}$, a differential equation is given by
\[
b^{f,n}(x,\boldsymbol{\xi}^n) = \sum_{i' \in \tilde{\mathcal{N}}^{\{0,1\}} \cup \tilde{\mathcal{N}}^{\{0\}}} Q^{f,n}(i',i,x,\boldsymbol{\xi}^n) - Q^{f,n}(i,i',x,\boldsymbol{\xi}^n).
\tag{B.21}
\]
If we set $\varepsilon = 1/n$ then, for any $x \in \mathbb{R}^I$, $n > 0$ and $T > 0$, $\int_0^T b^f(x,\boldsymbol{\xi}_{t/\varepsilon})\,dt$ and $\int_0^T \bigl(b^{f,n}(nx,\boldsymbol{\xi}^n_t)/n\bigr)\,dt$ are equivalently distributed, with $f$ set to be either index or OPT. We define $Z^{f,\varepsilon}_0 = Z^{f,n}_0 = x_0/(K_0+1)$ (there is a zero-reward virtual server), and
\[
\dot{Z}^{f,n}_t = \frac{1}{n(K_0+1)}\, b^{f,n}\bigl(n(K_0+1)Z^{f,n}_t,\ \boldsymbol{\xi}^n_t\bigr),
\]
and
\[
\dot{Z}^{f,\varepsilon}_t = \frac{1}{K_0+1}\, b^f\bigl((K_0+1)Z^{f,\varepsilon}_t,\ \boldsymbol{\xi}_{t/\varepsilon}\bigr).
\]
From (B.20), we obtain
\[
\lim_{n \to +\infty} \mathbb{P}\left\{ \sup_{0 \leq t \leq T} \left| Z^{f,n}_t - \bar{x}^f(t)/(K_0+1) \right| > \delta \right\} = 0,
\tag{B.22}
\]
for $f$ set to be either index or OPT. Therefore the scaling of time by $\varepsilon = 1/n$ is equivalent to the scaling of system size by $n$.
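The elementary fact underlying this equivalence is the rate scaling used above for $\xi^n_{0,t}$: superposing $n$ independent exponential clocks of rate $\lambda$ is statistically the same as speeding a single rate-$\lambda$ clock up by a factor of $n$, because the minimum of $n$ i.i.d.\ $\mathrm{Exp}(\lambda)$ variables is $\mathrm{Exp}(n\lambda)$. The following stdlib-only check is my own sketch, not part of the proof:

```python
# Stdlib-only sanity check (mine, not from the thesis) of the scaling
# identity behind eps = 1/n: the minimum of n i.i.d. Exp(lam) clocks
# is Exp(n*lam), so n parallel clocks behave like one clock with time
# sped up by a factor of n.
import random

def mean_min_of_clocks(n, lam, trials=100_000, seed=1):
    """Monte Carlo estimate of E[min of n i.i.d. Exp(lam) variables]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.expovariate(lam) for _ in range(n))
    return total / trials

n, lam = 8, 0.5
empirical = mean_min_of_clocks(n, lam)
theoretical = 1.0 / (n * lam)   # mean of Exp(n*lam)
print(empirical, theoretical)
```

With $n = 8$ and $\lambda = 0.5$ the empirical mean concentrates around $1/(n\lambda) = 0.25$, matching the exponential of rate $n\lambda$.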
Because of the Lipschitz continuity of $Z^{f,n}_t$ and $\bar{x}^f(t)$ in $0 < \alpha < 1$, with $\lim_{\alpha \to 0} dZ^n_t/d\alpha = 0$ and $\lim_{\alpha \to 0} d\bar{x}^f(t)/d\alpha = 0$, equation (B.22) holds true in the limiting case as $\alpha \to 0$. Also, if $Z^{f,n}_0 = Z^f(0)$ and $\bar{x}^f(0)/(K_0+1) = z^f(0)$, then $\lim_{\alpha \to 0} \bar{x}^f(t)/(K_0+1) \to z^f(t)$ and $\lim_{\alpha \to 0} Z^{f,n}_t \to Z^f(t)$, where $f$ is set to be either index or OPT, $Z^f(t)$ represents the proportions of servers under policy $f$ at time $t$, and $z^f(t)$ is given by (4.19) (as defined in Section 4.4.4). This leads to (4.21).