a survey on load balancing algorithms for virtual machines ... · 2016; 00:1–20 a survey on load...

2016; 00:1–20

A Survey on Load Balancing Algorithms for Virtual MachinesPlacement in Cloud Computing

Minxian Xu1∗, Wenhong Tian 2, Rajkumar Buyya1

1Cloud Computing and Distributed Systems (CLOUDS) Labratory, Department of Computing and Information Systems,The University of Melbourne, Australia

2School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu,China

SUMMARY

The emergence of cloud computing based on virtualization technologies brings huge opportunities to hostvirtual resource at low cost without the need of owning any infrastructure. Virtualization technologiesenable users to acquire, configure and be charged on pay-per-use basis. However, Cloud data centers mostlycomprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potential variousspecifications and fluctuating resource usages, which may cause imbalanced resource utilization withinservers that may lead to performance degradation and service level agreements (SLAs) violations. To achieveefficient scheduling, these challenges should be addressed and solved by using load balancing strategies,which have been proved to be NP-hard problem. From multiple perspectives, this work identifies thechallenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, especiallyfocuses on load balancing. A detailed classification targeting load balancing algorithms for VM placementin cloud data centers is investigated and the surveyed algorithms are classified according to the classification.The goal of this paper is to provide a comprehensive and comparative understanding of existing literatureand aid researchers by providing an insight for potential future enhancements.

Received . . .

KEY WORDS: Cloud computing; Data Centers; Virtual Machine; Placement Algorithms; LoadBalancing

1. INTRODUCTION

In traditional data centers, applications are tied to specific physical servers that are often over-provisioned to deal with the upper-bound workload. Such configuration makes data centersexpensive to maintain with wasted energy and floor space, low resource utilization and significantmanagement overhead. With virtualization technology, Cloud data centers become more flexible,secure and provide better support for on-demand allocation. It hides server heterogeneity, enablesserver consolidation, and improves server utilization [1][2]. A host is capable of hosting multipleVMs with potential different resource specifications and variable workload types. Servers hostingheterogeneous VMs with variable and unpredictable workloads may cause a resource usageimbalance, which results to performance deterioration and violation of service level agreements(SLAs) [3]. Imbalance resource usage [4] can be observed in cases, such as a VM is running acomputation-intensive application while with low memory requirement.

∗Correspondence to: Department of Computing and Information Systems, Doug McDonell Building, The University ofMelbourne, Parkville 3010, VIC, Australia. E-mail: [email protected]

arX

iv:1

607.

0626

9v2

[cs

.DC

] 2

7 Ju

l 201

6

2 M.XU ET AL.

Figure 1. Application, VM and Host relationship in Cloud Data Center

Cloud data centers are highly dynamic and unpredictable due to 1) irregular resource usagepatterns of consumers constantly requesting VMs, 2) fluctuating resource usages of VMs, 3)unstable rates of arrivals and departure of data center consumers, and 4) the performance of hostswhen handling different load levels may vary greatly. These situations are easy to trigger unbalancedloads in cloud data center, and they may also lead to performance degradation and service levelagreement violations, which requires load balancing mechanism to mitigate this problem.

Load balancing in clouds is a mechanism that distributes the excess dynamic local workloadideally-balanced across all the nodes [5]. It is applied to achieve both better user satisfaction andhigher resource utilization, ensuring that no single node is overwhelmed, thus improving the systemoverall performance. For VM scheduling with load balancing objective in cloud computing, itaims to assign VMs to suitable hosts and balance the resource utilization within all of the hosts.Proper load balancing algorithms can help in utilizing the available resources optimally, therebyminimizing the resource consumption. It also helps in implementing fail-over, enabling scalability,avoiding bottlenecks and over-provisioning and reducing response time [6]. Fig. 1 shows theapplication, VM and host relationship in cloud data centers. The hosts at the bottom represent thereal resource for provisions, like CPU, memory and storage resource. Upper the hosts, the servervirtualization platform like XEN, makes the physical resource be virtualized and manages the VMshosted by hosts. The applications are executed on VMs and may have predefined dependenciesbetween them. Each host could be allocated with multiple VMs, and VMs are installed withmultiple applications. Load balancing algorithms is applied both at application level and VM level.At application level, load balancing algorithm is integrated into Application Scheduler, and at VMlevel, load balancing algorithm can be integrated into VM Manager. This survey paper mainlyfocuses on the load balancing algorithms at VM level to improve hosts performance, which is oftenmodeled as bin-packing problem and has been proved as NP-hard problem [7].

The challenges of load balancing algorithms for VM placement † on host lies in follows:Overhead: It determines the amount of overhead involved while implementing a load balancing

system. It is composed of overhead due to VM migration cost or communication cost. A well-designed load balancing algorithm should reduce overhead.

†we note load balancing algorithms for VM placement as VM load balancing algorithms in the following sections

(2016)

A SURVEY ON LOAD BALANCING ALGORITHMS FOR VM PLACEMENT IN CLOUD COMPUTING 3

Performance: It is defined as the efficiency of the system. Performance can be indicated fromusers experience and satisfaction. How to ensure performance is a considerate challenge for VMload balancing algorithms. The performance includes following perspectives:

1) Resource Utilization: It is used to measure whether a host is overloaded or underutilized.According to different VM load balancing algorithms, overloaded hosts with higher resourceutilization should be offloaded.

2) Scalability: It represents that the quality of service keeps smooth even if the number of usersincreases, which is associated to algorithm management approach, like centralized or distributed.

3) Response Time: It can be defined as the amount of time taken to react by a load balancingalgorithm in a cloud system. For better performance, this parameter should be reduced.

The point of Failure: It is designed to improve the system in such a way that the single pointfailure does not affect the provisioning of services. Like in centralized system, if one centralnode fails, then the whole system would fail, so load balancing algorithms should be designed toovercome this problem.

In this survey, we extend and complement the classifications from existing survey works throughanalyzing the different characteristics for VM load balancing comprehensively, like the schedulingscenario, management approaches, resource type, VM type uniformity and allocation dynamicity.We also summarize the scheduling metrics for VM load balancing algorithms, and these metricscould be used to evaluate the load balancing effects as well as other additional scheduling objectives.We then discuss performance evaluation approaches followed by existing work, which show thepopular realistic platforms and simulation toolkits for researching VM load balancing algorithms inCloud. Through a detailed discussion of existing VM load balancing algorithms, the strength andweakness of different algorithms are also presented in this survey.

The rest of the paper is organized as follows: Section 2 introduces the related technologyfor VM load balancing and the general VM load balancing scenarios as well as managementapproaches. Section 3 discusses models for VM load balancing, including VM resource type,VM type uniformity, VM dynamicity, and scheduling process while Section 4 presents differentscheduling metrics for load balancing algorithms. Section 5 compares different algorithms fromimplementation and evaluation perspective. Detailed introductions for a set of VM load balancingalgorithms are summarized in Section 6. Finally, conclusions and future directions are given inSection 7.

2. VM LOAD BALANCING SCENARIO AND MANAGEMENT

2.1. Related Technology

Before we discuss the VM load balancing algorithms, we firstly introduce some related technologiesfor load balancing.

Virtualization technology: Virtualization reinforces the ability and capacity of existinginfrastructure and resource, and opens opportunities for cloud data centers to host applicationson shared infrastructure.VM technology was firstly introduced in the 1960s and has been widelyexploited in recent years for consolidating hardware infrastructure in enterprises data centers withtechnologies like VMware [8] and Xen [9].

VM Migration: Live migration of virtual machines [10] means the virtual machine seems tobe responsive all the time during the migration process from the user perspective. Comparedwith traditional suspend/resume migration, live migration brings many benefits such as energysaving, load balancing and online maintenance [11]. Voorsluys et al [12] evaluate the effects of livemigration for VMs on the performance of applications running inside Xen VMs and show the resultsthat migration overhead is acceptable but cannot be disregarded. Since the live migration technologyis widely supported in the current cloud computing data center, live migration of multiple virtualmachines becomes a common activity.

VM Consolidation: The VM consolidation is also implemented in cloud computing dependingon the resource requirements of VMs. VM consolidation increases the number of servers that have

(2016)

4 M.XU ET AL.

been suspended and performs VM live migration. This also helps in implementing fault toleranceby migrating the VMs from failure.

2.2. Scenario

We outline the scenarios for VM load balancing algorithms as public cloud, private cloud and hybridcloud. Under different scenarios, the algorithms may have different constraints.

Public Cloud: The public cloud refers to when a Cloud is made available in a pay-as-you-gomanner [13]. Several key benefits to service providers are offered by the public cloud, including noinitial capital investment on infrastructure and shifting of risks to infrastructure providers. However,public clouds lack fine-grained control over data, network and security settings, which hamperstheir effectiveness in many business scenarios [14]. Various and frequently changing APIs makeit difficult to capture all the VMs and hosts information in this scenario easily, due to the lack ofstandardization. Unpredictable load or periodical load is another challenge for VM load balancingalgorithms, therefore some research has adopted historic data to predict future load to overcome thischallenge [15][16].

Private Cloud: The Private Cloud term refers to internal datacenters of a business or otherorganization not made available to the general public. Although a public Cloud has the benefitof reduced capital investment and better deployment speed, private Clouds are even more popularamong enterprises according to a survey by IDG in [17]. The survey revealed that companies tendto optimize existing infrastructure with the implementation of a private Cloud which results in alower total cost of ownership. In some academic experiments, the private clouds with mini size areimplemented to evaluate VM load balancing performance. As within private cloud, more complexload balancing algorithms could be deployed and tested by defining more constraints like limitingthe number of migrations. Compared to the public cloud, the loads are comparatively predicted andcontrolled, so heuristic algorithms like Ant Colony Optimization and Particle Swarm Optimizationcould be applied. An example of the private cloud is the intra-cloud network that connects acustomers instances among themselves and with the shared services offered by a cloud. Withina cloud, the intra-datacenter network often has quite different properties compared to the inter-datacenter network [18]. Therefore, dealing with the VM load balancing problem in a private cloud,the performance like throughput would be considered as a constraint.

Hybrid clouds: A hybrid cloud is a combination of public and private cloud models that tries toaddress the limitations of each approach. In a hybrid cloud, part of the service infrastructure runs inprivate clouds while the remaining part runs in public clouds. Hybrid clouds offer more flexibilitythan both public and private clouds. Specifically, they provide tighter control and security overapplication data compared to public clouds, while still facilitating on-demand service expansion andcontraction. On the downside, designing a hybrid cloud requires carefully determining the best splitbetween public and private cloud components [19]. Under this condition, the communication costwould be the main constraint for VM load balancing algorithms. For instance, a distributed cloudin hybrid cloud scenario, the provider may receive a request with specific locational constraints. Inaddition, in a multi-cloud that involves two or more clouds (public cloud and private cloud) [20],the migrations operations may be related to load migration from a private cloud to a public cloud.

2.3. Centralized and Distributed Management

Generally, load balancing algorithms are implemented in load scheduler, like a scheduling systemmay contain a centralized scheduler or distributed schedulers.

Centralized: The central load balancing algorithm in Clouds are commonly supported by acentralized controller that balances VMs to hosts as shown in Fig.2, like Red Hat EnterpriseVirtualization Suite [21]. The benefits of a central management algorithm for load balancing arethat they are simpler to implement, easier to manage and quicker to repair in case of a failure. Ascentral algorithms need to obtain the global information (utilization, load, connections informationand etc.), in these algorithms, the schedulers are centralized to monitor for better management,the best-fit algorithm is the typical example, as well as algorithms introduced in articles [22] [23][24] [25] [26]. In each execution process of the centralized algorithms, the statuses of all hosts are

(2016)


Figure 2. Centralized Scheduler

collected, analyzed, reordered to provide information for VM allocation. In heuristic algorithms,like greedy algorithms, the centralized scheduler allocates VMs to the hosts with the lowest load.In meta-heuristic algorithms, like genetic algorithms [27] [15], the centralized scheduler controlscrossover, mutation, interchange operations to achieve better VM-host mapping results according tofitness functions.

Distributed: Centralized load balancing algorithms rely on a single controller to monitor andbalance loads for whole system, which may be the system bottleneck. To relieve this problem, asshown in Fig. 3, a distributed load balancing algorithm enables the scheduling decision made bylocal scheduler on each node and the associated computation overhead is distributed.

The distributed algorithm eliminates the bottleneck pressure posed by the central algorithmscheduler and improves the reliability and scalability of the network. While the drawback ofdistributed algorithm is the cooperation of a set of distributed scheduler and the control planeoverhead is required. This overhead should be taken into consideration when comparing theperformance improvement [28]. Cho et al. [29] proposed ACOPS by combining ant colonyoptimization and particle swarm optimization together to improve VM load balancing effects andreduce overhead by enhancing convergence speed.

3. VM LOAD BALANCING ALGORITHM MODELING IN CLOUDS

In this section, we will discuss the details when designing VM load balancing algorithm. Basically,the algorithm should consider VM resource type, VM type uniformity, allocation dynamicity,optimization strategy, scheduling process.

(2016)

6 M.XU ET AL.

Figure 3. Distributed Scheduler

3.1. VM Resource Type

When designing load balancing algorithm for VMs, the administrator can focus on single resourcetype or multiple resource type for scheduling.

Single Resource Type: In this category, the VM resource type that considered for balancing islimited to single resource type, generally the CPU resource. This assumption is made to simplifythe load balancing process without considering other resource types, which is common in balancingVMs running computational intensive tasks.

Multiple Resource Type: Multiple resource type is considered in some algorithms, whichmonitors not only CPU load but also memory load, I/O load and etc. These algorithms admitthe fact that cloud provider would offer heterogeneous and other resource-intensive types of VMsfor resource provision. The general techniques to deal with multiple resource type are throughconfiguring different resource types with weights [25][30][16] or labeling resource type withpriorities for scheduling [23].

3.2. VM Type Uniformity

Assumptions could be made for VM type as homogeneous or heterogeneous when scheduling VMsfor load balancing.

Homogeneous: In this category, VM instances offered by cloud provider are limited to ahomogeneous type. Like the single resource type, this assumption is also made to simplify thescheduling process and ignores the diverse characteristic of tasks. While this assumption is rarelyadopted in a real cloud environment, in addition it has a negative impact on the outcome of thealgorithm and it fails to take full advantage of the heterogeneous nature of cloud resource.

Heterogeneous: Cloud service providers like Amazon has offered more than 50 types of VMs inEC2 to support various task characteristics and scheduling objectives, like for general purposes,compute optimized, memory optimized and so on. The algorithm that considers heterogeneous

(2016)


VM type selects the PM with corresponding type to allocate according to task characteristic andscheduling objectives.

3.3. VM Allocation Dynamicity

According to VM allocation dynamicity, load balancing algorithms for VM allocation can beclassified as static and dynamic categories:

Static: Algorithms in this class are offline algorithms that the information of VMs needed to beallocated to hosts is known in advance. While static resource allocation algorithms in Clouds areeasy to violate the requirements of dynamic VM allocation, in which demands are changing overthe time. The resource would be wasted if VM with adequate capacity only is lighted utilized, andperformance degradation happens if VM with limited capacity is allocated with more loads.

Dynamic: Algorithms in this class are online algorithms that VMs could be allocated dynamicallyaccording to the loads at each time interval. The load information of VM would not be obtained untilit comes into the scheduling stage. These algorithms could dynamically configure the VM placementcombining with VM migration technique.

3.4. Optimization Strategy

As a NP-hard problem, it is expensive to find the optimal solutions for algorithms. Therefore,the majority of proposed algorithms are focusing on finding approximate solutions for VM loadbalancing problem. For this category, we classify the surveyed algorithms as three types: heuristic,meta-heuristic and hybrid.

Heuristic: Heuristic is a set of constraints that aim at finding a good solution for a particularproblem [31]. The constraints are problem dependent and are designed for obtaining a solutionin a limited time. In our surveyed algorithms, algorithms have various constraints, like numberof migrations, SLAs, cost and etc., thus, the optimization functions are constructed in differentways. The advantage of heuristic algorithms is that they can find a satisfactory solution efficiently,especially in limited time cost. In addition, heuristic algorithms are easier to implement incomparison to meta-heuristic algorithms. As heuristic algorithms run fast, they are suitable foronline scheduling that requires system to response in time. Greedy algorithm is a type of heuristicalgorithms and is applied in [22] [23] [25] to quickly obtain a solution for online schedulingscenario.

Meta-heuristic: Different from heuristic algorithms, meta-heuristic algorithms are mainlydesigned for a general purpose problem [31]. Therefore, meta-heuristic algorithms follow a setof uniform procedures to construct and solve problems. The typical meta-heuristic algorithmsare inspired from nature, like Genetic algorithms, Ant Colony Optimization Particle SwarmOptimization and Honeybee Foraging algorithms. These algorithms are based on populationevolutions and obtaining the best population in each evolution and keep it into next evolution. Adistributed VM migration strategy based on Ant Colony Optimization is proposed in [16]. AntColony Optimization and Particle Swarm Optimization are combined in [29] to deal with VM loadbalancing. The results in these proposed strategies show that better load balancing effects can beachieved compared to heuristic algorithms. However, in comparison to heuristic algorithms, meta-heuristic algorithms need more time to run and find the final solution as its solution space can bequite large. Moreover, the meta-heuristic are generally stochastic processes and their convergencetime and solution results depend on the nature of problem, initial configurations and the way tosearch the solutions.

Hybrid For hybrid algorithm, heuristic algorithm is used to fulfill the initial VM placement andmeta-heuristic algorithm is used to optimize the placement of VMs during migration. It can alsobe applied as using meta-heuristic algorithms to generate a set of solutions and finally applyingheuristic algorithms to obtain the final solution. In either way, the time cost and solution space areboth reduced, while the implementation complexity increases. Thiruvenkadam et al. [32] proposeda hybrid genetic algorithm that follows the first approach.

(2016)

8 M.XU ET AL.

3.5. Scheduling Process Modeling

The load balancing scheduling process can be mainly divided into scheduling at the VM initialplacement stage and scheduling at the VM live migration stage. Some research has focused onthe VM load balancing at the initial placement stage without considering live migration [26] [23][24] [16][33]. At this stage, the key component of the scheduling process is the VM acceptancepolicy, which decides the host placement that the is allocated to. The policy generally takes the hostremaining resource into consideration.

As for the live migration stage in scheduling process, it mainly considers following aspects:(1) VM migration policies enable cloud data center to establish preferences over when VMs

are migrated to other hosts. A VM migration policy indicates when to trigger a VM migrationfrom one host to another host. Generally, a VM migration policy consists of a migration thresholdto trigger migration operations, and the threshold is decided by a data center administrator basedon the computing capabilities of each host as in Red Hat [21] and VMware [8]. For instance, aCPU-intensive host may be configured with a relatively high threshold on CPU usage, while an I/Ointensive host may be configured with a relatively low threshold on CPU usage.

(2) VM selection policies enable cloud data centers to establish polices which VMs should bemigrated from overloaded hosts. As an overloaded host has a high probability of hosting multipleVMs. A VM selection policy decides which VMs should be migrated to reduce the load of theoverloaded host and satisfy other objectives, like minimizing the number of migrations [15] [29]and reducing migration latency [15].

(3) VM acceptance policies enable cloud data center to establish approaches which VMs shouldbe accepted from other overloaded hosts in the process of conducting collaborative load balancingamong hosts by using VM live migration. A VM acceptance policy consists of 1) remaining resourceof hosts, 2) an associated resource type either CPU or memory, and 3) a category either above orbelow a certain remaining resource amount. The remaining resource amount indicates the maximumor the resource usage level that a VM is enduring. Then, VM acceptance policies are used by loadbalancing algorithms to determine whether to bid for hosting a given VM.

4. LOAD BALANCING SCHEDULING METRICS COMPARISON

For VM load balancing, there are different metrics to evaluate the performance of load balancingalgorithms. These metrics are reflected from different scheduling behavior (maximal or minimal).In this section, we introduce prominent metrics adopted in VM load balancing algorithms, likeutilization standard deviation, makespan and etc. Table I lists the metrics adopted in our surveyedalgorithms and their optimization behavior.

Load Variance: Assuming there are n hosts in data center, the utilization of host i is U(hosti),the average utilization of all hosts in data center is avg(U) = 1

n

∑ni=1 hosti, the standard deviation

of utilization [34] equals to 1n

∑ni=1(hosti − avg(U))2

Makespan: Makespan is the longest processing time on all hosts and it is one of the mostcommon criteria for evaluating a scheduling algorithm. Sometimes, keeping the load balanced is toshorten the makespan, and a shorter makespan is the ultimate purpose of a scheduling algorithm[29].

Standard Deviation of Utilization: The standard deviation of utilization is calculated as thearithmetic square root of load variance:

√1n

∑ni=1(hosti − avg(U))2, which is also a popular

metric as it can reflect the load variation on hosts.

Number of Overloaded Hosts: The overloaded threshold of host is defined as T (U), and thereare n hosts in data center, assuming that the host utilization of host i is U(hosti) , the number of

(2016)


overloaded hosts can be computed as num(T (U) ≤ U(hosti))

Percent of all VMs to be Located: algorithms defines the minimum (LOCmin) and maximum(LOCmax) percentage of all VMs to be located in each host or cloud. This metric is modeledas LOCmin ≤ (

∑ni=1

∑lj=1)xijk)/n ≤ LOCmax , 1 ≤ k ≤ m in the integer programming

formulation, where m is the number of VMs, l is the fixed number of possible hardwareconfigurations of the VMs, xijk = 1 represents vmi is allocated on hostk [24].

Quadratic Equilibrium Entropy: Using variable X to represent current information usageof VM and another variable Y to represent VM load balance measurement on one moment,HY (X), Y = 1, 2, . . . ,m, m ≥ 2, let p(y) be percentage of VM load balancing measurement vs.overall measurement on time y, as p(y) =

Hy(X)∑mj=1

Hj(X), then the quadratic equilibrium entropy is

HY (X) = −∑m

i=1 p(y)logp(y). Quadratic equilibrium entropy can reflect the stability of VM loadbalancing degree during the system life cycle [35].

Throughput: Hosts with imbalanced load would influence throughput. Therefore higherthroughput comes along better system load balancing situation.

Standard Deviation of Connections: The connections could also be regarded as a kind ofloads, and to some degree its standard deviation can also demonstrate the load balance as standarddeviation of utilization [36].

Average Imbalance Level: The average imbalance level is obtained by a series of inductions.Assuming average CPU utilization of a single CPU i is (CPUU

i ) , then the average utilization of

all CPUs in a Cloud datacenter is CPUAu =

∑Ni CPUU

i CPUni∑N

i CPUni

, where CPUni is the total number of

CPUs of server i, then the average utilization of all CPUs on server i is CPUAu =

∑Ni CPUU

i CPUni∑N

i CPUni

,where N is the total number of physical servers in a Cloud datacenter. Similarly, average utilizationof memory, network bandwidth of server i, all memories and all network bandwidth in a Clouddatacenter can be defined as MEMU

i , NETUi respectively. As for the integrated load imbalance

value (ILBi) of server i, by using variance, an integrated load imbalance value (ILBi) of server i isdefined as: (Avgi−CPUA

u )2+(Avgi−MEMAu )2+(Avgi−NETA

u )2

3 , where Avgi = (CPUUi +MEMU

i +NETU

i )/3, and ILBi could be applied to indicate the load imbalance level comparing utilizationof CPU, memory and network bandwidth of a single server itself.

Also using variance, the imbalance value of all CPUs in a data center is defined asIBLCPU =

∑Ni (CPUU

i − CPUAu )2. Similarly, imbalance values of memory (IBLmem)

and network bandwidth (IBLnet) can be calculated. Then total imbalance values of all serversin a Cloud datacenter is given by IBLtot =

∑Ni ILBi, and the average imbalance value of a

physical server i is defined as IBLPMavg = IBLtot

N , where N is the total number of servers. Asits name suggests, this value can be used to measure average imbalance level of all physicalservers. Finally, The average imbalance value of a Cloud datacenter (CDC) is defined asIBLCDC

avg = IBLCPU+IBLmem+IBLnet

N [25] .

Capacity-makespan: In any allocation of VM requests to hosts, let A(i) denote the set of VMrequests allocated to hosti. Under this allocation, hosti will have total load equal to the sum ofproduct of each required capacity and its duration. as capacity-makespani =

∑j∈A(i) djtj , where

dj is the capacity requests of VMj from a host and tj is the span of VM request j [30]. This metricscombines the load and lifecycle compared with traditional metrics without considering lifecycle.

Imbalance Score: An exponential weighting function is defined as IBScore()f, T =∫ 0 if f<T

e(f−T )/T otherwise, where f is the load usage fraction and T is the corresponding threshold for

a resource. The imbalance score for a node u is obtained by summing over the imbalance scores

(2016)

10 M.XU ET AL.

Table I. Metrics Comparison

Metrics Optimization Behavior AlgorithmLoad Variance Minimize [34]Standard Deviation of Utilization Minimize [23] [37] [38]Makespan Minimize [29]Number of Overloaded Hosts Minimize [22]Percent of all VMs to be Located in Host Minimize and Maximize [24]Quadratic Equilibrium Entropy Minimize [35] [34]Throughput Improve [32]Standard Deviation of Connections Minimize [36]Average Imbalance Level Minimize [25]Capacity-makespan Minimize [30] [26]Imbalance Score Minimize [9]Remaining Resource Standard Deviation Minimize [27]Number of Migrations Reduce or Minimize [15] [16]SLA Violations Minimize [16]

along all the dimensions of its vector, IBscore(u) =∑

i IBscore(NLFVeci(u), NTVeci(u)),where NLFVeci(u)is a short for the ith component of NodeLoadFracV ec(u) and NTV eci(u)is short for the same along NodeThresholdV ec(u). The total imbalance score of the system isobtained by summing over all the nodes as TotalIBScore =

∑u IBscore(u). This metric should

be minimized to obtain better load balance effects.

Number of Migrations: This is an auxiliary metric that represents the performance. Too manymigrations may achieve balanced loads but lead to performance degradation, therefore it is atradeoff metric between load balancing and performance.

SLA Violations: This is another auxiliary metric that represents the performance. SLA violationcan be defined as a VM can not fetch enough resources from host, like (CPU mips [16]). Toomany SLA violations would show that the hosts are not balanced well, hence this metric should beminimized.

5. PERFORMANCE EVALUATION APPROACHES COMPARISON

In this section, we will discuss some realistic platforms and simulation toolkits that have beenadopted for VM load balancing performance evaluation as illustrated in Fig. 4.

5.1. Realistic Platforms

Conducting experiments under realistic environment is more persuasive and there exist somerealistic platforms for performance testing.

OpenNebula: It is an open source platform aims to build industry standard open source cloudcomputing tool to manage the complexity and heterogeneity of large and distributed infrastructures.It also offers rich features, flexible ways, and better interoperability to build clouds. By combiningvirtual platforms like KVM, OpenNebula Cloud APIs for VMs operations and Ganymed SSH-2for resource information collection, new VM load balancing algorithm could be implemented andtested [39].

ElasticHosts It is a global cloud service provider containing geographical diverse distributionsthat offer easy-to-use Cloud Servers with instant, flexible computing capacity. As well as CloudServers, Elastichosts also offers Managed Cloud Servers, Cloud Websites, and Reseller Programs,which are easy for developers to do research [40].

EC2: Amazon EC2 is a commercial Web service platform that enables customers to rentcomputing resources from the EC2 cloud. Storage, processing and Web services are offered to

(2016)


Figure 4. Performance Evaluation Platforms for VM Load Balancing

customers. EC2 is a virtual computing environment, which enables customers to use Web serviceinterfaces to launch different instance types with a variety of operating systems [41].

There are some other popular cloud platforms, like Eucalyptus, CloudStack and OpenStack, whilethey are not applied to evaluate VM load balancing in our surveyed papers, thus, we do not introducethem in details.

5.2. Simulation Toolkits

Concerning unpredicted network environment and laboratory resource scale (like hosts), sometimesit is more convenient for developing and running simulation tools to simulate large scaleexperiments. The research on dynamic and large-scale distributed environment can be fulfilled byconstructing data center simulation system, which offers visualized modeling and simulation forlarge-scale applications in cloud infrastructure [42]. Data center simulation system can describe theapplication workload statement, which includes user information, data center position, the amountof users and data centers, and the amount of resources in each data center [43]. Under the simulateddata centers, load balancing algorithms can be easily implemented and evaluated.

CloudSim: CloudSim is an event driven simulator implemented in Java. Because of its object-oriented programming feature, CloudSim allows extensions and definition of policies in all thecomponents of the software stack, thereby making it a suitable research tool that can mimic thecomplexities arising from the environments [44].

CloudSched: CloudSched enables users developers to compare different resource schedulingalgorithms in IaaS regarding both hosts and workloads. It can also help the developer identify andexplore appropriate solutions considering different resource scheduling algorithms [42].

FlexCloud: FlexCloud is a flexible and scalable simulator that enables user to simulate theprocess of initializing cloud data centers, allocating virtual machine requests and providingperformance evaluation for various scheduling algorithms [45].

Table II lists how the experiment environments are configured for each algorithm that wesurveyed, as well as the corresponding performance improvement.

6. ALGORITHMS COMPARISON

In this section, we will discuss a few VM load balancing algorithms with the classificationsdiscussed in the previous section.

6.1. MMA

Song et al. [22] proposed a Migration Management Agent (MMA) algorithm for dynamicallybalancing virtual machine loads by considering computation and communication cost in High LevelApplication federations. The objectives of this algorithm are reducing the load of the overloadedhosts and decreasing the communication cost among hosts. In this algorithm, hosts utilizationthreshold are calculated firstly by considering weighted CPU and weighted memory utilizationat each time interval. The host would be regarded as overloaded if the host utilization surpasses

(2016)

12 M.XU ET AL.

Table II. List of Environment Configuration and Performance Improvement of VM Load BalancingAlgorithms

Algorithm Experiments Configuration Performance Improvement

MMA [22] 10 heterogeneous hosts with CentOS5.6 kernel and Xen hypervisor

It saves 22.25% average execution time compared tostatic distribution algorithm when reaching same load

balancing level.

Ni [23]

Based on OpenNebula, virtual platformis KVM, hosts are 6 IBM BladeCenter

Servers, both CPU resource and memoryresource are considered

When VMs loads increase, it reduces moreimbalance effects for any type of resource compared

with the single type of resource in OpenNebula.

Tordsson [24]Elastic host and EC2 cloud with two data

centers (in the USA and in Europe),containing 4 types of instances

Through configuring the minimum percent of VMsto be placed in each cloud under multi-cloud

environment to balance load, it could save morebudget than single cloud.

DLBA-CAB [37] 4 hosts with OpenVZ for managing VMs The algorithm convergences fast and keep the standarddeviation of load in a low range.

Yang [34] Simulation with 20 hostsCompared with no load balancing and minimum

connection algorithm, it reduces the numberof overloaded hosts.

CLBVM [36] Hosts installed with CentOS and Xenkernel, as well as Apache web server

Tests are conducted on limited capacity and resultsshow that the algorithm improves up to 20%throughput as well as load balancing effects

compared with isolated system.

Jonathan[32]Simulation with more than 100heterogeneous hosts and 1000

heterogeneous VMs

About 10% faster to detect overloaded hosts andsolve the overloaded situation to reach predefined

balanced situation, compared with algorithmwithout its load balancing mechanism.

DAIRS [25]Simulation under CloudSched with

hundreds of heterogeneous hosts andthousands heterogeneous of VMs

It reduces more than 20% average imbalance valuecompared with ZHCJ, 50% less average imbalance

value than ZHJZ and Random.

Prepartition [30]Simulation under CloudSched with

hundreds of heterogeneous hosts andthousands heterogeneous of VMs

It has 8%-13% lower average makespan andcapacity-makespan than PMG and LPT, 40%-50%

lower average makespan and capacity-makespan thanRR.

Thiru [27] Simulation with CloudSim It has lower load imbalance value compared with RR,First Fit, and Best Fit algorithms.

Hu [15]6 hosts based on OpenNebula, virtualplatform is KVM; hosts are connected

with LAN

When the system load variation is evident, it betterguarantees the system load balancing compared with

Least-loaded scheduling algorithm and rotatingscheduling algorithm.

Wen [16]Simulation with CloudSim with 2 types

of hosts and 4 types of VMs underrandom workload

It has lower load variance compared withTHR RS 0.8, MAD RS 2.5, LR RS 0.8 and

IQR RS 1.5 algorithms, by reducingabout 70%, 50%, 40%, and 40%

respectively.

ACOPS [29] Simulation on a personal computer

It can reduce more than 50%, 5%, 5%, 20%,respectively, compared with ACO, ACS(TG), SA,

and GA algorithms, as well as no worse than PRACOand FCFS+RR algorithms.

the threshold. Then loads of hosts are computed based on CPU utilization. The communicationand computation costs are modeled as constraints to achieve the load balance goal by reducing thenumber of overloaded hosts.

In the VM migration stage, the algorithm finds the overloaded host and the VM needed to bemigrated from it as well as the least loaded host. If the least loaded host is not overloaded, it sets thishost as the destination host for migration. Then the maximum accepted load would be calculated.After that, the communication cost happened between the migrated VM and rest VM is calculated.The mapping with minimum communication cost would be confirmed and reserved.

The advantage of MMA is that it considers and models communication costs between migratedand rest VM and it could dynamically balance loads under communication constraints. While

(2016)


its disadvantage is that it only considers homogeneous VMs to simulate results and only CPUutilization is considered as the load of hosts.

6.2. Virtual Machine Initial Mapping based on Multi-resource Load Balancing

Ni et al. [23] presented a VM mapping algorithm based on multi-resource load balancing and aimedto easing load crowding. This algorithm uses probability approach to adapt unbalanced loads causedby concurrent users. In the first step, the weights of each dimensional resource are determined, inwhich the weights are normalized and weights priority could be dynamically adjusted. The secondstep is scoring the node based on the weighted approach in step one. A Proportional Selection beforeRoulette Wheel approach is adopted to compute the selection probability of each node. Finally, theVM would be allocated to the host with the highest probability.

The realistic experiment shows that this approach could efficiently reduce the standard deviationof utilization of all nodes, while this algorithm mainly focuses on the initial placement of VM ratherthan in the running stage.

6.3. Scheme for optimizing virtual machines in multi-cloud environment

The cloud scheduling algorithm proposed in [24] for VM placement optimization aims to multi-objective scheduling including load balancing, performance, cost and etc. The algorithm isembedded in a cloud broker, which is responsible for optimizing VMs placement and managingthe virtual resource. The authors explore a set of algorithms that are based on integer programmingformulations and their formulation is a version of Generalized Assignment Problem. Thesealgorithms mainly focus on performance optimization, like makespan, throughput and networkbandwidth usage. The intensive experiment results show that multi-cloud placement can reducecost under the condition that load balancing constraints are satisfied.

This work has comprehensive experiments and comparisons while it mainly considers the staticscheduling for VMs rather than dynamic. Therefore, the algorithms scalability would be limitedwhen they are applied to the dynamic scenario.

6.4. DLBA-CAB

DLBA-CAB [37] is an algorithm balancing intra-cloud by adaptive live migration of virtualmachines. Its objective is making each host to achieve equilibrium of processor usage and I/O usage.It models a cost function considering weighted CPU usage and IO usage, and each host calculatesthe function values individually. In each monitor interval, two hosts are selected randomly to builda connection to find the cost difference between them. The difference value is regarded as migrationprobability, so the VMs always migrate from physical hosts with a higher cost to those with a lowerone. This algorithm does not need a central coordinator node while the loads information of otherhosts would be stored on shared storage and is updated periodically.

DLBA-CAB is an example showing how distributed load balancing algorithm for VMs isimplemented in intra-cloud with fast convergence speed to reach Nash equilibrium while its modelsimply assumes that host memory usage is always enough.

6.5. Optimized Control Strategy Combining Multi-Strategy and Prediction Mechanism

An optimized control strategy combining multi-strategy that based on prediction mechanism [34]is designed to reduce the number of overloaded hosts and avoid unnecessary migration. Theauthors also adopt a weighted function considering multiple types of resource to calculate theload of hosts and define four status domains: light-load, optimal, warning and overload to presentdifferent utilization domains. To analyze and predict future utilization for resource components,this strategy contains a prediction model that uses a set of recently utilization data series basedon an AR prediction model [46] to obtain the future utilization. Hosts with different utilization liein different domains, and different migration strategies are executed in different domains. In thelight-load domain, no migration is operated; in the optimal domain, if the prediction shows that onehost is potential to be transferred into warning domain, VM would be ready for migration; in the

(2016)

14 M.XU ET AL.

warning domain, the host would be locked and prevented to receive new tasks, migration and unlockoperation would be executed based on future predicted utilization; while in the overloaded domain,host would be locked and the VM with the highest utilization on host would be suspended andmigrated. The VM is reset when the utilization of host falls into the warning domain, and the hostis unlocked when the utilization of host lies in the optimal domain. As for choosing the migrationdestination placement, this strategy considers the characteristic of applications, like CPU intensive,I/O intensive and etc. The migration destination is selected as the host that most suitable for thepredicted resource change, like if the CPU fluctuation trend is the most influential one, the host withthe largest CPU resource is selected as the destination.

In addition to the migration process, to avoid the multiple VMs migrating to the same host andoverloading instantaneously, a three times handshaking protocol is used to confirm the ultimatemigration. With this protocol, each host maintains an acceptance queue containing VM waited to beallocated, which updates host utilization load increment along with time.

6.6. CLBVM

Bhadani et al. [36] proposed a Central Load Balancing Policy for VM (CLBVM) to balanceloads evenly in a distributed cloud computing environment. This policy is based on global stateinformation while the migration operation is combined of distributed and centralized options. Theload information collector on each host collects its load information continuously (hosts would belabeled as Heavy, Moderate, and Light) and exchanges information with a master server, whichperiodically makes decisions for load balancing that heavily loaded host would be balanced firstlywith the lightly loaded host.

This policy advances the existing model for load balancing of VM in distributed environmentand the practice in XEN shows its feasibility to improve throughput. While this policy assumes thatthe network loads are almost constant, this is not very applicable to current Cloud environment. Inaddition, the resource type of memory and I/O are rarely considered in this work.

6.7. Dynamic Load Balancer for Virtual Machine

Jonathan et al. [32] presented a distributed dynamic load balancer for VMs based on P2Parchitecture. Its objectives are reducing the load on a single host, moving a VM to a new hostwith more resources or with specialized resources. To balance the loads, the author adopts a scorefunction composite of the static score and dynamic score to represent the loads. The static scoretakes into account static resources quota reserved for a VM, and the dynamic score mainly considersthe dynamic resources like the amount of free memory. In this load balancer, the algorithm firstlycomputes the static score by multiplying the software quota and hardware quota with respectiveparameters, then it also obtains dynamic part of the score by multiplying free resources on host andresource used on VMs with respective parameters. Finally, the algorithm adds the static part thatmultiplied by a static parameter and the dynamic score to represent the exact load of each host.

After calculating the scores on all hosts, in the placement and migration processes, the algorithmselects the host that fits the static requirement VM to be placed or migrated. In this approach, theload balancers on each host cooperate together to ensure the scalability of its architecture. However,communication cost may increase rapidly with the number of hosts, which is not considered in thisarticle.

6.8. DAIRS

Tian et al. [25] introduced a dynamic and integrated resource scheduling algorithm (DAIRS) forbalancing VMs in Clouds. This algorithm treats CPU, memory and network bandwidth integratedfor both host and VM resources. DAIRS defines four types of queues for scheduling: waiting queues(VMs that are not allocated immediately but have to wait), requesting queue (new VM requests),optimizing queue (VM needs to be migrated) and deleting queue (VM request with due endingtime). Different queues have different priorities: firstly the waiting queue is checked, if it’s notempty, it picks up the VMs in the queue and allocates it to the host with the lowest utilization

(2016)


interval and enough resources; secondly, checking the requesting queue, if it is not empty and thestarting time of VM is due, it allocates VM to the host with the lowest utilization interval and enoughresources; thirdly, the optimizing queue is checked, if it has elements, based on the predefinedintegrated utilization threshold, migration process will be triggered to migrate VM with lowest loadfrom host with highest load to the host with lowest load; finally, it checks the deleting queue anddeleting the VM if its end time is due.

DAIRS is one of the earliest algorithms that explored the multiple types of resources and treatedthem as integrated value. The drawback of DAIRS is that it ignores the communication cost ofmigrations.

6.9. Prepartition

Prepartition algorithm [30] is designed for offline VM allocation considering reservation model.As it is applied to the offline scenario, all VM information would be known before the finalplacement. In this algorithm, it defines a new metric as capacity-makespan, which is computedas VM load multiplies VM capacity. Prepartition algorithm calculates a partition value as the biggervalue between the maximum capacity-makespan and average capacity-makespan of all VMs. Apartition ratio (an integer) is also defined by the user. Then for each VM, Prepartition algorithmwould partition them into multiple VMs with length as the partition value divided by partitionratio. After the new VM group is generated, the VMs are allocated one by one to the host with thelowest capacity-makespan. It is noticed that the regeneration process is before the final placement,therefore, it may not cause the instability and chaos.

Though belonging to the static algorithm, Prepartition is effective to achieve better load balanceas desired. For offline load balancing without migration, the best approach has the approximationratio as 4/3 [47]. With Prepartition, it proves that Prepartition is possible to be very close to theoptimal solution.

6.10. Hybrid Genetic based Host Load Aware Algorithm

Thiruvenkadam et al. [27] presented a hybrid genetic algorithm for scheduling and optimizing forVMs. One of its objectives is minimizing the number of migrations when balancing the VMs. Thealgorithm considers two different techniques to fulfill their goals: the first one is that initial VMpacking is done by checking the load of hosts and user constraints, and the second phase is tooptimize placement of VMs by using a hybrid genetic algorithm based on fitness functions.

For the initial VMs packing, the authors proposed a heuristic approach based on multiplepolicies. This heuristic approach would search hosts according to VM resource requirement andhost available resource, which is quite similar to a best fit algorithm. As for the hybrid geneticalgorithm at the second stage, it mainly includes four parts: the creation of an initial population,fitness function, crossover operator and mutation operator. These four parts cooperate to achieve thepredefined goal, in which firstly a novel algorithm is used to generate initial population, secondlythe solution vectors for VM allocation are selected according to the probability that is proportionalto fitness value, thirdly the algorithm crosses the chosen product vectors based on fitness value,the crossover and mutation operations continue working until the solution reaches to the objectivesituation and finally terminate.

The fitness function in this algorithm is minimizing the standard deviation of the remainingCPU, ram, and bandwidth in each host, which is one of the main objectives of this algorithm. Thecrossover operation crossovers the hosts with more remaining capacity in different solution vectorsto generate better solutions. In addition, the mutation operation removes, interchanges or shifts VMto obtain better solution vector based on the fitness function. This work is an extension for binpacking problem as it considers multiple objectives. It not only considers load balancing but alsotakes the number of hosts, energy consumption, and resource utilization rate into consideration.

(2016)

16 M.XU ET AL.

6.11. VM Scheduling Strategy based on Genetic Algorithm

Another genetic algorithm for VM scheduling is presented in [15], which sets its objectives asfinding the best mapping solutions to achieve best system load balancing and minimizing migrationtimes. Different from the genetic algorithm that uses binary codes in 6.10, this algorithm choosestree structure to mark chromosome of genes. In the initialization of population stage, the authorsfirstly compute the selection probability of every VM according to the VM set, and then based onthe probability, all of the logic disks are allocated to the smallest-loaded node in the PM set toproduce leaf node of the initial tree. A fitness function is also designed to reflect the performance ofeach node in the tree. This function helps make decisions to select node with the highest fitnessinto the child population. In the crossover operation stage, the algorithm chooses two parentalindividuals and combines them together to form a new tree that keeps the same leaf nodes anddisposes of the different leaf nodes, then the disposed leaf nodes are added to the smallest-loadedparental node. These procedures repeat until the required loop counter number is reached. As for themutation operation stage, a variation probability is given and based on it, leaf nodes are interchangedaccording to the variation probability.

This algorithm considers both the historical data and current data when computing the selectionprobability and variation probability, which captures the influence in advance. Hence, the algorithmis able to choose the solution that has least influence on the system after reallocation. Experimentsshow that better load balancing performance can be obtained compared with the least-loadedscheduling algorithm while its algorithm complexity is still open to discuss.

6.12. Distributed VM Migration Strategy based on ACO

Wen et al. [16] introduced a distributed VM migration strategy based on Ant Colony Optimization(ACO). The objectives of this strategy are achieving load balancing and reasonable resourceutilization as well as minimizing number of migrations. In this approach, the distributed localmigration agents autonomously monitors the resource utilization of each host and overcomesshortcomings of previous work with simpler trigger strategy and misuse of pheromone. The authorsredefine the pheromones as positive and negative to mark the Positive Traversing Strategy andNegative Traversing Strategy. Then a load threshold is configured by experience, once the loadthreshold is exceeded, the distributed migration agents on each host would sort all the VMsaccording to their average loads. The VM with higher load, which means it has higher priority,is put into a list. The VMs are continued being put into the list until the average load of the hostis below the threshold, and after that, the distributed agents generate some ants to start to traverse.These ants traverse through different hosts and leave pheromone based on the load on host andbandwidth between hosts. The ants produce more pheromone under the condition that the load ofdestination host is higher or the bandwidth between the hosts is less. With the iteration going on, theants are more likely to traverse through those hosts that are in high load condition according to thePositive Traversing Strategy. However, in the final iteration, these ants may traverse through hostswith the lower load as the Negative Traversing Strategy is also adopted. At the end, a list of PMswith low load condition is obtained and they can be matched with the sorted VMs that are preparedto be migrated.

The experiments under CloudSim toolkit shows that the ACO-based strategy reaches a balancedperformance for multiple objectives, like number of SLA violations, number of migrations and loadbalance level.

6.13. ACOPS

Cho et al. [29] combined ant colony optimization and particle swarm optimization (ACOPS) to dealwith VM load balancing in Clouds. Its objective combines both maximizing the balance of resourceutilization and accepting as many requests as possible. This algorithm adopts an accelerating step asPre-reject, in which the remaining memory of each server is checked to reduce solution dimensions.Before scheduling, if the maximum remaining memory is less than the memory demand of a request,the VM request is rejected. To construct the solutions for all the ants, each ant selects the next path

(2016)


Table III. Algorithm Classification for VM Model

Algorithm VM AllocationDynamicity VM Type Uniformity VM Resource Type Optimization Strategy

MMA [22] Dynamic Homogeneous CPU HeuristicNi [23] Static Homogeneous CPU & Memory HeuristicTordsson [24] Static Heterogeneous Multiple Meta-heuristicDLBA-CAB [37] Dynamic Homogeneous CPU & IO HeuristicYang [34] Dynamic Heterogeneous Multiple HeuristicCLBVM [36] Dynamic Homogeneous CPU HeuristicJonathan [32] Dynamic Heterogeneous CPU & Memory HeuristicDAIRS [25] Dynamic Heterogeneous Multiple HeuristicPrepartition [30] Static Heterogeneous Multiple HeuristicThiru [27] Dynamic Heterogeneous Multiple HybridHu [15] Dynamic Heterogeneous CPU Meta-heuristicWen [16] Static Heterogeneous Multiple Meta-heuristicACOPS [29] Dynamic Heterogeneous Multiple Meta-heuristic

according to a probability defined by the authors in the search module. After the search module, thealgorithm applies the Particle Swarm Optimization (PSO) operator to improve the results by takingadvantage of the personal best solution in PSO to help construct a better solution. After each antfinishes its solution completely, a fitness function is applied to evaluate the scheduling score of eachant, and the ants find the best solution of this iteration and update the global best solution. Insteadof using both global and local pheromone updating, the algorithm only applies global pheromoneupdating so that the paths belong to the best solution may occupy increased pheromone. Finally,ACOPS would terminate when the iteration reaches the maximum number of iteration or the globalbest solution does not change during a given time.

As a complementary for other ACO or PSO algorithms, the time complexity of ACOPS isinduced by the authors. In addition, in the experiments part, the results demonstrate the algorithmeffectiveness in balancing loads. Although the Pre-reject step accelerates the process to obtain asolution, it also rejects a set of VMs, which is not applicable in a realistic environment.

6.14. Summary

This section presents the details of the surveyed algorithms and discusses the strength and weaknessof these algorithms. Table III compares these algorithms according to their VMmodels and TableIV compares them based on the scheduling model.

7. CONCLUSIONS AND FUTURE DIRECTIONS

This paper investigates algorithms designed for resource scheduling in cloud computingenvironment. In particular, it concentrates on VM load balancing, which also refers to algorithmsbalancing VM placement on hosts. This paper presents classifications based on a comprehensivestudy on existing VM load balancing algorithms. The existing VM load balancing algorithms areanalyzed and classified with the purpose of providing readers with a decision suggestion and anoverview of the characteristic of existing related algorithms. Detailed introduction and discussionof various algorithms are provided and they aim to offer a comprehensive understanding of existingalgorithms as well as further insight into the fields future directions.

Unbalanced load from VMs may lead to host performance degradation. Therefore, VM loadbalancing algorithms are crucial to ensure performance and balance loads evenly on hosts. Toachieve the load balancing goal, some related techniques like virtualization, VM migration orVM consolidation are proposed. The first consideration for VM load balancing algorithms is thescenario, which stands for the environment that the algorithm would be applied. The scenariosinclude public cloud, private cloud, and hybrid cloud, and under different scenarios, differentconstraints exist. Another challenge for dealing with VM load balancing algorithm is scalability, as

(2016)

18 M.XU ET AL.

Tabl

eIV

.Alg

orith

mC

lass

ifica

tion

forS

ched

ulin

gM

odel

Alg

orith

mSc

enar

ioE

xper

imen

tPl

atfo

rmC

onst

rain

tsL

ive

Mig

ratio

nM

igra

tion

Cos

tC

onsi

dera

tion

Sche

dulin

gO

bjec

tive

Man

agem

ent

MM

A[2

2]Pu

blic

Clo

udSi

mul

atio

nC

ompu

tatio

n&

Com

mun

icat

ion

Cos

tsY

esC

ompu

tatio

n,C

omm

unic

atio

n

Min

Mig

ratio

nL

aten

cyC

entr

aliz

ed

Ni[

23]

Priv

ate

Clo

udR

ealis

tic(O

penN

ebul

a)L

imite

dR

esou

rce

No

No

Min

Util

.SD

Cen

tral

ized

Tord

sson

[24]

Hyb

rid

Clo

ud(M

ulti)

Rea

listic

(Ela

stic

Hos

ts+

Am

azon

)

Bud

get,

Use

rDefi

ned

No

Com

puta

tion,

Com

mun

icat

ion

Min

Cos

tsC

entr

aliz

ed

DL

BA

-CA

B[3

7]Pr

ivat

eC

loud

(Int

ra)

Rea

listic

(Ope

nVZ

)D

ownt

ime

Yes

No

Zer

oD

ownt

ime

Dis

trib

uted

Yan

g[3

4]Pr

ivat

eC

loud

Sim

ulat

ion)

Mem

ory

Cos

tof

Mig

ratio

nY

esM

emor

yC

opy

Min

Ove

rloa

ded

Cen

tral

ized

CL

BV

M[3

6]Pu

blic

Clo

udR

ealis

ticN

/AY

esM

emor

y,Fa

ult&

Tole

ranc

eIm

prov

eT

hrou

ghou

tC

entr

aliz

ed

Jona

than

[32]

Publ

icC

loud

(P2P

)Si

mul

atio

n)N

/AY

esN

oFa

ster

toSo

lve

Ove

rloa

ded

Hos

tsD

istr

ibut

ed

DA

IRS

[25]

Publ

icC

loud

Sim

ulat

ion

N/A

Yes

Com

puta

tion

Min

Imba

lanc

eL

evel

Deg

ree

Cen

tral

ized

Prep

artit

ion

[30]

Publ

icC

loud

Sim

ulat

ion

N/A

Yes

Com

puta

tion

Min

Cap

acity

-mak

espa

nC

entr

aliz

ed

Thi

ru[2

7]Pr

ivat

eC

loud

Sim

ulat

ion

Ove

rall

Loa

dY

esC

ompu

tatio

nM

inN

umbe

rof

Mig

ratio

nsC

entr

aliz

ed

Hu

[15]

Priv

ate

Clo

udR

ealis

tic(O

penN

ebul

a)A

stri

ngen

cyY

esN

oM

inN

umbe

rof

Mig

ratio

nsC

entr

aliz

ed

Wen

[16]

Priv

ate

Clo

udSi

mul

atio

nA

mou

ntof

Pher

omon

eY

esC

omm

unic

atio

nM

inN

umbe

rof

SLA

Vio

latio

nsD

istr

ibut

ed

AC

OPS

[29]

Priv

ate

Clo

udSi

mul

atio

nN

/AY

esN

oM

inN

umbe

rof

Mig

ratio

nsD

istr

ibut

ed

(2016)


the algorithm management can be either centralized or distributed. The centralized algorithms areeasier to manage but the central node may become the performance bottleneck, while the distributedalgorithms are better for scale up but management overhead like communication costs are increased.When modeling VM load balancing algorithm, some points should be considered, like consideringsingle resource type or multiple resource type, heterogeneous or homogeneous VM type, the VMscan be statically or dynamically allocated, as well as the general scheduling process.

Based on the existing work, the scheduling objectives of VM load balancing algorithms arenot only limited to traditional load balancing metrics, like load variance, standard deviations ofutilization, makespan. Some novel metrics like quadratic equilibrium entropy, capacity-makespanand etc. are also proposed to measure load balancing performance from different dimensions. Asfor performance evaluation platforms, both simulation toolkit and realistic environment are adoptedfor testing.

As for future directions, some directions include: 1) considering multiple types of resource,like CPU, memory, storage and bandwidth together; 2) considering the VM load balancing asa multi-objective problem, including times of migrations, execution, SLA violations and etc.,therefore, multi-objective optimization algorithm is required to be designed; 3) for the meta-heuristic algorithms, like genetic algorithm and ant colony optimization algorithm, their time costneeds improved; 4) evaluating algorithm performance on popular platforms, like OpenStack; 5)applying the proposed algorithms to real cloud environment.

ACKNOWLEDGEMENT

This work is supported by China Scholarship Council (CSC), Australia Research Council Future Fellowshipand Discovery Project Grants.

REFERENCES

1. Daniels J. Server virtualization architecture and implementation. Crossroads 2009; 16(1):8–12.2. Speitkamp B, Bichler M. A mathematical programming approach for server consolidation problems in virtualized

data centers. IEEE Transactions on services computing 2010; 3(4):266–278.3. Gutierrez-Garcia JO, Ramirez-Nafarrate A. Agent-based load balancing in cloud data centers. Cluster Computing

2015; 18(3):1041–1062.4. Kerr A, Diamos G, Yalamanchili S. A characterization and analysis of gpgpu kernels 2009; .5. Randles M, Lamb D, Taleb-Bendiab A. A comparative study into distributed load balancing algorithms for

cloud computing. Advanced Information Networking and Applications Workshops (WAINA), 2010 IEEE 24thInternational Conference on, IEEE, 2010; 551–556.

6. Kansal NJ, Chana I. Cloud load balancing techniques: A step towards green computing. IJCSI International Journalof Computer Science Issues 2012; 9(1):238–246.

7. Coffman Jr EG, Garey MR, Johnson DS. Approximation algorithms for bin packing: a survey. Approximationalgorithms for NP-hard problems, PWS Publishing Co., 1996; 46–93.

8. Vmware distributed resource scheduling 2015. URL http://www.vmware.com/au/products/vsphere/features/drs-dpm.

9. Singh A, Korupolu M, Mohapatra D. Server-storage virtualization: integration and load balancing in data centers.Proceedings of the 2008 ACM/IEEE conference on Supercomputing, IEEE Press, 2008; 53.

10. Clark C, Fraser K, Hand S, Hansen JG, Jul E, Limpach C, Pratt I, Warfield A. Live migration of virtual machines.Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation-Volume 2,USENIX Association, 2005; 273–286.

11. Ye K, Jiang X, Huang D, Chen J, Wang B. Live migration of multiple virtual machines with resource reservationin cloud computing environments. Cloud Computing (CLOUD), 2011 IEEE International Conference on, IEEE,2011; 267–274.

12. Voorsluys W, Broberg J, Venugopal S, Buyya R. Cost of virtual machine live migration in clouds: A performanceevaluation. IEEE International Conference on Cloud Computing, Springer, 2009; 254–265.

13. Armbrust M, Fox A, Griffith R, Joseph AD, Katz RH, Konwinski A, Lee G, Patterson DA, Rabkin A, Stoica I,et al.. Above the clouds: A berkeley view of cloud computing 2009; .

14. Zhao L, Sakr S, Liu A, Bouguettaya A. Cloud Data Management. Springer, 2014.15. Hu J, Gu J, Sun G, Zhao T. A scheduling strategy on load balancing of virtual machine resources in cloud computing

environment. 2010 3rd International symposium on parallel architectures, algorithms and programming, IEEE,2010; 89–96.

16. Wen WT, Wang CD, Wu DS, Xie YY. An aco-based scheduling strategy on load balancing in cloud computingenvironment. 2015 Ninth International Conference on Frontier of Computer Science and Technology, IEEE, 2015;364–369.

(2016)

http://www.vmware.com/au/products/vsphere/features/drs-dpm

http://www.vmware.com/au/products/vsphere/features/drs-dpm

20 M.XU ET AL.

17. Roos G. Enterprise prefer private ccloud: Survey 2013. URL http://www.eweek.com/cloud/enterprises-prefer-private-clouds-survey/.

18. Li A, Yang X, Kandula S, Zhang M. Cloudcmp: comparing public cloud providers. Proceedings of the 10th ACMSIGCOMM conference on Internet measurement, ACM, 2010; 1–14.

19. Zhang Q, Cheng L, Boutaba R. Cloud computing: state-of-the-art and research challenges. Journal of internetservices and applications 2010; 1(1):7–18.

20. Petcu D. Multi-cloud: expectations and current approaches. Proceedings of the 2013 international workshop onMulti-cloud applications and federated clouds, ACM, 2013; 1–6.

21. Red hat: Red hat enterprise virtualization 3.2technical reference guide 22015. URL https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.2/html/Technical_Reference_Guide/index.html.

22. Song X, Ma Y, Teng D. A load balancing scheme using federate migration based on virtual machines for cloudsimulations. Mathematical Problems in Engineering 2015; 2015.

23. Ni J, Huang Y, Luan Z, Zhang J, Qian D. Virtual machine mapping policy based on load balancing in private cloudenvironment. Cloud and Service Computing (CSC), 2011 International Conference on, IEEE, 2011; 292–295.

24. Tordsson J, Montero RS, Moreno-Vozmediano R, Llorente IM. Cloud brokering mechanisms for optimizedplacement of virtual machines across multiple providers. Future Generation Computer Systems 2012; 28(2):358–367.

25. Tian W, Zhao Y, Zhong Y, Xu M, Jing C. A dynamic and integrated load-balancing scheduling algorithm for clouddatacenters. 2011 IEEE International Conference on Cloud Computing and Intelligence Systems, IEEE, 2011; 311–315.

26. Xu M, Tian W. An online load balancing scheduling algorithm for cloud data centers considering real-time multi-dimensional resource. 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems,vol. 1, IEEE, 2012; 264–268.

27. Thiruvenkadam T, Kamalakkannan P. Energy efficient multi dimensional host load aware algorithm for virtualmachine placement and optimization in cloud environment. Indian Journal of Science and Technology 2015; 8(17).

28. Christodoulopoulos K, Sourlas V, Mpakolas I, Varvarigos E. A comparison of centralized and distributed meta-scheduling architectures for computation and communication tasks in grid networks. Computer Communications2009; 32(7):1172–1184.

29. Cho KM, Tsai PW, Tsai CW, Yang CS. A hybrid meta-heuristic algorithm for vm scheduling with load balancingin cloud computing. Neural Computing and Applications 2015; 26(6):1297–1309.

30. Tian W, Xu M, Chen Y, Zhao Y. Prepartition: A new paradigm for the load balance of virtual machine reservationsin data centers. 2014 IEEE International Conference on Communications (ICC), IEEE, 2014; 4017–4022.

31. Talbi EG. Metaheuristics: from design to implementation, vol. 74. John Wiley & Sons, 2009.32. Rouzaud-Cornabas J. A distributed and collaborative dynamic load balancer for virtual machine. European

Conference on Parallel Processing, Springer, 2010; 641–648.33. Yang X, Ma ZT, Sun L. Performance vector-based algorithm for virtual machine deployment in infrastructure

clouds. Jisuanji Yingyong/ Journal of Computer Applications 2012; 32(1):16–19.34. Yang K, Gu J, Zhao T, Sun G. An optimized control strategy for load balancing based on live migration of virtual

machine. 2011 Sixth Annual ChinaGrid Conference, IEEE, 2011; 141–146.35. Wang C, Zhou-Yi Z, Mao XG, Lin SM. A quadratic equilibrium entropy based virtual machine load balance

evaluation algorithm 2015; .36. Bhadani A, Chaudhary S. Performance evaluation of web servers using central load balancing policy over virtual

machines on cloud. Proceedings of the Third Annual ACM Bangalore Conference, ACM, 2010; 16.37. Zhao Y, Huang W. Adaptive distributed load balancing algorithm based on live migration of virtual machines in

cloud. INC, IMS and IDC, 2009. NCM’09. Fifth International Joint Conference on, IEEE, 2009; 170–175.38. Zhou W, Yang S, Fang J, Niu X, Song H. Vmctune: A load balancing scheme for virtual machine cluster using

dynamic resource allocation. 2010 Ninth International Conference on Grid and Cloud Computing, IEEE, 2010;81–86.

39. OpenNebula. Opennebula home page. URL http://www.opennebula.org/.40. Elastichosts. URL https://www.elastichosts.com/.41. Amazon ec2. URL https://aws.amazon.com/ec2/.42. Tian W, Zhao Y, Xu M, Zhong Y, Sun X. A toolkit for modeling and simulation of real-time virtual machine

allocation in a cloud data center. IEEE Transactions on Automation Science and Engineering 2015; 12(1):153–161.43. Tian W, Xu M, Chen A, Li G, Wang X, Chen Y. Open-source simulators for cloud computing: Comparative study

and challenging issues. Simulation Modelling Practice and Theory 2015; 58:239–254.44. Calheiros RN, Ranjan R, Beloglazov A, De Rose CA, Buyya R. Cloudsim: a toolkit for modeling and simulation

of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice andExperience 2011; 41(1):23–50.

45. Xu M, Li G, Yang W, Tian W. Flexcloud: A flexible and extendible simulator for performance evaluation of virtualmachine allocation. 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), IEEE,2015; 649–655.

46. Gersch W, Brotherton T. Ar model prediction of time series with trends and seasonalities: A contrast withbox-jenkins modeling. Decision and Control including the Symposium on Adaptive Processes, 1980 19th IEEEConference on, IEEE, 1980; 988–990.

47. Graham RL. Bounds on multiprocessing timing anomalies. SIAM journal on Applied Mathematics 1969; 17(2):416–429.

(2016)

http://www.eweek.com/cloud/enterprises-prefer-private-clouds-survey/

http://www.eweek.com/cloud/enterprises-prefer-private-clouds-survey/

https://access.redhat.com/site /documentation/en-US/Red_Hat_Enterprise_Virtualization/ 3.2/html/Technical_Reference_Guide/index.html



http://www.opennebula.org/

https://www.elastichosts.com/

https://aws.amazon.com/ec2/

a survey on load balancing algorithms for virtual machines ... · 2016; 00:1–20 a survey on load...

Documents