UNIVERSIDAD DE JAÉN Escuela Politécnica Superior de Linares
Master Thesis
______
ENERGY OPTIMIZATION IN CLOUD
COMPUTING SYSTEMS
Student: Iván Tomás Cotes Ruiz
Supervisors: Dr. Rocío Pérez de Prado
Dr. Sebastián García Galán
Department: Telecommunication Engineering Department
June, 2016
Contents
Index of tables ....................................................................................................................... 3
Index of figures ...................................................................................................................... 4
Background (summary) ......................................................................................................... 5
Objectives (summary) ............................................................................................................ 7
Conclusions (summary) ......................................................................................................... 8
1. Background .................................................................................................................. 10
2. Objectives .................................................................................................................... 11
3. Methodology ................................................................................................................ 12
3.1. State of the art ....................................................................................................... 12
3.1.1. Power model .................................................................................................. 12
3.1.2. Dynamic Voltage and Frequency Scaling (DVFS) ....................................... 13
3.1.3. Fuzzy Logic ................................................................................................... 15
3.1.4. Cloud computing types .................................................................................. 23
3.1.5. Power saving techniques in Datacenters ....................................................... 27
3.2. First stage: simulation environment ...................................................................... 32
3.2.1. CloudSim ....................................................................................................... 33
3.2.2. CloudSim with DVFS .................................................................................... 35
3.2.3. WorkflowSim ................................................................................................ 36
3.2.4. Merged simulator ........................................................................................... 43
3.2.5. Changes with WorkflowSim ........................................................................... 44
3.2.6. Changes with CloudSim ................................................................................ 45
3.2.7. Modifications and additions to achieve the proposed joint simulator ........... 46
3.2.8. Additional notes to the power model ............................................................. 48
3.3. Second stage: scheduling algorithms .................................................................... 50
3.3.1. Power aware scheduling ................................................................................ 50
3.3.2. VM scheduling .............................................................................................. 51
3.3.3. Tasks scheduling ........................................................................................... 53
3.3.4. Bag-of-tasks power aware scheduling ........................................................... 54
3.3.5. Classic schedulers adapted to power ............................................................. 55
3.3.6. Watts per MIPS scheduler for VMs .............................................................. 56
3.3.7. Fuzzy integration in WorkflowSimDVFS ..................................................... 58
3.3.8. VM scheduling FRBS .................................................................................... 61
3.3.9. Tasks scheduler FRBS ................................................................................... 62
3.3.10. Power model analytical .............................................................................. 63
3.3.11. Integration with Matlab ............................................................................. 65
4. Results and discussion ................................................................................................. 67
4.1. DVFS results ......................................................................................................... 67
4.1.1. DVFS savings ................................................................................................ 67
4.1.2. DVFS parameters evolution .......................................................................... 75
4.2. FRBS results ......................................................................................................... 78
4.2.1. FRBS simulation scenario ............................................................................. 78
4.2.2. Rules generated for both FRBS schedulers ................................................... 82
4.2.3. FRBS savings ................................................................................................ 84
5. Conclusions ................................................................................................................. 91
Bibliography ........................................................................................................................ 92
Index of tables
Table 1: basic temperature levels ........................................................................................ 16
Table 2: membership functions of air conditioner FRBS .................................................... 18
Table 3: car FRBS example ................................................................................................. 20
Table 4: CloudSim’s basic example .................................................................................... 34
Table 5: WorkflowSim communication messages between entities ................................... 38
Table 6: WorkflowSim tags meaning .................................................................................. 39
Table 7: Frequency multipliers and MIPS ........................................................................... 48
Table 8: Energy estimation example ................................................................................... 54
Table 9: Time summary (s) ................................................................................................. 68
Table 10: Overall power summary (W) ............................................................................... 70
Table 11: Avg power summary (W) .................................................................................... 72
Table 12: Energy summary (Wh) ........................................................................................ 73
Table 13: physical hosts’ configuration input parameters ................................................... 79
Table 14: physical host’s configuration calculated parameters ........................................... 81
Table 15: VM MIPS in FRBS scenario ............................................................................... 82
Table 16: basic experiments ................................................................................................ 85
Table 17: experiments with fuzzy task scheduler ................................................................ 86
Table 18: experiments with fuzzy VM scheduler ................................................................ 88
Table 19: experiments with both fuzzy schedulers ............................................................. 89
Index of figures
Figure 1: fuzzy temperature levels ...................................................................................... 17
Figure 2: FRBS objects ....................................................................................................... 20
Figure 3: distance membership functions ............................................................................ 21
Figure 4: speed membership functions ................................................................................ 21
Figure 5: acceleration membership functions ...................................................................... 21
Figure 6: car FRBS test {0.8, 0.2}....................................................................................... 22
Figure 7: CloudSim’s basic example ................................................................................... 34
Figure 8: Montage 25 DAG ................................................................................................. 36
Figure 9: WorkflowSim initialization stage ........................................................................ 42
Figure 10: WorkflowSim main stage................................................................................... 42
Figure 11: WorkflowSim ending stage ................................................................................ 43
Figure 12: Time summary (s) .............................................................................................. 69
Figure 13: Time savings (%) ............................................................................................... 69
Figure 14: Overall power summary (W) ............................................................................. 70
Figure 15: Overall power savings (%) ................................................................................. 71
Figure 16: Avg power summary (W) .................................................................................. 72
Figure 17: Avg power savings (%) ...................................................................................... 72
Figure 18: Energy summary (Wh) ....................................................................................... 73
Figure 19: Energy savings (%) ............................................................................................ 74
Figure 20: Utilization evolution .......................................................................................... 76
Figure 21: Multiplier evolution ........................................................................................... 76
Figure 22: Power evolution ................................................................................................. 77
Figure 23: VmWattsPerMipsMetric result values ............................................................... 85
Figure 24: VmWattsPerMipsMetric result savings ............................................................. 86
Figure 25: VmWattsPerMipsMetric result values (2) ......................................................... 87
Figure 26: VmWattsPerMipsMetric result savings (2)........................................................ 87
Figure 27: VmsFuzzy result values ..................................................................................... 88
Figure 28: VmsFuzzy result savings ................................................................................... 89
Figure 29: VmsFuzzy result values (2) ............................................................................... 90
Figure 30: VmsFuzzy result savings (2) .............................................................................. 90
Background (summary)
The Cloud Computing industry has been growing continuously in recent years. More and more companies offer storage or processing services in the cloud. This gives users the ability to access those services from anywhere with just an Internet connection, making tasks such as backing up important data much simpler, as well as providing access to greater processing capacity for executing tasks.
However, the growth of datacenters, which contain ever more servers with ever more resources, is increasing their energy consumption and, therefore, global energy consumption. In addition, the cooling systems needed to avoid high temperatures in datacenters are numerous and consume a large amount of power.
Techniques that reduce power consumption also achieve a reduction in costs, as well as an increase in system reliability, since higher energy consumption generates higher temperatures, and the higher the system temperature, the higher the resulting error rates. As stated in [ 27 ], for every 10 °C increase in a system's temperature, the failure rate doubles. An urgent solution is required.
As mentioned in several papers [ 1 ][ 2 ][ 3 ][ 4 ], energy consumption due to servers amounted to 0.5% of global consumption in 2008, and that consumption is expected to quadruple by 2020. Furthermore, the consumption of a supercomputer is estimated to be equivalent to the combined consumption of 22,000 households in the United States.
There is a large body of previous work related to this project. In [ 43 ], M. Seddiki et al. introduce a fuzzy rule-based power-aware scheduler for virtual machines in the CloudSim and RealCloudSim simulators, but they consider neither task scheduling nor the use of workflows.
In [ 44 ], García-Galán et al. propose a new strategy called KARP as an alternative to the Michigan approach. KARP is based on PSO, whereas Michigan is based on Genetic Algorithms. They apply this new strategy to Grid Computing, obtaining better results than with the genetic alternative. In [ 45 ] they compare different knowledge-acquisition strategies in fuzzy systems, matching KASIA against the Pittsburgh approach and KARP against the Michigan approach, and applying these techniques to Grid Computing. They show how KASIA and KARP, both based on PSO, achieve better results than their genetic alternatives. The KASIA strategy is introduced in [ 46 ], where the advantages of the PSO algorithm over genetic algorithms are shown. They also work with genetic-algorithm-based meta-schedulers in [ 47 ].
Objectives (summary)
The main goal of this project is to reduce the power and energy consumption of datacenters. Every algorithm implemented needs to be tested. To avoid the need to own a datacenter, this project is based on simulations. The choice of simulator is important, since the results obtained should be as close as possible to those of a real datacenter. If the results obtained in a simulator do not correspond to those that would be obtained in a real scenario, the analysis and conclusions are meaningless, and we cannot claim that the developed algorithms save energy in real datacenters.
With this in mind, the first stage of this project consists of building a good simulation scenario as the basis for the second stage, the design of the algorithms that reduce energy consumption. In this way, the results obtained at the end of the process will rest on a sound simulation environment. It must always be kept in mind that the simulation results will not be perfect. Real processing in a real datacenter involves many parameters, and not all of them are considered by simulators. However, simulators are matching the behavior of real datacenters ever more closely and offer researchers a great tool for testing different kinds of algorithms to improve datacenter performance.
There are several power-saving techniques for datacenters; section 3.1.5 presents some of the most important ones. The methods used in this project are based on a combination of the DVFS algorithm and the development of two rule-based expert systems that provide power-aware schedulers for both virtual machine and task scheduling. The DVFS algorithm is already implemented in the simulator chosen as the basis of the project; this document describes its use and analyzes the power and energy savings achieved with it. The two expert systems developed are based on a single-fitness optimization, in which energy is the fitness considered. Throughout this document, the parameters considered in each expert system are explained in detail and the results obtained are analyzed.
Conclusions (summary)
Section 4 of this document shows the results obtained in the two stages of the project. First, experiments are run with the simulation environment obtained and the DVFS algorithm. Section 4.1.1 shows these results, and section 4.1.2 analyzes the evolution of three parameters of the algorithm in the simulator. Table 12 and Figures 18 and 19 show that the best energy-consumption results are achieved by the dynamic governors, as expected. The results obtained for processing time and consumed power are also analyzed, showing how the minimum-power governor introduces delays in the execution, increasing the total processing time and raising the energy consumption.
Likewise, 14 experiments are run for the second stage of the project, the addition of two fuzzy rule-based expert-system schedulers. Section 4.2.1 describes the simulation scenario, showing the characteristics of the virtual and physical machines. Section 4.2.2 shows the results obtained in the 14 experiments, comparing the aforementioned schedulers with other non-rule-based algorithms. The best energy-consumption results are due to the joint use of both the rule-based virtual machine scheduler and the rule-based task scheduler, achieving a total saving of 7.23% as shown in Table 19 and Figures 29 and 30. The results for execution time and power consumption are also analyzed, but since no multi-objective optimization is performed, the time results are not optimized by the expert-system schedulers.
After running and analyzing all the experiments, we can conclude that the developed algorithms achieve a lower energy consumption, which is what we set out to achieve. Although these algorithms and results are based on experiments run in a simulator, and we cannot claim that the results obtained are exactly what would have been obtained by running these algorithms on a real system, we can state that these results are close to those of a real scenario, thanks to the features implemented in the simulator. Not including the DVFS algorithm would not make the fuzzy-logic schedulers save more or less energy, but the results obtained by including it are closer to those of a real scenario, where the processors inside the servers run this algorithm natively.
Additionally, the inclusion of Directed Acyclic Graphs (DAGs) allows us to reaffirm the authenticity of the results and the reduction of the energy consumed with these fuzzy-logic schedulers, since the order in which tasks are executed in these experiments follows a realistic dependency pattern, and their lengths are not randomly generated.
Future developments of this project will include a multi-objective optimization, using both time and energy as fitness parameters in these algorithms. As expected, the results in Table 16 show a good energy reduction, but no large reduction in processing time can be achieved. This is because all the fuzzy-logic and Pittsburgh-based algorithms use energy as the only fitness, not taking execution time into account. The runs that obtain a lower execution time also obtain a higher energy consumption. A multi-objective optimization would strike a balance between these two parameters, achieving values that satisfy both the low-execution-time and low-energy-consumption requirements of Cloud Computing systems.
As for the proposed simulator, being open source it offers researchers around the world a great opportunity to test different algorithms and check the results in time, power and energy, knowing that it can run real traces, reproducing workloads from real scenarios. This can save investment in the early stages of research, when new algorithms are to be tested. Once an algorithm obtains good results in the simulator, it can be tested in a real scenario to confirm its behavior. In any case, this open-source simulator can keep growing at the hands of any researcher who wishes to contribute to the project.
1. Background
The Cloud Computing industry has been growing continuously in recent years. More and more enterprises offer processing or data storage services in their datacenters. This lets users access those services wherever they are, making it much easier to back up and access their important data, as well as to obtain additional processing resources for their tasks.
However, the growth of datacenters, containing more clusters and servers with more resources, is increasing their energy consumption and, hence, the global amount of energy consumed. In addition, the cooling systems needed to avoid high temperatures in datacenters are numerous and consume a large amount of power as well.
Techniques that reduce power consumption also achieve a reduction in operational costs, as well as an increase in system reliability, since higher power means higher temperature and high temperatures lead to higher failure rates. As stated in [ 27 ], for every 10 °C that a system's temperature increases, the failure rate doubles. An urgent solution is needed.
As mentioned in several papers [ 1 ][ 2 ][ 3 ][ 4 ], the energy consumption due to datacenters amounted to 0.5% of global consumption in 2008, and it is expected to keep rising until it quadruples by 2020. Additionally, estimations show that the energy consumed by a supercomputer is equivalent to the energy consumed by 22,000 households in the US.
There is a wide range of previous work related to this project. In [ 43 ], M. Seddiki et al. introduce a power-aware FRBS scheduler for VMs in CloudSim and RealCloudSim, but they consider neither task scheduling nor the use of workflows.
In [ 44 ], García-Galán et al. propose a new strategy named KARP as an alternative to the Michigan approach. KARP is based on PSO, while the Michigan approach is based on Genetic Algorithms. They apply this new strategy to Grid Computing, obtaining better results than with the genetic counterpart. In [ 45 ], they compare different knowledge-acquisition strategies in fuzzy systems, matching KASIA against the Pittsburgh approach and KARP against the Michigan approach, and applying these techniques to Grid Computing. They show how KASIA and KARP, both based on PSO, achieve better results than their genetic alternatives. The KASIA strategy is introduced in [ 46 ], where the advantages of the PSO algorithm over genetic ones are shown. They also cover fuzzy meta-schedulers based on Genetic Algorithms in [ 47 ].
2. Objectives
The main goal of this project is to reduce the power and energy consumption of datacenters. Every algorithm implemented needs to be tested. To avoid the need to own a datacenter, this project is based on simulations. The choice of simulator is important, since the results obtained should be as similar as possible to those of a real datacenter. If the results obtained in a simulator do not match a real scenario at all, the analysis and conclusions are pointless, and we cannot claim that the designed algorithms save energy in real datacenters.
With this in mind, the first stage of this project consists of providing a good simulation environment as the basis for the second stage, the design of the algorithms that reduce energy consumption. This way, the results obtained at the end of the process will rest on a fairly good simulation environment. Note that the simulation results will never be perfect: real processing in a real datacenter involves multiple parameters, and not all of them are considered by simulators. However, simulators are matching real datacenters ever more closely and provide researchers with a great tool for testing different types of algorithms to improve datacenter performance.
There are several power-saving techniques for datacenters; section 3.1.5 presents some of the most important ones. The methods we use are based on a combination of the Dynamic Voltage and Frequency Scaling (DVFS) algorithm and the development of two rule-based expert systems that provide power-aware schedulers in both the VM and the task scheduling scopes. The first algorithm is already implemented in the chosen base simulator; we describe its use and analyze the power and energy savings obtained with it. The two expert systems developed are based on a single fitness function, for which energy is considered. Throughout this document, we explain in detail the parameters considered in each expert system and analyze the results obtained.
3. Methodology
In this section, the overall process of this project is explained, from the starting point to the final results, in enough detail for the reader to reproduce the experiments. It begins with a state of the art, introducing several concepts needed to understand the rest of the project. After that, the overall process is explained in execution order, from the initial situation, describing the problems encountered and the approaches used to solve them, until the simulation environment is built. The same procedure is then followed for the second stage, describing step by step everything needed to obtain the energy-saving algorithms. Results and their analysis are presented in the next section.
3.1. State of the art
In this section we introduce some concepts that need to be understood in order to fully comprehend the project.
3.1.1. Power model
To know whether the techniques developed in this project achieve the intended reduction in energy consumption, we need to be able to measure the power consumed. Each task executed on a resource lasts a certain amount of time, during which that resource consumes a certain amount of power. The relation between both parameters allows us to estimate the energy consumed, following (1).
𝐸 = 𝑃 · 𝑡 ( 1 )
This relation is of great importance in this project, as the balance between power consumption and execution time limits the energy that can be saved. A machine that can execute a task in less time usually does so because it has more resources, which tend to consume more power. If the time saved is offset by the extra power consumed, no energy has been saved at all, so this relation is at the center of the entire project.
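To make this balance concrete, here is a small illustrative sketch. The MIPS ratings and power draws are invented numbers, not measurements from this project: they only show how a faster machine can save time yet spend more energy.

```python
# Illustrative sketch of the E = P * t trade-off (equation (1)).
# All host parameters below are hypothetical example values.

def energy_wh(power_w: float, time_s: float) -> float:
    """Energy in watt-hours from average power (W) and duration (s)."""
    return power_w * time_s / 3600.0

task_length_mi = 36_000                   # task length, millions of instructions
slow = {"mips": 1000, "power_w": 100}     # slow but frugal host (made up)
fast = {"mips": 2000, "power_w": 250}     # fast but power-hungry host (made up)

for host in (slow, fast):
    t = task_length_mi / host["mips"]     # execution time in seconds
    host["energy_wh"] = energy_wh(host["power_w"], t)

# The fast host halves the time but more than doubles the power,
# so for this task it consumes MORE energy overall:
print(slow["energy_wh"], fast["energy_wh"])  # 1.0 Wh vs 1.25 Wh
```

This is exactly the situation the text warns about: halving the time only saves energy if power less than doubles.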
Execution time is easily obtained as the ratio between the task's length and the performance of the machine. The power parameter is a more delicate issue. The widely used power model considers energy consumption as the sum of two parts: a dynamic and a static component [ 28 ].
𝐸 = 𝐸𝑑 + 𝐸𝑠 ( 2 )
The dynamic component depends directly on the processor, while the static component depends on memory, I/O devices and storage. The dynamic power can easily be modified, as will be seen in section 3.1.2, but the static part cannot. This is why the static component is normally expressed in relation to the dynamic component. The dynamic power is estimated as in (3) [ 29 ], where it is related to the square of the processor voltage, the frequency of the processor, its capacitance and a constant 𝑘𝑑.
𝑃𝑑𝑦𝑛𝑎𝑚𝑖𝑐 = 𝑘𝑑𝐶𝑉2𝑓 ( 3 )
The power considered in equation (1) is this dynamic component, the part that depends on the CPU parameters. So, combining (1) and (3), we get the dynamic energy.
𝐸𝑑 = 𝑃𝑑 · 𝑡 ( 4 )
The static component of the energy is often considered a fraction of the dynamic component (5) [ 30 ][ 31 ], where 𝑘𝑠 is a constant.
𝐸𝑠 = 𝑘𝑠𝐸𝑑 ( 5 )
With this consideration, and joining equations (2) to (5) [ 28 ]:
𝐸 = 𝐸𝑠 + 𝐸𝑑 = 𝑘𝑠𝐸𝑑 + 𝐸𝑑 = (1 + 𝑘𝑠)𝐸𝑑 ∝ 𝐸𝑑(𝑉, 𝑓) ( 6 )
This is why the variation of the global energy is considered proportional to the dynamic component, the one we can vary through the CPU's parameters, voltage and frequency, while always taking into account the balance expressed in (1). The technique introduced in the next section also depends on this balance. Always keep in mind that reducing a machine's performance reduces the power consumed, but increases the execution time of the tasks the machine executes.
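The model in equations (2)-(6) can be sketched in a few lines of code. The constants k_d, C and k_s below are arbitrary placeholders, not measured values from any real processor:

```python
# Sketch of the power model in equations (2)-(6); all constants are
# illustrative placeholders, not measured CPU parameters.

def dynamic_power(k_d: float, capacitance: float, voltage: float, freq_hz: float) -> float:
    """P_dynamic = k_d * C * V^2 * f, equation (3)."""
    return k_d * capacitance * voltage ** 2 * freq_hz

def total_energy(p_dynamic: float, time_s: float, k_s: float) -> float:
    """E = (1 + k_s) * E_d with E_d = P_d * t, equations (4)-(6)."""
    e_dynamic = p_dynamic * time_s
    return (1.0 + k_s) * e_dynamic

# Lowering voltage and frequency together cuts dynamic power superlinearly,
# because voltage enters squared:
p_high = dynamic_power(k_d=1.0, capacitance=1e-9, voltage=1.2, freq_hz=2.0e9)
p_low = dynamic_power(k_d=1.0, capacitance=1e-9, voltage=1.0, freq_hz=1.5e9)
print(p_high, p_low)  # about 2.88 W vs 1.5 W
```

Note how the voltage-squared term makes joint voltage/frequency reduction so effective, which is the basis of the DVFS technique introduced next.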
3.1.2. Dynamic Voltage and Frequency Scaling (DVFS)
Modern CPUs already include this technique, which dynamically varies the voltage and frequency of the CPU to adapt to the workload, measured as CPU utilization. Its main objective is to reduce energy consumption. Its behavior depends on the selected governor, of which there are two basic types: static and dynamic. While static governors use fixed values for voltage and frequency, dynamic governors are based on thresholds, so voltage and frequency are varied according to the current utilization with respect to the configured threshold. It is important to note that the available voltage and frequency levels depend on each processor. Those values are exposed as multipliers, so there is a discrete set of values.
In this project, we work with 5 different governors, introduced below along with their differences.
Static
Performance: fixes voltage and frequency at the values that achieve maximum performance. This is the default configuration, i.e., as if the DVFS algorithm were not active. It always achieves the lowest execution time on the machine, but incurs the highest power consumption.
PowerSave: the opposite of the Performance governor. It fixes the lowest voltage and frequency values, achieving the lowest possible power consumption, but lengthening the execution and delaying the end time.
UserSpace: allows the user to select the desired voltage and frequency values. In this project, few tests have been performed with this governor, due to its dependency on the processor's available multiplier values and on the user's current selection.
Dynamic
OnDemand: compares the current CPU utilization with a preselected threshold. In the simulator used, this threshold is configurable; its default value is 95%.
Conservative: differs from the previous governor in considering two thresholds instead of one, called up_threshold and down_threshold. Their default values in the simulator are 80% and 20% respectively.
The behavior of the dynamic governors makes it possible to increase performance when needed, if CPU utilization exceeds the up_threshold, and to lower it to save power when utilization falls below the down_threshold. Note that in OnDemand both thresholds are the same value. To avoid constant variation of the multipliers, an iterator is introduced: when utilization falls below the threshold in OnDemand, the iterator counts a number of steps, and if the utilization is still under the threshold after it reaches a certain value, the multiplier is lowered one step.
A more detailed comparison of these governors is shown in the results, where we
indicate which of them succeed in reducing the energy consumption and which problems
arise with some of them.
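The threshold-and-iterator behavior described above can be sketched as follows. This is an illustrative model of an OnDemand-style governor, not the actual simulator or Linux kernel code; the class name, the multiplier values and the `patience` parameter are assumptions made for the example:

```python
# Illustrative OnDemand-style governor sketch (assumed names and values):
# the multiplier is raised to the maximum as soon as utilization exceeds
# the threshold, and lowered one step only after utilization has stayed
# below the threshold for `patience` consecutive samples.

class OnDemandGovernor:
    def __init__(self, levels, threshold=0.95, patience=3):
        self.levels = levels          # available multipliers, ascending
        self.idx = len(levels) - 1    # start at maximum performance
        self.threshold = threshold
        self.patience = patience
        self.counter = 0              # samples spent below the threshold

    def step(self, utilization):
        if utilization > self.threshold:
            self.idx = len(self.levels) - 1   # jump to maximum when busy
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience and self.idx > 0:
                self.idx -= 1                  # lower one step, then wait again
                self.counter = 0
        return self.levels[self.idx]
```

With `OnDemandGovernor([6, 8, 10, 12])`, a burst of high utilization immediately returns the top multiplier, while a sustained idle period lowers it one step every `patience` samples, mirroring the anti-oscillation iterator described in the text.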
The general objective of this algorithm is to reduce both voltage and frequency
when the utilization is low. While utilization stays below 100% there is no further delay in
the execution's end time. On the other hand, if utilization exceeds the threshold or even
reaches 100%, voltage and frequency are switched up until they reach their maximum, so
as to use the full performance of the CPU if the execution of those tasks needs it.
Now consider 𝑃𝑖𝑑𝑙𝑒 as the amount of power consumed when the CPU utilization is
0, and 𝑃𝑏𝑢𝑠𝑦 the power consumed when the utilization is 100%. This technique is based on
knowing that, even though 𝑃𝑖𝑑𝑙𝑒 and 𝑃𝑏𝑢𝑠𝑦 correspond to utilizations 𝑢 = 0 and 𝑢 = 1
respectively, power depends on (3), meaning that as this technique decreases both voltage
and frequency, power consumption is reduced. A CPU can work at a certain number of
different frequencies depending on the multiplier, and voltage is scaled with the frequency:
lower frequencies allow lower voltages to be selected on the CPU, as both parameters are
interrelated. The basic power model used in the simulator assumes that the 𝑃𝑖𝑑𝑙𝑒 and 𝑃𝑏𝑢𝑠𝑦
values were obtained experimentally on a modern CPU that supports DVFS, meaning that
it is possible to measure both parameters at different values of the multiplier, with different
frequencies and voltages, so that an array of values is obtained for each of them.
Using DVFS in the simulator we can estimate the power and energy consumption
based on a real CPU's behavior, as modern processors implement this technique. From the
𝑃𝑖𝑑𝑙𝑒 and 𝑃𝑏𝑢𝑠𝑦 values and the utilization level of the CPU, we can estimate the power
consumed by the CPU of a computer as a linear function of the CPU's utilization,
𝑃(𝑢) = 𝑃𝑖𝑑𝑙𝑒 + (𝑃𝑏𝑢𝑠𝑦 − 𝑃𝑖𝑑𝑙𝑒)𝑢 ( 7 )
while this utilization u is also used to check the thresholds of the selected governor and to
scale frequency and voltage so as to minimize the overall energy consumption.
This way, we can estimate power and energy consumption in a simulator while also
considering the savings introduced by the DVFS algorithm. This gives us a simulation
environment closer to the behavior of a real processor.
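As an illustration of how equation ( 7 ) turns the measured arrays into a power estimate, the sketch below assumes one (𝑃𝑖𝑑𝑙𝑒, 𝑃𝑏𝑢𝑠𝑦) pair per multiplier level, as described above; the wattage values are made-up assumptions, not measurements:

```python
# Linear power model of Eq. (7), sketched per multiplier level.
# Wattage values are illustrative assumptions, one entry per level.
P_IDLE = [82.7, 88.8, 94.9, 105.0]
P_BUSY = [130.0, 140.0, 153.0, 173.0]

def power(level, u):
    """Estimated power draw (W) at a frequency level and utilization u in [0, 1]."""
    return P_IDLE[level] + (P_BUSY[level] - P_IDLE[level]) * u
```

For example, at the second level and 50% utilization this interpolates halfway between the idle and busy measurements of that level.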
3.1.3. Fuzzy Logic
This type of logic is commonly used in Fuzzy Rule Based Systems (FRBS). Its main
concept is that the truth value of a variable can be a real number between 0 and 1, instead
of just true or false. The range of values covered by each level of each variable is specified
by its membership functions. This type of logic makes systems work closer to the way
humans evaluate the state of a parameter.
A basic example would be measuring the temperature of water. Classic logic based
on true/false variables only yields discrete categories for the input parameters, while fuzzy
logic allows defining a combination of the different membership functions based on the
degree of membership in each of them. In the case of water temperature, we could define
different temperature levels based on the value in ºC. For example, defining 5 membership
functions we could have the following values:
Level       Temperature
Very cold   <10 ºC
Cold        >10 ºC AND <20 ºC
Warm        >20 ºC AND <30 ºC
Hot         >30 ºC AND <40 ºC
Very hot    >40 ºC
Table 1: basic temperature levels
Using classic logic based on IF ELSE statements, a temperature of 34 ºC would be
classified as HOT according to Table 1. However, every temperature between 31 ºC and
39 ºC would likewise be classified as HOT, implying that all these values represent the
same kind of temperature. This is not how we humans think, as everyone would agree that
36 ºC is hotter than 32 ºC. This cannot be expressed with traditional logic, which is why
the concept of fuzzy logic was created.
With this new type of logic, we can define in a system different ranges of
temperature values for each specified level. This way, a value of 34 ºC belongs to both the
WARM and HOT levels, as it is an intermediate point between them. The width of each
membership function is chosen by the user who configures them. Figure 1 below shows a
possible configuration for these membership functions. A width of 20 has been set for each
of them, allowing each temperature to belong to two membership functions.
There are several ways of defining the membership functions in a FRBS. Typical
methods include triangular, pyramidal and Gaussian functions, among others. This
temperature example uses triangular functions. Some systems such as jFuzzyLogic describe
triangular and pyramidal membership functions as a set of points, while other authors, as in
[ 5 ], describe pyramidal functions as two points and two slopes. Matlab makes Gaussian
functions easy to use. Figure 1 has been generated with jFuzzyLogic and, as a simple
example, we show the code to build this antecedent.
FUZZIFY temperature
    TERM very_cold := (0, 1) (5, 1) (15, 0);
    TERM cold := (5, 0) (15, 1) (25, 0);
    TERM warm := (15, 0) (25, 1) (35, 0);
    TERM hot := (25, 0) (35, 1) (45, 0);
    TERM very_hot := (35, 0) (45, 1) (50, 1);
END_FUZZIFY
Figure 1: fuzzy temperature levels
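The membership grades drawn in Figure 1 can be reproduced with a small sketch. The interior terms are triangles and the outer terms have flat shoulders, mirroring the TERM definitions above; this is an illustrative re-implementation, not jFuzzyLogic itself:

```python
# Membership grades for the Figure 1 antecedent (illustrative sketch).

def triangle(x, a, b, c):
    """Rises linearly from a to the peak b, falls back to zero at c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def memberships(t):
    return {
        # very_cold and very_hot keep a flat shoulder at the extremes
        "very_cold": 1.0 if t <= 5 else max(0.0, (15 - t) / 10),
        "cold": triangle(t, 5, 15, 25),
        "warm": triangle(t, 15, 25, 35),
        "hot": triangle(t, 25, 35, 45),
        "very_hot": 1.0 if t >= 45 else max(0.0, (t - 35) / 10),
    }
```

For 34 ºC this yields warm = 0.1 and hot = 0.9: the value belongs mostly to HOT but partly to WARM, exactly the overlapping behavior discussed above.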
Normally in a FRBS, the input value range is not specified as in Figure 1, with
temperature values between 0 and 50 ºC; instead, inputs are normally normalized between
0 and 1. The input value then needs to be normalized by the maximum possible input
temperature, which in this case is 50 ºC. This is done to allow a single FRBS to work in
different systems. Imagine that we have two systems that measure temperature. If one
system considers temperatures between 0 and 50 ºC, and the other one takes values
between 0 and 100 ºC, we can do two different things. As the temperature ranges are
different, we can simply build two FRBS. This, however, is like having the same engine
twice, which is not really efficient. The other option is building just one FRBS over the
range 0 to 1, and making both systems use it. The only requirement is that, prior to sending
the input parameter, system one normalizes the temperature by its maximum of 50 ºC, and
system two by 100 ºC. This way we can avoid having multiple FRBS, provided that both
systems consider the same number and type of antecedents.
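The normalization step amounts to a single division, as in this minimal sketch; the function name is an assumption:

```python
# Each system divides its reading by its own maximum before sending it
# to the shared FRBS, so one engine serves both ranges.

def normalize(value, maximum):
    return value / maximum

x1 = normalize(34, 50)    # system one (0-50 ºC)
x2 = normalize(34, 100)   # system two (0-100 ºC)
```

The same physical temperature of 34 ºC thus maps to different normalized inputs (0.68 and 0.34) of the single shared engine.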
This is the case of the antecedents, which is the name given to the input
parameters. But fuzzy logic involves other objects apart from the antecedents.
- Fuzzification:
This is the first stage. Input parameters are received here after they have been
normalized, and this engine is in charge of determining the degree of membership of the
normalized input parameter in each membership function. Following the temperature
example, a value of 27 ºC would belong to a high degree to the membership function
WARM, and would also belong, to a lower degree, to HOT. This engine determines the
degree of membership in each membership function. These membership grades are then
used by the inference engine to evaluate the rules.
- Rule base:
The rule base includes all the rules that are executed in the inference process.
Rules are defined as IF THEN clauses that take the degree of membership of some or all of
the antecedents in a specified membership function and use them to obtain a value for the
output consequent in a membership function. This is more easily understood with an
example. Following the FRBS specified in [ 5 ], we have a system with two antecedents,
the temperature and the variation of the temperature, and one consequent, the output
response of the air conditioner. Table 2 shows the membership functions of this system.
Antecedents                    Consequent
Temperature    Variation       Output
Cold           Slow            Very low
Cool           Moderate        Low
Mild           Fast            Medium
Warm                           High
Hot                            Very high
Table 2: membership functions of the air conditioner FRBS
A sample rule would be: if the measured temperature is too low, belonging to Cold,
and the variation of the temperature is high, belonging to Fast, meaning that the
temperature is dropping at a high rate, we would want the air conditioner to heat the room
fast, so the output response would be Very high. In fuzzy logic, this rule is expressed as
follows:
IF temperature IS cold AND variation IS fast THEN output IS very-high
This is an example of a possible rule. Keep in mind that there can be several rules
in the FRBS and all of them are evaluated each time the system processes the input
parameters to obtain an output value. The system therefore needs a set of rules that covers
the majority of the situations it can find when receiving the input parameters. If we have
rules for both low and high temperature values, the measured input temperature will
obviously be either high or low, but never both at the same time. What the fuzzifier does in
the first stage is obtaining the degree of membership of each input in each of its
membership functions. Most of these membership grades will be 0, so when evaluating a
set of rules, if the input temperature is very low and its degree of membership in the high
membership function is 0, the evaluation of the rules involving the high membership
function yields a null value for the output.
There are two types of rules: those whose antecedents are combined with AND
clauses and those that use OR clauses. The main difference between them is as follows:
while AND rules set the consequent value to the minimum value over all the antecedents,
OR rules set it to the maximum. In the rule shown above, if the degree of membership of
temperature in cold is 0.8 and that of variation in fast is 0.3, the degree to which output is
set to very-high will be 0.3, since AND takes the minimum. Changing the rule to OR with
the same antecedent values would yield an output grade of 0.8.
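The min/max rule activation reduces to two one-line functions, sketched here with the air conditioner grades used above (cold = 0.8, fast = 0.3):

```python
# AND rules take the minimum of the antecedent membership grades,
# OR rules take the maximum (illustrative sketch).

def and_rule(*grades):
    return min(grades)

def or_rule(*grades):
    return max(grades)

cold, fast = 0.8, 0.3
very_high_if_and = and_rule(cold, fast)   # activation under AND
very_high_if_or = or_rule(cold, fast)     # activation under OR
```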
- Inference engine:
While the rule base only stores the rules defined by the user, which are the same in
every evaluation, the input parameters change in each evaluation and the output value
varies accordingly. This evaluation of each rule with the current values of the antecedents
is performed in the inference engine, which receives from the fuzzifier the degree of
membership of each antecedent in its membership functions. After this evaluation, the
system has an output value for each rule.
- Defuzzification:
This is where all the rule output values are combined into a single normalized
output value. The typical way to merge them is the Center of Gravity (COG) method. Once
the COG has been computed, the defuzzifier obtains the normalized output value based on
the membership functions of the consequent. This normalized value then needs to be
denormalized in the corresponding system to obtain an appropriate output value from the
FRBS.
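The COG computation can be sketched as a weighted average. Real engines integrate over the clipped membership shapes of the consequent; the singleton simplification used here (each rule contributes its activation grade at the center of its output membership function) is an assumption made for brevity:

```python
# Center of Gravity over singleton consequents (simplified sketch):
# the defuzzified output is the grade-weighted average of the centers
# of the output membership functions fired by the rules.

def center_of_gravity(activations):
    """activations: list of (center, grade) pairs, one per fired rule."""
    num = sum(center * grade for center, grade in activations)
    den = sum(grade for _, grade in activations)
    return num / den if den else 0.0
```

For example, two fired rules with centers 0.5 and 1.0 and grades 0.4 and 0.6 give a COG of 0.8.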
Figure 2 shows a diagram of this set of objects used in a FRBS.
Figure 2: FRBS objects
To better understand all these concepts, here we show an example of a FRBS
consisting of a system that evaluates the distance and speed of a car at a given time and
returns the acceleration of the car. The system needs to move the car towards a wall set at a
certain distance and stop it at the wall. Table 3 shows the membership functions of the
antecedents and the consequent.
Antecedents                 Consequent
Distance    Speed           Acceleration
Close       Slow            Much-brake
Medium      Medium          Brake
Far         Fast            Keep-speed
                            Accelerate
                            Much-accelerate
Table 3: car FRBS example
In this example, both antecedents have a normalized range of [0 1], as neither
distance nor speed can be negative. A negative distance would mean that the wall has been
passed and a negative speed would mean going backwards, and both are undesired
situations. However, the output acceleration needs to be negative in those cases where the
car should brake and reduce its speed, so the normalized range of the consequent is [-1 1].
Figures 3, 4 and 5 show the membership functions listed in Table 3.
Figure 3: distance membership functions
Figure 4: speed membership functions
Figure 5: acceleration membership functions
To test this FRBS, the rules first need to be generated. According to the scenario,
the car should move fast when it is far from the wall, but gradually reduce its speed as it
gets closer to its destination. For testing purposes, 9 rules are generated that fit these
requirements:
1. IF distance IS close AND speed IS slow THEN acceleration IS keep-speed
2. IF distance IS close AND speed IS medium THEN acceleration IS brake
3. IF distance IS close AND speed IS fast THEN acceleration IS much-brake
4. IF distance IS medium AND speed IS slow THEN acceleration IS accelerate
5. IF distance IS medium AND speed IS medium THEN acceleration IS keep-speed
6. IF distance IS medium AND speed IS fast THEN acceleration IS brake
7. IF distance IS far AND speed IS slow THEN acceleration IS much-accelerate
8. IF distance IS far AND speed IS medium THEN acceleration IS accelerate
9. IF distance IS far AND speed IS fast THEN acceleration IS keep-speed
Now to see how these rules work, we will test the FRBS with input values of
{distance, speed} = {0.8, 0.2}.
Figure 6: car FRBS test {0.8, 0.2}
As can be seen, since these are AND rules, all those involving close distance or fast
speed get an output of 0. The COG is computed with the rest of the output values and, as
expected, the FRBS indicates to keep accelerating, as the distance is relatively far for the
speed at which the car is moving.
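The whole pipeline of this car example can be sketched end-to-end as follows. The membership breakpoints (evenly spaced triangles over [0, 1]) and the singleton output centers on [-1, 1] are assumptions, since the thesis defines them only graphically in Figures 3, 4 and 5; the exact number therefore differs from the jFuzzyLogic output of Figure 6, but the qualitative result for {0.8, 0.2} is the same, a positive acceleration:

```python
# End-to-end car FRBS sketch: fuzzify, 9 AND rules, singleton COG.
# Breakpoints and output centers are illustrative assumptions.

def tri(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(x):
    """Three evenly spaced terms over [0, 1]:
    low = close/slow, mid = medium, high = far/fast."""
    return {"low": tri(x, -0.5, 0.0, 0.5),
            "mid": tri(x, 0.0, 0.5, 1.0),
            "high": tri(x, 0.5, 1.0, 1.5)}

# Output singletons on [-1, 1] for the five acceleration terms.
CENTERS = {"much-brake": -1.0, "brake": -0.5, "keep-speed": 0.0,
           "accelerate": 0.5, "much-accelerate": 1.0}

# (distance term, speed term) -> acceleration term, rules 1-9.
RULES = {("low", "low"): "keep-speed",       ("low", "mid"): "brake",
         ("low", "high"): "much-brake",      ("mid", "low"): "accelerate",
         ("mid", "mid"): "keep-speed",       ("mid", "high"): "brake",
         ("high", "low"): "much-accelerate", ("high", "mid"): "accelerate",
         ("high", "high"): "keep-speed"}

def evaluate(distance, speed):
    d, s = fuzzify(distance), fuzzify(speed)
    num = den = 0.0
    for (dt, st), out in RULES.items():
        grade = min(d[dt], s[st])           # AND rule: minimum
        num += CENTERS[out] * grade
        den += grade
    return num / den if den else 0.0
```

With {distance, speed} = {0.8, 0.2}, only rules 4, 5, 7 and 8 fire (everything involving close or fast gets grade 0) and the defuzzified output is positive: keep accelerating. An input near the wall at high speed, such as {0.1, 0.9}, yields a negative output, i.e. braking.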
The output we get from a FRBS depends to a great extent on the rules configured
by the user. As it is difficult for a user to find the rule configuration that achieves the best
possible result, these FRBS are normally used together with another system that optimizes
the rules. This is the case of the Pittsburgh and KASIA approaches, among others. These
systems use meta-heuristic algorithms, such as the genetic algorithm and Particle Swarm
Optimization (PSO) respectively, to evaluate an initially randomly generated population of
members or particles and optimize them through combinations oriented towards the best
solutions found.
Here the concepts of exploration and exploitation come in. If the system is too
focused on exploring, the chances of reaching a good final value are low, as the system
does not converge towards the good solutions. On the other hand, if the system is too
focused on exploiting, there is a high chance of getting stuck in a local optimum in the case
of PSO, or of incurring elitism due to a high selective pressure. For these two reasons, a
correct balance between the two approaches needs to be found.
This way, using one of these systems we can find a set of rules that achieves a
better result than the rules we tried manually.
3.1.4. Cloud computing types
After years of research, the community has developed several different types of
Cloud Computing networks, each of them based on offering a different service to its users.
Thus, all the types are named after a service, using the nomenclature XaaS (X as a Service)
[ 23 ]. The most important among them are SaaS (Software as a Service), PaaS (Platform
as a Service), IaaS (Infrastructure as a Service) and HaaS (Hardware as a Service). There
are many different taxonomies from different enterprises, as each of them tries to define
Cloud Computing in its own way.
The mentioned types follow a layered architecture, with SaaS at the top level and
HaaS at the bottom. Their order expresses the level of privileges that the final user gets in
each of them. Here we describe what is done at each level.
- Application level
Contains the actual application that is offered to the users, which varies depending
on the enterprise. The use of this type of service brings several benefits from the point of
view of the final user. First, the final user does not need specialized knowledge to support
the system. This responsibility lies with the enterprise that offers the cloud computing
service, giving the final user the availability of the application at all times and ensuring
that the application works as intended.
- Platform level
If the final user needs more privileges in the cloud and prefers more flexibility, this
is the second layer of the stack. At this level, the final user is in charge of managing the
whole operating system instead of just making use of a built application. This gives the
flexibility of being able to work with the cloud server as if it were a local server, with the
corresponding responsibilities over the system and the applications.
The final user is given a platform with a certain performance, without being able to
decide on the VM characteristics or the physical host management. However, this may be
an advantage for some final users, as they do not need the knowledge required to manage
these VMs, which reduces the complexity of the system on their side. As in SaaS,
hardware failures remain the responsibility of the IT enterprises that offer the service,
which spares the final user from buying any hardware in case some of it stops working.
- Infrastructure level
In case the final user wants still more privileges and absolute control over a
physical host, this layer offers a whole server of the demanded characteristics, together
with the right to create VMs and decide how to divide the host's resources. Still, final users
are relieved of tasks such as security, backup and data partitioning. Three of the most
important virtualization technologies used at this layer of the cloud computing stack are
KVM [ 24 ], VMware [ 25 ] and XEN [ 26 ]. Virtualization technologies allow configuring
the whole performance of the physical host and dividing it into different VMs. These VMs
are configurable, making it possible to choose the performance capacity of each one of
them. This allows a dynamic resource management, with systems modifying the
capabilities of each VM to achieve the optimal working of the whole server and system.
Another great advantage for users of this layer is always being able to use the latest
technology. If the user prefers a local server in order to have full control over it, after some
time the hardware will become obsolete and another server will need to be bought.
However, using these IaaS solutions the final user always gets the performance being paid
for, without worrying about hardware. This allows users to compete at a much lower cost
than if they needed to regularly acquire new hardware.
- Hardware level
This level is normally managed by the IT enterprise that offers the cloud service to
the final users. In this layer, the physical resources of the Datacenter are managed,
including the physical servers, routers, switches, and the systems that provide power and
cooling. Operations such as hardware configuration, failure, power, cooling and traffic
management take place at this level.
In the case of Hardware as a Service (HaaS), hardware belonging to the IT
enterprise is installed at the client's site. An SLA indicates the responsibilities of each
party, and the client pays for the use of that hardware. This way, the final user is able to
manage the hardware, while leaving the replacement of broken parts to the IT owner.
Among the numerous advantages of this solution, some are shared with the IaaS
alternative. For example, with this strategy the final user does not need to worry about
hardware obsolescence: whenever the required level of performance increases, the IT
enterprise can replace the hardware with another with more resources, increasing the fee
the user pays to the owner of the hardware. The replaced hardware can then be used for
another client who does not need as much performance.
Additionally, in case of a hardware failure, the IT enterprise is in charge of
replacing it with new hardware, as this is one of its responsibilities. This way, the amount
of funds that final users need to invest each time a hardware upgrade is needed is much
lower than if they had to buy the hardware themselves.
Also, in case of failures, the Managed Services Provider (MSP) can do the
troubleshooting and help solve the problem. Maintenance can be delegated to them too,
making it much easier to keep the hardware working as intended.
Another advantage is scalability. If a client needs to increase the amount of
hardware in the Datacenter, the MSP can easily install additional hardware and increase
the fee. Although this is the obvious case, there is another possibility that is also important
to note. If the client had bought the hardware and the enterprise then shrinks, less hardware
is needed, but hardware already bought means money already invested. With HaaS
solutions, it is as simple as having the MSP take away some hardware and reduce the fee
paid.
The information above concerns the architecture of cloud computing. There is also
another classification of cloud computing, by deployment type.
- Public clouds
The services offered by the cloud are available to every user. These networks
normally do not require an initial investment, but lack control over the managed data.
Also, the security offered by their applications or storage services may not be enough for
some clients, who will then consider paying for a private cloud if their data is too
important to risk.
- Private clouds
These clouds are designed for the use of a single organization, which provides a
higher level of privacy than free public solutions. They offer the highest level of control
over reliability, performance and security, as they can be managed by the same
organization that pays for them. In the case of Datacenters run by the organization itself,
they may incur high costs and require large amounts of space to allocate the physical
machines within the organization. Funds also need to be invested in management and
maintenance, which increases costs further. In this case, the organization may even lose
the benefit of offloading management, partly defeating the concept that makes cloud
computing a great solution for many organizations.
- Hybrid clouds
Being a combination of the public and private alternatives, these networks try to
solve the problems encountered when running either of them alone. The infrastructure is
split into two parts, one in a public cloud and another in a private cloud, which yields
higher flexibility than running only one of them. This offers the possibility of storing the
organization's important data in the private part, while letting some applications run as a
SaaS solution in the public network.
They offer a higher level of control and data security than purely public solutions,
while still facilitating the scalability of the network by allowing the Datacenter capabilities
to be extended with external services in public clouds. Another advantage is meeting
temporary capacity needs, at those times when a high amount of resources is needed to
process a high amount of data. External resources can then be used to handle a peak of
burst data. This means that the client pays for those extra resources only when they are
needed, avoiding the need to own a huge private cloud to meet the users' SLA at all times
and avoid breaching their agreements. This way, the private Datacenter can be built to
support average workloads, letting external resources deal with additional traffic.
The difficult part of this mixed solution is deciding and optimizing the best
division of components between the two parts.
- Virtual private clouds
This is another solution to the problems found when running a solely public or
private cloud. Essentially, it consists of a platform running over a public cloud, with the
difference of using Virtual Private Networks (VPN) that allow defining custom topologies
and security settings.
3.1.5. Power saving techniques in Datacenters
There are many different techniques to achieve a reduction in the power consumed
by Datacenters. In this section we summarize some of the most important ones and
indicate the direction this project takes within the power saving scope.
1- DVFS
As shown before, DVFS [ 8 ][ 9 ] is a technique that, depending on the type of
governor used, adjusts both the frequency and voltage levels of a processor, adapting them
to the workload. This technique allows a higher performance at a higher power cost or, on
the other hand, achieves a power reduction at the cost of performance. The algorithm is
included in the Linux kernel, which makes it easy to run on servers. This technique is not a
substitute for other algorithms in the same scope, but an addition that further increases the
level of power saving in datacenters.
As the reduction in power consumption also incurs a reduction in performance, it is
not wise to set power to the minimum, as this would make it difficult to meet the
agreements set in the user's SLA. To achieve the maximum power saving, the frequency
and voltage levels need to be set to the lowest level that still fulfills the SLA, keeping both
performance and power at the minimum possible levels. In the past, when this technique
was not yet implemented, servers continuously kept both frequency and voltage at their
maximum, consuming at all times the maximum power and depending only on the
utilization. This meant an extremely high amount of power consumed in idle servers,
which is now considerably lower.
2- Power Capping and Shifting
This technique is similar to DVFS. It considers the power/performance balance, but
the algorithm works differently. First, power capping limits the maximum amount of
power that each server can consume. Then, power shifting adjusts the power of each
cluster considering the cap set in the first stage. The main difference between this
algorithm and DVFS is the network component: while DVFS adjusts the power levels
individually in each server, this technique allows setting the power levels across different
servers within a cluster. [ 6 ][ 7 ] show the effectiveness of this algorithm in the power
saving scope.
3- Server virtualization
Another typical technique to reduce power is virtualization, which allows running
several virtual machines on a single physical server. The benefits of this approach are
numerous.
First, the costs incurred by a single server are lower than those of running several
of them. Even for a server with a high amount of resources, able to run several virtual
machines, the hardware cost of this server is lower than that of several individual servers.
Another important consideration is that the normal state of a server is low
utilization, leaving the majority of its resources unused. Because of this, a big server
hosting them as virtual machines does not need the sum of the idle resources of the
individual servers, as a virtual machine will only need a high amount of resources and high
performance on rare occasions.
Additionally, the amount of power idly consumed by the single server is far lower
than that of a group of servers, as fewer processors are needed in the first case. The power
consumed by the cooling systems is also lower, helping to reduce the power consumed in
this part too.
Another great characteristic of having multiple VMs on a single server is the
possibility of consolidating the workloads of different VMs into a single one without
network traffic delays. Also, being able to migrate VMs between physical machines [ 10 ]
means that, when VMs are running on two or more physical machines with a really low
utilization, all of those VMs can be migrated onto one of the servers and the rest can be
turned off, provided that the sum of the workloads can be executed on that single server.
This eliminates the static component of the power consumed by the idle servers, which is
too high to be considered negligible. These characteristics give virtualized servers a
flexibility that helps to further reduce the power consumed in datacenters [ 11 ][ 12 ].
4- Server consolidation
From the migration characteristic another technique is derived, which helps reduce
the power consumption of a Datacenter by consolidating the workloads of different servers
into the minimum number of them running at full performance, switching the rest to a low
power state or even off to eliminate most of the idle power of the servers.
The basis of this technique is the following: the static part of the power consumed
by a server, the part that does not depend on the processor and cannot be lowered with
algorithms such as DVFS, is relatively high, between 30% and 70% depending on the
server. If we cut off this component in, for example, 7 out of 10 servers and leave only 3
running at full performance, we indeed obtain a great reduction in the power consumed.
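A back-of-the-envelope check of this claim can be sketched with the linear power model of equation ( 7 ); the 200 W busy power and the 50% static share (within the 30%-70% range cited above) are assumptions made for the example:

```python
# Consolidation saving sketch: 10 identical low-utilization servers
# consolidated onto 3 fully loaded servers, 7 turned off.
P_BUSY = 200.0        # W at 100% utilization (assumed)
P_IDLE = 100.0        # static share of 50% of P_BUSY (assumed)

def power(u):
    """Linear power model of Eq. (7)."""
    return P_IDLE + (P_BUSY - P_IDLE) * u

before = 10 * power(0.2)          # ten servers at 20% utilization
after = 3 * power(1.0) + 7 * 0.0  # three busy servers, seven off

saving = 1 - after / before       # fraction of power saved
```

Note that the consolidated work fits: ten servers at 20% utilization amount to two server-equivalents of load, which three fully loaded servers can absorb, and under these assumed numbers half of the original power is saved.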
The main problem of this technique is that, if a high amount of tasks suddenly
arrives at the Datacenter, it takes high response times and transition costs to turn on the
powered-off servers and migrate VMs between them. One of the proposed solutions to this
problem is introduced by Anagnostopoulou et al. [ 13 ], who describe a design in which
servers are switched to a low-power barely-alive state, with most of their components off
but still accessible to requests, incurring a much lower delay when they need to be brought
up. Another approach, by L. Liu et al. [ 14 ], introduces live migration of VMs between
physical machines, so that server consolidation can be performed continuously, migrating
VMs even while they are being executed without users noticing the migration. This is a
great addition, as the system does not need to stop the VMs to migrate them, but can
instead periodically run the consolidation process to optimize the power consumption,
always keeping the smallest number of servers switched on. The approach of Leverich et
al. [ 15 ] includes a mechanism to manage multicore processors, controlling the power
supply individually for the different cores within the processor.
5- Load prediction
The main problem mentioned for server consolidation is the slow response and
high delays of VM migrations and server ON/OFF switching. Adding a system that
predicts the incoming workload would allow the system to ease the effects of those high
delays by switching on the servers early enough before the traffic load increases [ 16 ]. Of
course, achieving this requires a system able to predict the load with high precision.
Additionally, load prediction can help deactivate servers to increase the power saving
when the prediction shows a low incoming load. However, a bad prediction could make
the system turn servers off just before the reception of a high traffic load, violating the
users' SLA agreements, or, on the other hand, turn all servers on in a low workload stage,
wasting energy.
6- Thermal-aware techniques
Most of the mentioned techniques do not consider the system's temperature at all.
They take time and power/energy into consideration in their optimizations, but leave
temperature out of their parameters. However, it is also important to note that higher
power leads to higher temperature and, hence, higher consumption by the dissipation
hardware. From this, some techniques are derived that seek power reduction while
considering thermal properties as well.
Moore et al. [17] consider the following. In a Datacenter, there are a number of
temperature sensors scattered around the room. Some servers are nearer to these
sensors than others, which may lead to an inefficient scenario where not all servers generate
the same amount of heat and some of them may overheat. Based on this
information, this technique proposes the use of a heat map, so that the scheduler knows
which servers are closer to or further from a sensor, taking this into account in the distribution
of tasks to try to prevent some servers from overheating more than others.
Li et al. [18] present a model to predict the temperatures near the servers inside the
Datacenter. It is based on measurements of temperature streams and airflows, predicting the
heating in different parts of the room and, hence, the heating of different servers according to
their location. It also uses sensors to measure temperatures, helping the prediction system
learn the parameters needed to build a good predictor.
Patel et al. [19] also include models to adapt the workload given to the servers based
on seasonal and diurnal temperature, as a method of heat prediction based on the time
of day and day of the year. They show an example of two Datacenters, one in the US
and the other in India. The system knows that daytime in India is nighttime in the
US, and prefers to send the traffic load there to avoid spending a large amount of energy on
thermal dissipation.
7- Workload scheduling
As Datacenters nowadays tend to be composed of several servers, the workload
received by the Datacenter needs to be scheduled to determine which physical machine will
execute each task within the workload. The global power consumed by the Datacenter
depends on this scheduling decision, making it an important step when looking for
power savings. Bad scheduling can incur not only higher power consumption, but also
longer execution times for the tasks within the workload.
There are different types of schedulers. Some are based on selecting which physical
machine will process a certain task, selecting the VM within the host. Another type, called
meta-schedulers, is in charge of optimizing the operation of the first kind. The
first type is used within a local network, where the different servers are located, and optimizes
the operation of those servers. The second type interconnects different networks, deciding where
to send different groups of tasks to be processed. This helps balance the workload across
networks, keeping all of them at a similar load level to optimize the whole
Datacenter, in those cases where it is big enough to be composed of several networks.
8- Energy aware task scheduling
Task schedulers are divided into three types: offline schedulers take task
decisions prior to execution; online schedulers take decisions dynamically
during execution; and hybrid schedulers combine both approaches, performing a
prior decision and adapting dynamically during execution. Wang et al. [20] consider a
model where users accept a given percentage decrease in the performance specified in their
SLA to allow the Datacenter to save energy.
Some authors show that schedulers should be built according to the type of load
they will work with. Others mention taking into account load balancing between the different
servers [21] and using information about the network connections [22].
From all these techniques, our work focuses on both DVFS and scheduling through
rule-based expert systems such as FRBSs. The first algorithm is implemented considering the
five governors presented in section 3.1.2. Scheduling has been divided into two parts,
considering VM scheduling in the first part and task scheduling in the second.
These schedulers will be presented in the second stage of this project.
3.2. First stage: simulation environment
At the beginning of this project, we set out knowing what we want to do and how to do
it, but to be able to begin working on it we first need a good simulation environment.
This is an absolutely necessary stage. Consider a designed algorithm that, after testing,
shows that it really achieves savings. If those tests have been performed on a simulator whose
behavior is far from that of a real Datacenter, we cannot guarantee that the designed algorithm
will be able to save energy on a real network. This is the strong reason that imposes a first
stage whose main objective is building a simulation environment as realistic as possible.
The main requirements on which we base the choice of simulator are simple but
necessary:
1. It must be able to estimate the energy consumption. If the chosen simulator
cannot provide an estimation of the energy consumed, we are unable to know whether
the designed algorithms achieve energy savings. This is an essential
characteristic of the desired simulator.
2. Also, and no less important, the simulator must be capable of simulating traces that
represent real processing. If the tasks are randomly generated, the tasks to execute
do not match a real execution in a Datacenter. We need this characteristic in the
simulator in order to have results as realistic as possible.
3. Finally, we have a personal preference for free software, so that everything we
design can afterwards be useful to the community.
With this in mind, and after studying the currently available simulators against these
three preferences, we were unable to find a platform that could satisfy all our needs. However,
we found a simulator that, even though it did not cover either of the two main characteristics
by itself, had two extensions that could offer us what we were looking for, with only one
problem: each extension covered one of the characteristics. We therefore had what we needed,
but in two separate simulators. The solution is simple to state: just join them into one
scenario. This is anything but a trivial task.
First, we present the different simulators, their behavior and functionalities, and their
similarities and differences.
3.2.1. CloudSim
CloudSim [32] is an open-source Cloud Computing simulator written in Java. It
allows simulating several types of scenarios by defining different simulation entities. Each
entity is in charge of a certain number of operations, and the entities communicate with
each other using different commands, identified by a tag. The two main entities used in
CloudSim are the Broker and the Datacenter. The Broker acts in the name of the user and
keeps all the tasks (named Cloudlets in this simulator) that the user needs to compute. The
Datacenter entity models, in fact, a Datacenter, which contains all the simulated physical
machines. As these physical machines normally have a large amount of resources, tasks are
not assigned directly to them. Virtual Machines (VMs) are created to divide those resources,
allowing several tasks to be executed in parallel on the same machine.
To understand the functioning of this simulator, a basic scenario is explained
here. This is important to know, as it is the basic behavior of the simulator. Both extensions
introduce several differences, so for now we explain only the basics.
1. The Datacenter is registered in the CIS entity. The Broker can then get a list of all
registered Datacenters, in this case only one.
2. The Broker requests its characteristics from each Datacenter on the list.
3. Upon receiving the information from all Datacenters, the Broker checks which
Datacenter is suitable for its processing requirements and chooses it.
4. The Broker requests from the Datacenter the creation of a certain number of Virtual
Machines, in which the Cloudlets will be executed.
5. The Datacenter notifies the Broker of the creation of the requested Virtual Machines, in
addition to informing it of any change in VM capabilities, in case there were not
enough MIPS left for all VMs.
6. The Broker then sends the Cloudlets step by step; in this example, only one Cloudlet is sent.
7. The Datacenter processes the task, taking into account only information related to execution
times.
8. The execution results are returned to the Broker.
9. When there are no Cloudlets left, the Broker requests the destruction of the VMs and finishes
the simulation.
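The steps above can be sketched as a tag dispatch. This is an illustrative simplification, not CloudSim's actual code (the real simulator queues SimEvent objects); only the tag IDs, taken from Table 4, are real.

```java
// Illustrative sketch of CloudSim's tag-based messaging. Assumption: the
// real simulator dispatches SimEvent objects from an event queue; here we
// only show which action each tag triggers in the Datacenter entity.
public class TagDispatch {
    // Tag IDs as listed in Table 4.
    static final int RESOURCE_CHARACTERISTICS = 6;
    static final int CLOUDLET_RETURN = 20;
    static final int CLOUDLET_SUBMIT = 21;
    static final int VM_CREATE_ACK = 32;
    static final int VM_DESTROY = 33;

    // What the Datacenter does upon receiving each tag from the Broker.
    static String datacenterAction(int tag) {
        switch (tag) {
            case RESOURCE_CHARACTERISTICS:
                return "reply with characteristics";       // steps 2-3
            case VM_CREATE_ACK:
                return "create VMs and acknowledge";       // steps 4-5
            case CLOUDLET_SUBMIT:
                return "execute Cloudlet, then return it"; // steps 6-8
            case VM_DESTROY:
                return "destroy VMs";                      // step 9
            default:
                return "ignore";
        }
    }
}
```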
This example is shown in Table 4 and Figure 7, which contain the basic messages
used for communication between both entities. The messages depend on the tag: each Tag ID
corresponds to a certain action to perform in the destination entity that receives the command.
Tags Flow

Number  Source      Destination  Tag ID  Tag
1       Datacenter  CIS          2       Register Resource
        Broker      Broker       15      Resource Characteristics Request
2       Broker      Datacenter   6       Resource Characteristics
3       Datacenter  Broker       6       Resource Characteristics
4       Broker      Datacenter   32      VM_CREATE_ACK
5       Datacenter  Broker       32      VM_CREATE_ACK
6       Broker      Datacenter   21      CLOUDLET_SUBMIT
7       Datacenter  Datacenter   41      VM_DATACENTER_EVENT
8       Datacenter  Broker       20      CLOUDLET_RETURN
9       Broker      Datacenter   33      VM_DESTROY
        Broker      Broker       -1      End_of_Simulation
Table 4: CloudSim's basic example
Figure 7: CloudSim’s basic example
This is the basic behavior of the simulator. The changes introduced by the extensions add
complexity and functionality to this scenario. Note that in this case, neither of the necessary
characteristics is covered. The Datacenter only gives an estimation of the execution time at the
end of the simulation, but does not provide a power model serving as a tool to estimate
the global power and energy that would be consumed in processing those tasks. Moreover,
this simulator allows the Cloudlets to be created freely, that is, deciding their lengths without
following any specific order or restriction. This gives the user enough freedom to specify the
desired scenario to simulate, but its behavior does not satisfy the need for a realistic
scenario. Even though this is a simulator, it is desirable to be able to simulate real
processing, rather than executing and measuring randomly generated tasks, which gives no
information regarding energy-saving capabilities unless it is done under a realistic scenario
simulation.
3.2.2. CloudSim with DVFS
This extension [33] includes the power model and DVFS algorithm explained in
sections 3.1.1 and 3.1.2. Several classes are included in order to add these functionalities to
the basic scenario explained before. The entities used in this simulator are the same as those
used in the basic version of CloudSim. There is no need to show a table of the
different messages as in the previous example, as the only variation is related to task
processing in the Datacenter entity. When the process reaches the event with Tag 41, Cloudlet
execution, instead of estimating in a single step the time needed to fully process the
Cloudlet, the event is repeated a number of times, estimating the power consumed, until the
current time reaches the end time of the Cloudlet. The amount of time elapsed between these
events is called the Scheduling Interval, set by default to 0.01 seconds.
The process is as follows. Once the Cloudlets are sent from the Broker to the
Datacenter to be executed in parallel, the Datacenter estimates the execution time of
each task which, added to the current simulation time, gives the end time of each task.
Then the execution enters a loop, which is repeated until the current simulation time exceeds
the end time of one of the tasks. As explained, the time elapsed between these iterations is the
scheduling interval. The process performed in this loop estimates the power consumed by
each task being executed: using the current value of the CPU utilization and
equation (7), the Datacenter estimates the power being consumed at
that exact moment. When all tasks have been executed, the system obtains the energy value
from the total power and time.
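The loop just described can be sketched as follows. The linear power model used here merely stands in for equation (7) of section 3.1.1 and is an assumption for illustration; the point being shown is the scheduling-interval integration of power into energy.

```java
// Sketch of the power-aware execution loop. Assumption: a linear power
// model P = Pidle + (Pmax - Pidle) * u stands in for equation (7); the
// actual model of the simulator should be substituted here.
public class EnergyLoop {
    // Instantaneous power (W) as a function of CPU utilization u in [0, 1].
    static double power(double utilization, double pIdle, double pMax) {
        return pIdle + (pMax - pIdle) * utilization;
    }

    // Integrate power over a task's lifetime, one scheduling interval at a
    // time (default interval 0.01 s), accumulating energy in joules.
    static double energy(double duration, double interval,
                         double utilization, double pIdle, double pMax) {
        int steps = (int) Math.round(duration / interval);
        double energy = 0.0;
        for (int i = 0; i < steps; i++) {
            energy += power(utilization, pIdle, pMax) * interval; // J = W * s
        }
        return energy;
    }
}
```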
This power-awareness allows us to know whether the scheduling algorithms we design save
energy or not, which is our main objective in this project. However, as already mentioned,
this extension does not base the processed tasks on a real trace, and retains the same problem
as the basic version of CloudSim.
3.2.3. WorkflowSim
This simulator, developed by Weiwei Chen [34], extends CloudSim to make it able
to work with Directed Acyclic Graphs (DAGs) [36] in an XML format called DAX. With this
addition, instead of simulating a series of tasks as Cloudlets, we can simulate and experiment
using traces of real Datacenter workflows. These DAX files are generated using a program
called Pegasus [35], which can also convert the XML files to an image format so
that it is easier to understand how they work. Figure 8 shows the graph
specifying the execution order of the workflow called Montage_25 [37].
The advantages of using workflows are numerous. On the one hand, the use of workflows
in the simulator helps determine the order in which tasks are processed, including constraints
that prevent a task from being processed before its parent tasks. This makes sure that the tasks
are processed in the intended order, in contrast to what happens with the randomly generated
tasks of basic CloudSim. This is useful when simulating a trace containing a
number of tasks that reproduce the real traffic of a real workload in a real system.
Figure 8: Montage 25 DAG (tasks ID00000 to ID00024; job types Montage::mProjectPP,
mDiffFit, mConcatFit, mBgModel, mBackground, mImgTbl, mAdd, mShrink and mJPEG)
Each task has inputs and outputs of a certain size. By being able to process these files,
our simulations are based on real workflows instead of a simple series of Cloudlets, which
establishes a base for optimizing Cloud environments.
The main difference with CloudSim consists in the use of additional entities, called
Planner, Clustering Engine, Engine and Scheduler, the last of which substitutes the Broker.
The Planner is in charge of the highest-level planning, and calls the Parser class to
transform the DAX into tasks so that the simulator can handle them. The Engine takes
care that the simulation follows the order established in the DAG, so that no
child task is executed before its parent nodes, as input data would be missing. The
Scheduler then, following a behavior similar to the Broker's, receives the Tasks from the
Engine and sends them to the Datacenter for execution.
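The Engine's ordering rule can be sketched as follows. This is an illustrative reconstruction, not WorkflowSim's actual classes: a task is released to the Scheduler only once all of its parent tasks have been returned.

```java
// Illustrative sketch (not WorkflowSim's real Engine) of the DAG ordering
// rule: release a task only when all of its parents have been processed.
import java.util.*;

public class DagEngine {
    // parents.get(t) = set of parent task IDs of task t (from the DAX file).
    private final Map<Integer, Set<Integer>> parents;
    private final Set<Integer> processed = new HashSet<>();

    public DagEngine(Map<Integer, Set<Integer>> parents) { this.parents = parents; }

    // Called when the Datacenter returns a finished task (Tag 20 path).
    public void markReturned(int taskId) { processed.add(taskId); }

    // Tasks not yet processed whose parents have all been processed.
    public List<Integer> ready() {
        List<Integer> out = new ArrayList<>();
        for (Map.Entry<Integer, Set<Integer>> e : parents.entrySet()) {
            if (!processed.contains(e.getKey())
                    && processed.containsAll(e.getValue())) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out); // deterministic release order
        return out;
    }
}
```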
It is important to know that it is here, in the Scheduler entity, where the decision is
taken as to which VM will process each task. This is one of the core parts that will be
addressed in the second stage of this project, where the energy-saving algorithms are
designed.
While this new characteristic brings the simulations nearer to the behavior of a real
Datacenter, the simulator still considers only information related to execution times, going
back to the initial problem of CloudSim.
To explain the different types of messages that the entities use to communicate with each
other, we run the simulator on the basic Montage_25 DAX file.
Table 5 shows a fragment of that communication, divided into three stages that we call
the Initialization stage (messages 1 to 11), the Main stage (steps 12 to 18) and the Ending
stage (the last steps displayed in the table). The different message types are also
shown in Table 6, ordered by Tag number, with an explanation of what each
message means. Additionally, to help understand the messages in Table 5 and the
three stages mentioned, three figures similar to Figure 7 are provided: Figure 9 shows the
Initialization stage, Figure 10 displays the entities involved in the Main stage (processing),
and Figure 11 shows how the simulation is finished.
Step  Tag   Src entity  Dst entity  Info
1     2     Datacenter  CIS         Datacenter registers in the CIS
      1000  Planner     Planner     Parse DAX file
      1000  Merger      Merger      Empty method
      15    Engine      Engine      Sends a Tag 15 message to the Scheduler
      2     Scheduler   CIS         Scheduler registers in the CIS
2     1001  Planner     Merger      Planner sends all the Cloudlets to the Merger
      15    Engine      Scheduler   Sends a Tag 6 message to the Datacenter
3     1001  Merger      Engine      Merger sends all the Cloudlets to the Engine
      6     Scheduler   Datacenter  Requests characteristics
4     6     Datacenter  Scheduler   Responds to the request
5     32    Scheduler   Datacenter  Requests VMs creation
6     32    Datacenter  Scheduler   Creation acknowledgment
7     21    Scheduler   Engine      Sends the first Cloudlet, doesn't belong to the DAG
8     21    Engine      Scheduler   Sends the Cloudlet to the Scheduler
9     1005  Scheduler   Scheduler   Only 1 Cloudlet, easy decision
10    21    Scheduler   Datacenter  Sends the scheduled Cloudlet to its VM
11    41    Datacenter  Datacenter  Cloudlet processing in the Datacenter
...                                 See Tag 41 explanation
12    20    Datacenter  Scheduler   Adds a Cloudlet_update (1005) to the future queue
13    20    Scheduler   Engine      Sends Cloudlet back to Engine
      1005  Scheduler   Scheduler   Empty Cloudlet list, no scheduling done
14    21    Engine      Engine      Gets the following row to be sent to the Scheduler
15    21    Engine      Scheduler   Sends the Cloudlet to the Scheduler
16    1005  Scheduler   Scheduler   Bag of tasks, takes decision of VM for each Cloudlet
17    21    Scheduler   Datacenter  Sends the scheduled Cloudlets to their VMs
18    41    Datacenter  Datacenter  Cloudlets processing in the Datacenter
...                                 See Tag 41 explanation
159   20    Datacenter  Scheduler   Returns the last Cloudlet
160   20    Scheduler   Engine      The Engine sets Tag -1, End of simulation
      1005  Scheduler   Scheduler   No Cloudlets, does nothing
161   -1    Engine      Scheduler   Proceeds to clear Datacenters
Table 5: WorkflowSim communication messages between entities
Tag Message
2 Register_resource
6 Resource_characteristics
15 Res_charact_request
20 Cloudlet_return
21 Cloudlet_submit
32 VM_creation
33 VM_destroy
41 VM_ datacenter_event
1000 Start_simulation
1001 Job_submit
1005 Cloudlet_update
-1 End_of_simulation
Table 6: WorkflowSim tags meaning
To clarify the purpose of each tag message, we add a further explanation here.
2: The Cloud Information Service is the entity where all Datacenters and Schedulers
must be registered before they can be accessed. And so they are, at the beginning of the
simulation.
6: In order for the Scheduler to be able to take decisions about where to schedule a VM
or a Task, it needs to have access to all the information about the Datacenter. This message,
Resource characteristics, makes the Datacenter send its characteristics to the requester,
the Scheduler. This is why there is a double message with this tag: the first to request
and the second to reply, indicating the Datacenter's characteristics.
15: This message is used when any entity accesses the Cloud Information Service to get
information about any Datacenter or Scheduler already registered. In this example, it is used
twice. The first time it is used by the Engine entity, which requests from the CIS information
about any registered Scheduler. The second time, it is this Scheduler that requests
information about the Datacenter. After the Scheduler knows of the availability of a
Datacenter, it sends the message with Tag 6 to get its characteristics.
20: After a VM has finished processing a Cloudlet, the Cloudlet is returned to the entity
that sent it to the Datacenter, the Scheduler. Then, as the Engine is the entity in charge of
making sure that the simulator follows the workflow order, the Scheduler also sends the
Cloudlet back so that the Engine can continue with the workflow order. Each time the
Engine gets a Cloudlet returned, it adds it to the processed list and checks whether
there are Cloudlets all of whose parent nodes have already been processed. If there are
any, they are sent to the Scheduler, a VM is decided for each, and they are sent to the
Datacenter to be processed.
21: This is the standard message used to pass a Cloudlet between entities. When used by
the Engine, it selects those Cloudlets whose parent nodes have already been
processed and returned, which means that the simulation is following the order established
by the workflow parsed from the DAG, and sends them to the Scheduler, which then
uses the message with Tag 1005 to make a scheduling decision. After the Scheduler
finishes deciding which VM will process each Cloudlet, it uses this message to
send them to the Datacenter, each to its corresponding VM.
32: Cloudlets are executed in VMs, but these must have been created beforehand. This is
the message sent from the Scheduler to the Datacenter to request the VM creation. The
Tag is also used in the reverse direction to express an acknowledgment of the creation.
33: This message is triggered while ending the simulation (Tag -1). It destroys all VMs
before shutting down the system, and is the last message shown before the end of the
simulation.
41: With this message, the Datacenter indicates that at least one Cloudlet is being processed
in one of its VMs. It is important to note that not only one of the submitted Cloudlets is
processed at a time: all VMs run in parallel, so the maximum number of Cloudlets that can
be executed at the same time equals the number of VMs the Datacenter has. However, as
this simulator is not power-aware, the processing stage is performed in only one step. We
need to take into account that WorkflowSim is a simulator that follows the order of a
workflow, but considers only time in Cloudlet execution. For this reason, when a Cloudlet is
executed, the simulator only needs to estimate the time that that particular Cloudlet takes to
execute, and that calculation can be made in just one step. But not only the time spent
processing the Cloudlet is considered: upon its arrival at the Datacenter, the simulator checks
the Cloudlet's input files, gets their sizes and calculates the time it takes to transfer them to
the Datacenter. This time is added to the Cloudlet's processing time since, to be fair, the
Cloudlet's execution cannot begin before the necessary files have been transferred to the
Datacenter, so this time must be considered too. However, this Tag 41 message is not this
simple in the joined simulator. After the Cloudlet has been processed, it is sent back to the
Scheduler using a message with Tag 20.
1000: Method used to start the simulation. It is here that the DAX file containing all
the task information is parsed into Cloudlets so that the simulator is able
to process them.
1001: The Engine is the entity that needs access to the whole set of
Cloudlets, as it is the entity in charge of passing the Cloudlets to the Scheduler while making
sure that the process follows the correct workflow order. However, at the beginning
of the simulation the entity that parses the DAX file is the Planner, making it necessary to
transfer all jobs to the Engine. This message is used first to pass all the jobs to
the Merger entity, and later from the Merger to the Engine. Once the Engine is in
possession of all the Jobs, it sends the Cloudlets that compose those Jobs to the
Scheduler using Tag 21, following the desired workflow order.
1005: Each time the Scheduler receives a number of Cloudlets from the Engine entity, it
needs to decide which VM is the most suitable to process each of them.
This is the second phase of Cloud scheduling, which considers a bag of Cloudlets to be
sent to a group of VMs. This step is one of the most important in the simulator, and most
of the effort to reduce the overall energy consumed is made at this point. There
are many scheduling algorithms, some of them time-based and others power-based,
whose optimization is the main objective of this DVFS-Workflow joined simulator.
After the Scheduler has decided all the matching Cloudlet-VM pairs, it again sends
a message with Tag 21 to assign them to their respective VMs in the Datacenter.
-1: The End of Simulation message is used when there are no more Cloudlets waiting to
be processed and the simulation needs to end. It erases all the Datacenters, which
triggers a Tag 33 message per VM to destroy them, as they are no longer needed.
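As an illustration of the Tag 1005 decision, the following sketch assigns a bag of Cloudlets greedily to the VM offering the earliest projected finish time. This policy and all names are illustrative assumptions, not WorkflowSim's built-in scheduler; the energy-aware schedulers designed later in this project replace exactly this decision.

```java
// Illustrative sketch of a Tag 1005 scheduling decision (assumption:
// greedy earliest-finish-time; not WorkflowSim's actual scheduler).
public class BagScheduler {
    // lengths[i] = length (MI) of Cloudlet i; vmMips[j] = MIPS of VM j.
    // Returns assignment[i] = index of the VM chosen for Cloudlet i.
    static int[] schedule(double[] lengths, double[] vmMips) {
        double[] finishTime = new double[vmMips.length]; // projected busy time per VM
        int[] assignment = new int[lengths.length];
        for (int i = 0; i < lengths.length; i++) {
            int best = 0;
            double bestFinish = Double.MAX_VALUE;
            for (int j = 0; j < vmMips.length; j++) {
                // Finish time if Cloudlet i were appended to VM j's queue.
                double finish = finishTime[j] + lengths[i] / vmMips[j];
                if (finish < bestFinish) { bestFinish = finish; best = j; }
            }
            assignment[i] = best;
            finishTime[best] = bestFinish;
        }
        return assignment;
    }
}
```

An energy-aware variant would replace the finish-time criterion with an estimate of the energy each candidate assignment consumes, which is precisely the optimization pursued later in this project.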
The process shown for steps 12 to 18 is repeated each time the Datacenter finishes
processing a Cloudlet and sends it back. The Engine then checks whether some new
Cloudlets can already be executed, sends them to the Scheduler if there are any, VMs are
decided, and the Cloudlets are sent to be processed. Whenever the process reaches the Tag 41
part, it advances time by the lapse needed for the next Cloudlet to finish. In the case of
Montage_25, the first row sends 5 Cloudlets, with IDs from 0 to 4. Seeing that the shortest
is the Cloudlet with ID 2, the simulator advances time according to the length of that
Cloudlet and the transfer time of its input files. It is then sent back, and the Engine checks
that the Cloudlet with ID 10 can now be processed too, so it is scheduled and
delivered to the Datacenter. At this moment, the Datacenter checks which Cloudlet is the
next to finish and once more advances time to that value. This time Cloudlet 0 finishes
and is returned too, making Cloudlet 8 the next available to be processed, as both of its
parent nodes have already been processed. This process is repeated until all of the
Cloudlets have been executed.
Figure 9: WorkflowSim initialization stage
Figure 10: WorkflowSim main stage
Figure 11: WorkflowSim ending stage
3.2.4. Merged simulator
The main purpose consists in having a simulator that is able to compute the consumed
energy while following the structure of a real workflow. By joining both simulators we not
only accomplish both characteristics, but also include the DVFS technique. This makes
the simulator able to adjust the CPU frequency to minimize the overall energy consumption.
The available governors are those introduced in section 3.1.2, named Performance,
PowerSave, UserSpace, OnDemand and Conservative. Performance selects and fixes the
maximum available frequency, while PowerSave chooses the lowest. UserSpace permits
the user to specify a certain frequency to be fixed, depending on the available multipliers.
The last two work with thresholds that are compared with the CPU utilization.
OnDemand uses only one threshold (user-configurable, default 95%). Whenever the
utilization is greater than the selected threshold, the frequency is scaled to the maximum
multiplier. Conversely, should the utilization be lower than the threshold, the frequency
is scaled down one step to the next lower multiplier. This down-scaling is only done once a
given counter runs out, defined in the simulator as a variable named
samplingDownFactor, so as to prevent the CPU from scaling up and down constantly, similar
to the use of a Schmitt trigger in electronics. The Conservative governor has two thresholds,
up and down (user-configurable, defaults 80% and 20% respectively). Unlike OnDemand,
this governor could set the frequency to the maximum and, if utilization then fell
to 25%, the frequency would still remain at the maximum. This means non-optimal
behavior in terms of energy consumption, as the frequency could be scaled
down, letting the utilization rise, to save energy.
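The two threshold governors can be sketched as follows. Frequencies are represented as multiplier indices (0 = lowest); the thresholds and the samplingDownFactor counter follow the defaults stated above, while everything else is an illustrative assumption about the implementation.

```java
// Illustrative sketch of the two threshold governors described above.
// Levels are multiplier indices, 0 = lowest frequency, maxLevel = highest.
public class Governors {
    // OnDemand: jump to the top multiplier when utilization exceeds the
    // threshold; otherwise step down one multiplier, but only once the
    // samplingDownFactor counter runs out (Schmitt-trigger-like damping).
    // downCounter is a one-element array so the counter persists across calls.
    static int onDemand(int level, int maxLevel, double util,
                        double upThreshold, int[] downCounter, int samplingDownFactor) {
        if (util > upThreshold) {
            downCounter[0] = samplingDownFactor; // reset damping counter
            return maxLevel;
        }
        if (--downCounter[0] <= 0) {
            downCounter[0] = samplingDownFactor;
            return Math.max(0, level - 1);       // step down one multiplier
        }
        return level;                            // counter not expired: hold
    }

    // Conservative: two thresholds; between them the frequency is held,
    // which is why it can stay at the maximum while utilization sits at 25%.
    static int conservative(int level, int maxLevel, double util,
                            double upThreshold, double downThreshold) {
        if (util > upThreshold) return Math.min(maxLevel, level + 1);
        if (util < downThreshold) return Math.max(0, level - 1);
        return level;
    }
}
```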
As mentioned before, this establishes a good base for future research on Cloud
Computing using the simulator, as we can estimate the power consumption and the
experiments are based on real workflows. Also, as modern CPUs already include the DVFS
technique, it is important that the simulator include it too, so that the CPU's
dynamic energy optimization is considered. From this point, the simulator can be used
to optimize the Scheduler's behavior to save more energy.
While Guérout et al. [33] also implement DVFS and DAG workflows in CloudSim,
theirs is an implementation parallel to WorkflowSim, as they mention. This contribution joins
both simulators, allowing WorkflowSim to be power-aware, so that all scheduling algorithms
implemented in WorkflowSim can be simulated with DVFS, Datacenter power can be
measured, and new scheduling algorithms can be added to optimize the overall energy
consumption.
In the same way, Fei Cao et al. [38] also propose a simulator including DAG
workflows and DVFS, but their proposal focuses more on scheduling than on the DVFS
implementation: in their paper they only mention the use of DVFS to obtain the optimum
frequency, without mentioning the governors used or even the availability of user
configuration of the governors.
The process of joining both simulators is explained in section 3.2.7, but it is important to
know that the base simulator taken is WorkflowSim. Then, from [33], we have added
everything necessary related to power models and DVFS. Before explaining the joining
process, we show the changes introduced with respect to both base simulators.
3.2.5. Changes with respect to WorkflowSim
The entity messages in the proposed simulator are all the same as in its base
simulator, WorkflowSim. However, this simulator being power-aware, the step at Tag 41
changes considerably, as introduced in section 3.2.2. The first thing to mention is that this
stage is no longer performed in just one step, so more than one of these messages appears
during a Cloudlet's execution. The Scheduling Interval parameter, set at Datacenter creation
in the main file, is the lapse of time between these messages with Tag 41. Its default value is
0.01, so every 0.01 seconds there is a message with Tag 41. This
indicates how often the Datacenter checks the utilization, estimates the power consumed and
applies the DVFS algorithm to check whether the Datacenter is idle and the system
can be scaled down to save energy or, on the contrary, the utilization is so high that it
surpasses the upper threshold and the system needs to be scaled up to increase its performance.
For a Cloudlet that takes 13 seconds to be fully processed, the simulator would produce 1300
messages with Tag 41, with the default Scheduling Interval of 0.01 seconds.
As in WorkflowSim, the process shown for steps 12 to 18 of Table 5 is
repeated each time the Datacenter finishes processing a Cloudlet and sends it back. The
Engine then checks whether some new Cloudlets can already be executed, sends them to the
Scheduler if there are any, VMs are decided, and the Cloudlets are sent to be processed.
Whenever the process reaches the Tag 41 part, it is repeated as explained before until one of
the Cloudlets is completed again. In the case of the Montage_25 of Figure 8, the first row
sends 5 Cloudlets, with IDs from 0 to 4. All of them are processed in parallel in different
VMs but, due to their different lengths, and the shortest being the Cloudlet with ID
2, the moment it completes it is sent back without waiting for the rest to be processed.
After the Cloudlet with ID 2 is returned, the Engine checks that the Cloudlet with ID 10
can now be processed too, so it is scheduled and delivered to the Datacenter. At this
moment, the processing of the 5 Cloudlets continues, taking into consideration that
the 4 that had been processing during the entire duration of the already-returned
shortest Cloudlet are nearly completely processed too. This is why, just
a short time later, Cloudlet 0 finishes and is returned too. This time it is Cloudlet
8, both of whose parent nodes have already been processed, that is sent to the
Datacenter. This process is repeated until all of the Cloudlets have been executed.
Apart from this point, the rest of the system works exactly as WorkflowSim. It is this
step that checks and obtains the power consumption, leaving the rest of the messages
unaltered.
3.2.6. Changes with CloudSim
WorkflowSim adds 4 new entities with respect to CloudSim. CloudSim Shutdown and
Cloud Information Service are both kept the same. The Datacenter entity is modified to allow
the execution of Workflows and power estimation, but its purpose remains the same. The
major changes concern the Broker. In CloudSim, this entity is the one that requests the VM
creation, holds the Cloudlets, sends them to and receives them from the Datacenter,
initializes the system and destroys it when the simulation is finished. Basically, it is the only
entity on the user's side. However, CloudSim only works with a group of Cloudlets to be
processed, considers time alone and does not really need more entities for that simple
purpose.
When the process gets more complicated, as in the case of WorkflowSim, dividing the
problem makes it easier. In this case, 4 new entities are introduced. The Planner is the one
that initializes the system and parses the DAX file to get the tasks. Then, those tasks are
passed on to the Merger (also called Cluster), which joins different Tasks into a Job. The
default configuration of the simulator performs no clustering: it simply wraps each Task
into its own Job. After this, the Jobs are sent to the Engine, where they are selected
following the order specified by the workflow. This is the main difference with the basic
CloudSim simulator, where the different Cloudlets were executed following a simple
sequential order.
As explained in Section 3.2.3, the Engine is the entity in charge of making sure the
workflow's order is followed, sending to the Scheduler, each time a Task is returned, the
Tasks that can now be processed, i.e. those whose parent nodes have already been
processed. Finally, the Scheduler is the entity that chooses which VM is the most suitable
for processing each Task. For this purpose, several different scheduling algorithms can be
used, but they will not be studied here.
3.2.7. Modifications and additions to achieve the proposed joint simulator
To join both simulators, the chosen method is to use WorkflowSim as the base,
adding the power classes from [ 33 ] and modifying a number of classes. The following steps
must be performed to successfully merge the simulators. To avoid modifying the basic
simulators, a new package called dvfs is created inside org.workflowsim, and the new
classes needed are stored in this package. Also, the main simulation file is taken from
WorkflowSimBasicExample1 and copied into another file that we called
WorkflowDVFSBasicNoBrite, stored in a new package named dvfs inside
org.workflowsim.examples.
Both PowerDatacenter (CloudSim [ 33 ]) and WorkflowDatacenter (WorkflowSim)
extend the base class Datacenter. But we need parameters from both Datacenter types, as the
first one defines parameters related to power and the second one allows working with DAX
files, as well as Tasks and Jobs. The first modification is therefore to make
WorkflowDatacenter extend PowerDatacenter. As Java does not allow multiple inheritance,
this method gives us a single Datacenter able to work as we intend. To avoid modifying the
behavior of the WorkflowDatacenter class, we have created a new class in the new
package, called WorkflowDVFSDatacenter, copying all contents from the
WorkflowDatacenter class and making it extend PowerDatacenter as just explained.
In addition to modifying the inheritance of this new class, we need to add several
parameters to make it power aware. In the main method of the main file, the creation of the
Datacenter object must be changed from WorkflowDatacenter to the new
WorkflowDVFSDatacenter, and datacenter.setDisableMigrations(true) must be added after
the datacenter object creation, as in the DVFS base example.
At the end of the same main method, after the simulation finishes, we need to add
some code to print the information related to the simulation result, namely these four
parameters:
1. Total execution time in seconds (not simulation time).
2. Total power sum in watts.
3. Average power in watts, obtained by dividing the previous two parameters (power sum over time).
4. The energy consumed in Wh, multiplying the average power by the time in hours.
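As a sketch, these four values can be computed as follows. All numbers here are invented for illustration, and the average is taken as the ratio of the summed power values to the time, as described above:

```java
// Hedged sketch of the four summary values printed after the simulation.
// The input numbers are made up; they do not come from any simulation run.
public class SimulationSummary {
    public static void main(String[] args) {
        double totalTimeSec = 120.0;   // 1. total execution time (s)
        double totalPowerW  = 10800.0; // 2. sum of sampled power values (W)
        double avgPowerW = totalPowerW / totalTimeSec;          // 3. average power (W)
        double energyWh  = avgPowerW * totalTimeSec / 3600.0;   // 4. energy (Wh)
        System.out.println(avgPowerW); // 90.0
        System.out.println(energyWh);  // 3.0
    }
}
```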
In the Datacenter creation, we need to add power-related information, as the
default file only created a WorkflowDatacenter. From PowerDatacenter we need to
copy all parameters related to DVFS, such as the frequencies and the governors'
information, like the thresholds. The Host class objects are changed to PowerHost objects,
as we need Hosts that can handle power. So, in the PowerHost creation we need to add
information about the power model used, as well as a Boolean indicating whether DVFS is
allowed. Also, at the end of the method, when creating the WorkflowDVFSDatacenter
object, we need to indicate the PowerVmAllocationPolicy used and change the
schedulingInterval from 0 to 0.01. If this interval is not changed, the simulator will not
advance in time and will be trapped in an infinite loop.
As both classes PowerDatacenter and WorkflowDatacenter extend the Datacenter
base class, both of them override a method called updateCloudletProcessing. After changing
the inheritance hierarchy to Datacenter -> PowerDatacenter -> WorkflowDVFSDatacenter,
the method in the most derived class prevails. However, the method we need must take
power into account, which is not the case of the WorkflowDatacenter version. For this
reason, we need to comment out this method in the WorkflowDVFSDatacenter class so that
it does not override the one belonging to the PowerDatacenter class.
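The override behavior just described can be illustrated with a minimal sketch. The classes below are stand-ins, not the CloudSim/WorkflowSim classes themselves; they only mirror the inheritance chain:

```java
// Stand-in classes mirroring the inheritance chain described in the text.
class Datacenter {
    void updateCloudletProcessing() { System.out.println("Datacenter: time only"); }
}
class PowerDatacenter extends Datacenter {
    @Override
    void updateCloudletProcessing() { System.out.println("PowerDatacenter: power-aware"); }
}
// By NOT overriding updateCloudletProcessing here (the copied
// WorkflowDatacenter override is commented out), the power-aware
// version from PowerDatacenter prevails.
class WorkflowDVFSDatacenter extends PowerDatacenter { }

public class InheritanceSketch {
    public static void main(String[] args) {
        new WorkflowDVFSDatacenter().updateCloudletProcessing();
    }
}
```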
Also, inside the WorkflowDVFSDatacenter class there is another method called
processCloudletSubmit. This method is missing a necessary line that is included in the same
method of the PowerDatacenter class, namely setCloudletSubmitted(CloudSim.clock()).
We must add this call at the end of the method in our new datacenter class in order to get
the simulator to work.
3.2.8. Additional notes to the power model
The power model used in this joint simulator is the power model explained in
section 3.1.1, describing how power varies with voltage and frequency in a typical processor.
In this simulator, however, there is no explicit dependence of power on the voltage or the
frequency; it works as follows.
We know that the power consumed varies depending on frequency, and that the
frequencies a processor can run at are obtained by multiplying the motherboard's base
frequency by the processor's frequency multiplier, so the available frequency values the
processor can work at are few and discrete.
The default scenario in CloudSim defines the different values of the frequency
multipliers in the main simulation file. When creating the Datacenter, the user needs to
specify the different multiplier values as a percentage of the total processing capacity. There
is no explicit indication of the maximum frequency the processor can run at; instead,
everything is expressed in terms of MIPS performance. In the default scenario example
there are 5 different multiplier values and a MIPS value of 1500. Table 7 shows the
different values of the multiplier as well as the MIPS that the processor would deliver at
each multiplier. Naturally, the last multiplier achieves the maximum MIPS value.
Multiplier (%)               59.925    69.93     79.89     89.89     100
Performance (MIPS)           898.875   1048.95   1198.35   1348.35   1500
Null utilization power (W)   82.75     82.85     82.95     83.10     83.25
Full utilization power (W)   88.77     92.00     95.5      99.45     103.0
Table 7: Frequency multipliers and MIPS
The last multiplier is selected when maximum performance is needed. The variation
of this multiplier value is decided by the DVFS algorithm and the chosen governor, which
is in charge of comparing the current utilization with both upper and lower thresholds to
know whether higher performance is needed or, on the contrary, the processor is rather idle
and can be scaled down to save unused power. Moreover, in addition to frequency, the
power consumed also depends on the current utilization: the higher the utilization, the more
power is consumed.
In the power model class, such as the default "PowerModelSpecPower_BAZAR", there
are two arrays of power values. The simulator handles power by defining, for each
multiplier, two power consumption values: those corresponding to null and full utilization.
These values are shown in Table 7. To determine the power consumption at a certain
moment, the simulator takes both power values for the current frequency index. Then,
using the utilization value, it estimates the power consumption as a linear interpolation
between the null and full utilization values, using Equation 8. Note that this equation
achieves the same results as Equation 7.
power = (1 − utilization) · idle_power + utilization · full_power ( 8 )
As mentioned in the "PowerModelSpecPower_BAZAR" class, the machine used is an
Intel(R) Core(TM)2 Quad Q6700 CPU @ 2.66GHz with 4GB of RAM. The power values
were measured using a Plogg wireless electricity meter. In the physical processor, both
voltage and frequency are of course adjusted by the DVFS algorithm, but in the simulator
this power model works by linear interpolation, taking only the null and full utilization
power values and estimating the rest.
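As an illustration, Equation 8 together with the Table 7 values can be sketched as follows. The class and method names are illustrative, not those of PowerModelSpecPower_BAZAR itself:

```java
// Hedged sketch of the linear-interpolation power model (Equation 8),
// using the null/full utilization values from Table 7.
public class LinearPowerModel {
    // Power at null and full utilization for each frequency multiplier index.
    private static final double[] IDLE_POWER = {82.75, 82.85, 82.95, 83.10, 83.25};
    private static final double[] FULL_POWER = {88.77, 92.00, 95.5, 99.45, 103.0};

    // Equation 8: interpolate linearly between idle and full power.
    public static double power(int freqIndex, double utilization) {
        return (1.0 - utilization) * IDLE_POWER[freqIndex]
             + utilization * FULL_POWER[freqIndex];
    }

    public static void main(String[] args) {
        // At the highest multiplier and 50% utilization:
        System.out.println(power(4, 0.5)); // 0.5*83.25 + 0.5*103.0 = 93.125
    }
}
```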
3.3. Second stage: scheduling algorithms
This second stage builds on the simulation environment obtained in the first stage.
Without that simulator we would not have the characteristics required to correctly
test the scheduling algorithms developed in this stage. These algorithms are based on finding
the optimum scheduling in two phases. The process is as follows.
Every simulation starts from one or more datacenters composed of a number
of physical machines. As stated before, users' tasks are not assigned directly to them,
because their large amount of resources would be wasted. Instead, a number of VMs of
different performance are created on the physical machines, providing a heterogeneous
simulation scenario. Then, the different tasks are sent to those VMs to be executed, and the
results are returned to the user.
This introduces the two phases of the scheduling:
1- First, the system starts from a number of physical machines where the VMs will
be created. This first phase consists of finding which physical machine is the most
suitable to host the VM creation. Depending on how this process is performed,
result parameters such as execution time and energy consumption will vary. Later on,
this step will be explained in more detail.
2- Then, once the VMs are created, the tasks can be executed. In this second phase,
the scheduler is in charge of finding which VM will execute each task. As in
the first phase, the final results obtained will greatly depend on the degree of
optimization of this scheduling.
In this section, we explain both phases in detail and how they are implemented, and
we compare them with other scheduling techniques. The results obtained are shown in
section 4.2.
3.3.1. Power aware scheduling
There are many different techniques used to schedule tasks to resources. However,
classic scheduling algorithms like Min-Min, Max-Min and Sufferage focus on minimizing
the execution time of the overall processing of all tasks. In this project, we focus on
reducing the energy consumed rather than the time. Classic Min-Min and Max-Min have
been adapted to optimize energy instead of time, to serve as a comparison for the
scheduling developed based on Fuzzy Logic.
The WorkflowSim simulator already includes a power-aware scheduler for VMs. In the
Power package, there is a class called PowerVmAllocationPolicySimpleWattPerMipsMetric
which allocates the VMs to different Hosts, considering as metric the watts/MIPS ratio of
the different Hosts. So, this algorithm chooses the Host that needs the least amount of watts
per MIPS and allocates the VM there. This way, it minimizes the overall consumption of
the system, as each VM will consume as few watts as possible.
This class computes this metric for a certain VM on all Hosts. Then it chooses
the host with the lowest metric and executes the allocateHostForVm(Vm, choosenHost)
method. During the allocation, it calls the host.vmCreate(vm) method and allocates the
necessary resources, such as Storage, RAM and BW.
However, the simulator does not include a power-aware scheduler for tasks, leaving the
second scheduling phase to the classic Min-Min or Max-Min, chosen by the user. We want
to add to the simulator a task scheduler able to work with energy rather than just time, to
try to obtain better results and reduce the overall energy consumption as much as possible.
The next sections show both of these scheduling phases in more detail, describing the
parameters considered by the simulator in the different algorithms.
3.3.2. VM scheduling
This is the first scheduling step. In this stage, all the Virtual Machines (VMs)
requested by the different users are created. The scheduler decides which physical
machine hosts the creation of each VM. The scheduler's decisions are based on
different parameters that vary depending on whether power is considered or not.
In the non-power-aware case, the scheduler chooses the physical machine for a
certain VM based on which machine has more free resources. This way, VMs are
distributed among all the available machines so that no machine is over-utilized while the
rest are left idle.
This is a sound strategy, given that in DVFS the dynamic governors work with upper
and lower thresholds. Dividing the VMs among all available machines achieves an
even utilization level in all processors, which means that all physical machines should have
a similar utilization percentage, so their processors' frequency multipliers should all be
close to each other.
Since DVFS is used to reduce the processors' energy consumption, keeping a low
utilization on all machines results in energy savings on all of them, while ensuring that the
users' deadline established in the Service Level Agreement (SLA) is met, as the utilization
of the machines is intended to be kept well below 100%.
Either way, the scheduling of a VM and its assignment to a physical machine is
requested by a user and performed by the scheduler. So, upon receiving a user request for
VM creation, the scheduler decides which physical machine should receive the VM
creation order. For this decision, in addition to the idle part of the processor, the user
requests a certain amount of MIPS, which is the common way of measuring the
performance of a VM.
The method used in CloudSim's VmAllocationPolicySimple.java class is based on
the idle PEs, which are the Processing Elements available in each physical machine. This
class tries to ensure that no machine has all its PEs in use while the rest of the machines are
totally idle. A higher number of PEs in use means a higher CPU utilization on that
machine, and if that utilization surpasses the up_threshold, the power consumption of that
CPU rises, as the DVFS algorithm increases the voltage and frequency at which the CPU
works, increasing the energy consumed.
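The idea behind this policy can be sketched as follows. This is a simplified stand-in, not CloudSim's actual implementation; the array of free PEs is an illustrative representation of the hosts' state:

```java
// Hedged sketch of the idea behind CloudSim's VmAllocationPolicySimple:
// pick the host with the most free PEs so that load spreads evenly
// across machines instead of saturating one while others stay idle.
public class MostFreePesPolicy {
    public static int chooseHost(int[] freePes) {
        int best = 0;
        for (int i = 1; i < freePes.length; i++) {
            if (freePes[i] > freePes[best]) best = i; // most idle PEs wins
        }
        return best;
    }

    public static void main(String[] args) {
        int[] freePes = {2, 5, 3};             // free PEs per host (example)
        System.out.println(chooseHost(freePes)); // host 1 has the most free PEs
    }
}
```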
On the other hand, the scheduler may be intended to perform power-aware
scheduling in order to achieve the lowest possible energy consumption. In this second
scenario, the common algorithm, and the one used in WorkflowSim's
"PowerVmAllocationPolicySimpleWattPerMipsMetric.java" class, is based on looking for
the physical machine that can achieve the lowest energy consumption for the MIPS
required by the VM to be scheduled. This decision is based on an estimation of the energy
that would be consumed on each machine.
This algorithm can be implemented in different ways depending on the programmer.
The mentioned class defines a "watts per MIPS" metric. When the scheduler receives the
request for a VM, it estimates what the energy consumption of each physical machine
would be, based on the MIPS that the user requires for that particular VM.
Using the power model from section 3.1.1, the scheduler can estimate the energy
consumption based on the voltage and frequency of each of the machines and then request
the VM creation on the machine for which the estimate is lowest. As the static energy is a
fraction of the dynamic energy, a higher dynamic energy implies a higher static energy as
well. Although the static component is not estimated, being proportional to the dynamic
part, a machine with a higher estimated dynamic energy will also have a higher static part,
and therefore a greater total energy than the other machines. So the scheduler can rely on
this proportionality to estimate the overall consumption and decide which machine will
achieve the lowest.
3.3.3. Tasks scheduling
For the explanation of the energy estimation we rely on the energy model discussed
in [ 28 ]. Using an approach similar to the one explained in this document, they define the
energy consumption as the sum of a dynamic and a static component. Although they leave
the static component as a fraction of the dynamic part, the important part for understanding
this model is the dynamic component.
Defining this dynamic energy as proportional to the product of the voltage squared
and the total number of clock cycles of the CPU for that particular task, E_d = α·V²·N_cc,
if we take into consideration the dependence of the energy on the voltage alone we can
make an estimation of the energy consumption. Taking as a reference table 2 of [ 28 ], we
can show an example of how the scheduler takes this decision.
First of all, note that each task is defined by its length in MI and its deadline in
seconds, so the first step of this estimation is to determine which VMs can meet the
deadline. Each VM is defined by its performance, measured in MIPS. So, dividing the
task's length by the performance of each VM gives the time that each VM would take to
process that task:
t_max = length / performance, [MI / MIPS] = [MI / (MI/s)] = [s]
From this first step, the scheduler can discard the VMs whose processing time for the
scheduled task would surpass the user's deadline. Additionally, the scheduler can derive a
minimum value for the performance that the VMs must have:
perf_min = length / deadline, [MI / s] = [MIPS]
After this first step, the scheduler knows which VMs can process the task while meeting
the deadline. It then estimates which VM has the lowest energy consumption for that
task. So, for each VM, it calculates the dynamic energy based on the product of the
voltage squared and the time, i.e. taking into account the variable parts of the formula and
leaving out the constants.
Imagine a task that needs to be scheduled, with a length of 25000 MI and a deadline
of 5 s. Considering that there are 4 different VMs with the voltage and performance
parameters shown in Table 8, we can estimate a value proportional to the energy that
would be consumed in the execution of that task.
First of all, the task would need a minimum performance of perf_min = 25000 / 5 =
5000 MIPS to meet the user's deadline, so the first VM, having only 4000 MIPS, would
not be enough. Equivalently, the execution time for the first VM would be t_max =
25000 / 4000 = 6.25 s, which surpasses the 5 s deadline.
Having discarded the first VM, the rest of them exceed the minimum of 5000 MIPS, so all
of them could process the task within the time limit. Now the real question is: which of
these 3 would achieve the lowest energy consumption? In Table 8, power is approximated
as proportional to the voltage squared, time is obtained as the ratio between length and
performance, and the energy is calculated as the product of power and time. We can see
that the VM working at the lowest voltage achieves the lowest energy consumption, thanks
to the squared dependence on the voltage.
Voltage (V)   MIPS    Power (W)   Time (s)   Energy (J)
0.9           4000    0.81        6.25       (doesn't meet deadline)
1.1           6000    1.21        4.17       5.0457
1.3           8000    1.69        3.125      5.2813
1.5           10000   2.25        2.5        5.625
Table 8: Energy estimation example
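The deadline filter and the V²-based estimate of this example can be sketched as follows. The proportionality constant α is dropped, so the energy values are only relative, and the voltage/MIPS arrays are the example's values, not measured data:

```java
// Hedged sketch of the two-step decision: discard VMs that miss the
// deadline, then pick the lowest relative energy E ∝ V^2 · t.
public class EnergyEstimate {
    // Returns the index of the chosen VM, or -1 if none meets the deadline.
    public static int chooseVm(double taskMi, double deadlineS,
                               double[] volts, double[] mips) {
        int best = -1;
        double bestEnergy = Double.MAX_VALUE;
        for (int i = 0; i < mips.length; i++) {
            double time = taskMi / mips[i];        // t = length / performance
            if (time > deadlineS) continue;        // discard: misses deadline
            double energy = volts[i] * volts[i] * time; // E ∝ V^2 · t
            if (energy < bestEnergy) { bestEnergy = energy; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] volts = {0.9, 1.1, 1.3, 1.5};
        double[] mips  = {4000, 6000, 8000, 10000};
        // Task of 25000 MI with a 5 s deadline, as in Table 8.
        System.out.println(chooseVm(25000, 5, volts, mips)); // VM 1 (1.1 V)
    }
}
```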
3.3.4. Bag-of-tasks power aware scheduling
In this paper, Buyya et al. [ 28 ] define a scheduling method for a bag-of-tasks,
considering the scheduling of all tasks on the different VMs in order to obtain the lowest
possible power consumption. They consider the scheduling of bags of tasks, which they
call "jobs". Their energy model consists of the sum of the dynamic energy and the static
energy, considering the static part as a percentage of the dynamic part, following the power
model introduced in section 3.1.1.
When scheduling a task, a function checks which VM has the lowest energy
consumption. When checking this, it is important to note that several VMs may be able
to handle the task's execution, but the VM choice is made based on an estimation of
the energy consumption of the task on each VM.
Note that, as the power is related to the squared voltage and a reduction of the
MIPS implies a reduction of the voltage, the energy is reduced quadratically. So, when
selecting the VM for the task's execution, the lowest MIPS also means the lowest energy
consumption.
Each task is defined by its length, measured in Millions of Instructions (MI), and a
deadline, which is the maximum time that the execution of the task may take, fixed
by the Quality of Service (QoS) of the Service Level Agreement (SLA) between the user
and the cloud provider.
As an example, a task with a length of 25000 MI and a deadline of 5 s would need a
minimum performance of perf_min = 25000 / 5 = 5000 MIPS to meet the deadline
established in the SLA.
Considering E = P · t, with P proportional to V², Table 8 shows that in the
scheduling decision, the lowest MIPS that meets the deadline results in the lowest
energy consumption. Fewer MIPS means a longer time, but also a lower voltage. As power
is reduced with the square of the voltage, the energy product is reduced with the voltage
even though the time increases. So, the lowest MIPS that meets the deadline tends to give
the lowest energy consumption.
From this paper we learn that, by choosing the VM with the lowest MIPS, we tend to
obtain the lowest energy consumption due to the squared relation of the power with the
voltage. In the simulator there is no deadline parameter, but it is important to keep time in
mind, as the energy depends on it.
3.3.5. Classic schedulers adapted to power
As mentioned, we have adapted the Min-Min and Max-Min schedulers as a means of
comparison with the fuzzy logic schedulers. Here we analyze their behavior and problems.
We start by defining a sequential power-aware algorithm, which we have called
PowerAwareSeqSchedulingAlgorithm, in the package org.workflowsim.scheduling. Tasks
are scheduled in the same order they are received by the scheduler, hence the name
sequential. The algorithm finds the VM that achieves the lowest power consumption for
each task, in that sequential order, and assigns the execution to that VM. By default, only
one task can be executed on each VM at the same time, so VMs assigned to the first tasks
are kept busy until they finish executing them.
This introduces a problem. Consider two tasks of lengths l1 = 10000 MI and l2 =
1000 MI, and two VMs of performance p1 = 100 MIPS and p2 = 1000 MIPS. If task 1 is
received before task 2, the first task to be assigned is the longest one. It would be assigned
to the second virtual machine, leaving task 2 for the first VM. The parallel time of both
executions would be:
max(l1/p2, l2/p1) = max(10000/1000, 1000/100) = max(10, 10) = 10 s
However, if the second task is received before the first one:
max(l2/p2, l1/p1) = max(1000/1000, 10000/100) = max(1, 100) = 100 s
This shows the importance of sorting the received tasks before scheduling them, as big
tasks should be assigned to big VMs to avoid this situation.
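The sorting remedy can be sketched as follows. This is a hedged illustration of the idea, not the simulator's actual PowerAwareMaxMinSchedulingAlgorithm, and it uses makespan as the stand-in objective:

```java
import java.util.Arrays;

// Hedged sketch: pair the longest tasks with the fastest VMs, as the
// Max-Min adaptation does, and measure the resulting parallel time.
public class MaxMinSketch {
    public static double schedule(double[] taskMi, double[] vmMips) {
        double[] tasks = taskMi.clone();
        double[] vms   = vmMips.clone();
        Arrays.sort(tasks); // ascending length
        Arrays.sort(vms);   // ascending performance
        double makespan = 0;
        // longest task -> fastest VM, second longest -> second fastest, ...
        for (int i = 0; i < tasks.length; i++) {
            double t = tasks[tasks.length - 1 - i] / vms[vms.length - 1 - i];
            makespan = Math.max(makespan, t);
        }
        return makespan;
    }

    public static void main(String[] args) {
        double[] tasks = {1000, 10000};  // MI, from the example above
        double[] vms   = {100, 1000};    // MIPS
        System.out.println(schedule(tasks, vms)); // 10.0 s, not 100 s
    }
}
```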
This problem is shared with the Min-Min scheduler. The adaptation of this algorithm
has been named PowerAwareMinMinSchedulingAlgorithm and placed in the
org.workflowsim.scheduling package of the simulator. The second part of this scheduler acts
in the same way as the previous one, but differs in performing an initial sorting of the tasks
by length, from shortest to longest. This causes the shortest tasks to be assigned to the
biggest VMs, wasting resources and leading to the situation explained before.
The solution to this problem is found in Max-Min. Named
PowerAwareMaxMinSchedulingAlgorithm and stored in the org.workflowsim.scheduling
package, this adaptation sorts tasks by descending length, assigning the longest tasks first.
The problem described relates to time, but these schedulers work with power. As
each task consumes some power while being processed, the total sum also increases if the
scheduling is not done optimally. However, even though both objectives run into the same
problem, the same assignment configuration does not optimize both scenarios. A
configuration that optimizes time will not obtain the lowest power consumption in a
heterogeneous system, where machines with different types of processors are mixed.
3.3.6. Watts per MIPS scheduler for VMs
As introduced before, WorkflowSim already includes a scheduling algorithm that
allocates VMs to physical machines. This is the first phase mentioned, and this algorithm
considers power in the scheduling decision. Below, we introduce a short pseudo-code to
explain how it works.
Inside the class that defines the scheduler, there is a method called
allocateHostForVm, which is the one in charge of actually evaluating, deciding and
allocating the VM to a Host. It uses another method called MetricWattPerMips to help
estimate this metric on all Hosts. From the study of the algorithm, we get the following steps:
Selects the PowerHostList.
For each host:
Gets the number of Processing Elements (PEs).
For each PE:
1. Gets the IndexFrequency.
2. Gets the MIPS of that PE at the selected indexFreq.
3. Gets the power of the Host at the selected indexFreq considering maximum
utilization.
4. Gets the power of the Host at the selected indexFreq considering null utilization.
5. Gets the power of the current PE multiplying the maximum of the host by a ratio.
6. Gets the power of the current PE multiplying the minimum of the host by a ratio. The
ratio in both 5 and 6 is calculated by dividing the MIPS of the PE by the total MIPS
of the Host. If there is only 1 PE in the Host, both PE and Host powers will coincide.
7. Gets the utilization of the PE as a ratio of the total allocated MIPS and the MIPS of
this PE. At the beginning, when there are no VMs allocated in the current Host, the
total MIPS allocated is 0, making this ratio null, and so, the PE's utilization.
8. Gets the total MIPS of the Host, considering the indexFreq.
9. Gets the VMList in the current Host. For this list, calculates the total SumMaxMips
of all the VMs.
10. Finally, calculates the metric as the difference between the maximum and minimum
power of the PE multiplied by the ratio of the last SumMaxMips to the maximum
MIPS of the Host: (p_max − p_min) · (VMList_sumMaxMips / Host_mips)
The metric of the whole Host is the accumulated sum of the metrics of all its PEs.
When all PEs have been processed, the sum is divided by the number of PEs to obtain a
mean. After all the mean metrics are obtained, the algorithm chooses the minimum metric,
which corresponds to the machine with the lowest product of the utilization ratio and the
difference in power between maximum and minimum utilization.
To understand this metric we need to consider both parameters. If a host is at 0
utilization and the difference between PE_Pmax and PE_Pmin is low (understanding these
parameters as the power consumption at maximum and minimum utilization), then we can
be sure that allocating a VM to this Host would mean a small difference between the power
consumption before and after the allocation, so this Host would be a good candidate for the
VM allocation.
The second parameter to take into account is the current utilization, shown in the
equation as the ratio between the MIPS allocated and the total MIPS. If a host has no MIPS
allocated, it is completely idle, so allocating the VM to this host would be far from raising
its utilization to 100%. Again, we search for the Host with the minimum of this ratio, as we
try to avoid leaving Hosts idle while over-utilizing others.
So this metric is composed of these two parameters, and their multiplication gives
the final metric used in this algorithm.
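A simplified sketch of the per-PE metric of step 10 follows. The names and numbers are illustrative (the power values are borrowed from Table 7), not WorkflowSim's actual fields:

```java
// Hedged sketch of the step-10 metric: the host's (Pmax - Pmin) power
// swing multiplied by how loaded the host already is. Lower is better.
public class WattPerMipsMetric {
    public static double hostMetric(double peFullPower, double peIdlePower,
                                    double allocatedMips, double hostMips) {
        double powerSwing = peFullPower - peIdlePower; // first factor
        double loadRatio  = allocatedMips / hostMips;  // second factor
        return powerSwing * loadRatio;
    }

    public static void main(String[] args) {
        // An idle host (0 MIPS allocated) gets metric 0: best candidate.
        System.out.println(hostMetric(103.0, 83.25, 0, 1500));   // 0.0
        // A half-loaded host gets a higher (worse) metric.
        System.out.println(hostMetric(103.0, 83.25, 750, 1500)); // 9.875
    }
}
```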
Once the minimum metric is found, the corresponding Host is selected as the allocation
target, and the allocation of the VM to this Host proceeds.
Using this method, the scheduler is able to find a good allocation of the VMList to
the HostList. In summary, it takes each VM in sequential order and finds the host that
achieves the minimum amount of watts consumed for the MIPS performance of that VM.
Although this is a good scheduler, it suffers from the same problem as the
PowerAwareSeqSchedulingAlgorithm introduced in section 3.3.5: the host with the best
"watts per MIPS" ratio is used for the first VM in the VMList, and this VM may not be the
biggest one. So the host with the best ratio is wasted instead of being used for the VM with
the highest MIPS value, which would reduce the power consumed by that VM; this is a
similar problem to the one in the previous section.
As this scheduler is included in WorkflowSim, it provides a good method of
comparison with the fuzzy scheduler developed.
3.3.7. Fuzzy integration in WorkflowSimDVFS
Apart from the classical time scheduling algorithms, the addition of power models to
this simulator allows us to include power-aware scheduling algorithms with which we can
optimize the overall energy consumed in the different machines that compose the Datacenter.
However, classical power-aware algorithms tend to focus on optimization based on a
single calculated parameter. This is the case of Min-Min, which chooses the VM that can
process a Task in the shortest time without considering other important parameters such as
the utilization of the machine. Power-aware scheduling algorithms, in turn, tend to focus
on selecting the VM that can achieve the lowest energy consumption for the execution of a
certain Task, through an estimation of the energy that would be consumed on each of the
available VMs. This estimation is based on the power consumed at the current utilization
and frequency levels and on the time it would take to fully process the Task, obtained as
the ratio of the Task's length to the performance of the VM. While values like utilization,
MIPS and time are present in the energy calculation itself, they are not considered in the
final decision of which VM should process each Task; only the final energy value
calculated is.
There is an alternative that can make this decision based on multiple parameters: an
FRBS, in which the system considers multiple input parameters, called antecedents, and an
output parameter, called the consequent. This way, utilization, time, power, energy, MIPS
and any other relevant value can all take part in the decision. The consequent of the system
provides a degree of suitability of the current VM to process the Task: the higher the
consequent, the more suitable the VM. The FRBS is evaluated for the Task against each
available VM, and the VM with the highest consequent is selected to process it. The process
is repeated for every Task pending to be scheduled.
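The selection loop just described can be sketched as below. The FRBS itself is treated as a black box (a stand-in lambda replaces the real jFuzzyLogic inference), and the `Candidate` record and its fields are illustrative names, not the thesis' code.

```java
import java.util.function.ToDoubleFunction;

// Illustrative sketch of the FRBS-based selection: the rule system maps a
// VM's (normalized) antecedents to one defuzzified consequent, and the VM
// with the highest consequent wins. Names here are assumptions.
public class FrbsSelection {

    /** One candidate VM's antecedent vector (utilization, time, power, ...). */
    record Candidate(double[] antecedents) {}

    /** Returns the index of the candidate whose consequent is highest. */
    static int select(Candidate[] vms, ToDoubleFunction<double[]> frbs) {
        int best = 0;
        double bestScore = frbs.applyAsDouble(vms[0].antecedents());
        for (int i = 1; i < vms.length; i++) {
            double score = frbs.applyAsDouble(vms[i].antecedents());
            if (score > bestScore) { bestScore = score; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        Candidate[] vms = {
            new Candidate(new double[]{0.9, 0.8}),  // busy, slow
            new Candidate(new double[]{0.2, 0.3}),  // idle, fast
        };
        // Stand-in for the real jFuzzyLogic inference: lower inputs -> higher suitability.
        ToDoubleFunction<double[]> frbs = a -> 1.0 - (a[0] + a[1]) / 2.0;
        System.out.println(select(vms, frbs)); // 1 (the idle VM)
    }
}
```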
The process just explained corresponds to the second scheduling stage in
WorkflowSimDVFS. Tasks are processed in VMs, but the VMs must have been created
beforehand. The two stages involved in the scheduling of the system are:
1. VM scheduling: a number of VMs have to be created on the physical machines that
compose the Datacenter. This first stage decides which machine is the most suitable to
allocate each VM. Useful parameters are the utilization of each machine, the MIPS
requested and the availability, among others.
2. Task scheduling: a number of Tasks have to be executed on the VMs. This second stage
decides which VM is the most suitable to process each Task. Useful parameters are the
task's length, the execution time and the power consumed, among others.
An FRBS needs a number of antecedents and a consequent to work. The consequent
is naturally used to decide the machine on which to schedule the VM or task. As antecedents,
several parameters are considered to help evaluate the resources. In the simulator, two FRBS
are developed as schedulers, one for VM allocation and one for task allocation. The
antecedents used in each of them are different, as each needs parameters that take part in its
own scheduling decision, and those parameters differ between VMs and tasks.
In the simulator, we use jFuzzyLogic [39][40] to fuzzify the antecedents, hold the
rule base and defuzzify the consequent, so the decision is based both on the input values
taken from the simulator and on the stored rules. The rules are not chosen at random: the
final result depends on them, and random rules could yield a very poor scheduler. Instead,
we use Matlab to run a system based on the Pittsburgh approach that searches for a rule
configuration achieving the best possible result for both FRBS. The integration with Matlab
is explained in a later section.
Once the Matlab program has obtained the rule base that optimizes the energy
consumption, the rules are sent to the simulator, where the schedulers use them. The
simulator therefore needs an object to store the rule base until it is used, as well as a method
for obtaining each rule and passing it on to the jFuzzyLogic class, so that the FRBS has the
rules it needs to work properly. For this purpose, we have defined a number of classes in a
package named org.workflowsim.fuzzy. The classes added to WorkflowSimDVFS are:
- FuzzyVariable: expresses an antecedent or a consequent. Contains a name and the weight
of the selected membership function.
- FuzzyRule: expresses a group of antecedents and one consequent. Contains a list of
FuzzyVariables as the antecedent list and another FuzzyVariable as the consequent.
- FuzzyEntity: expresses the rule base. Contains the list of FuzzyRules that have to be
evaluated to obtain an output.
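A minimal sketch of these three storage classes is given below. The field choices (a name plus a numeric membership weight) follow the descriptions above, but the exact layout of the thesis' actual classes is not shown in the text, so this is an assumption.

```java
import java.util.List;

// Minimal sketch of the three rule-storage classes described above;
// the field layout is illustrative, not the thesis' exact code.
public class FuzzyModel {
    /** An antecedent or consequent: a variable name plus the selected MF's weight. */
    record FuzzyVariable(String name, double weight) {}

    /** One rule: a list of antecedents implying a single consequent. */
    record FuzzyRule(List<FuzzyVariable> antecedents, FuzzyVariable consequent) {}

    /** The whole rule base as received from Matlab. */
    record FuzzyEntity(List<FuzzyRule> rules) {}

    public static void main(String[] args) {
        FuzzyRule r = new FuzzyRule(
            List.of(new FuzzyVariable("utilization", 0.2),
                    new FuzzyVariable("power", 0.3)),
            new FuzzyVariable("selection", 0.9));
        FuzzyEntity base = new FuzzyEntity(List.of(r));
        System.out.println(base.rules().size()); // 1
    }
}
```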
Two rule bases are sent from Matlab to the simulator, one containing the rules for the
VM scheduler and another for the task scheduler. Each rule base received is stored in a
different object, as they are used by different schedulers. The parameters taken as
antecedents are explained in a separate section for each FRBS.
3.3.8. VM scheduling FRBS
We have created a package named “vms” inside the package
“org.workflowsim.fuzzy”, containing two classes. WorkflowSimVmsFUZZY is the class
that communicates with Matlab to receive the rule base and store it in an object of the class
FuzzyEntity. As the rule base is encoded in JSON format, this class is in charge of decoding
the data before storing it. For this scheduler we have considered 4 variables:
MIPS requested by the VM.
Total MIPS of the host (physical machine).
Utilization of the host.
Power consumed in the host.
These parameters need to be normalized so that their values do not fall outside the
range of the Membership Functions (MF). The class GeneralParametersVmsFUZZY is a
static class that stores the values necessary for the normalization. These values are:
MIPS
Minimum MIPS requested by VMs.
Maximum MIPS requested by VMs.
Total MIPS
Minimum MIPS of all hosts.
Maximum MIPS of all hosts.
Utilization
Minimum = 0.
Maximum = 1.
Power
Minimum power of all hosts.
Maximum power of all hosts.
The formula used to normalize is as follows:
normalized value = (parameter − min value) / (max value − min value) · max range of MF
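As a small worked example of this formula (the helper name and the sample figures are ours):

```java
// The normalization formula above as a helper. mfMaxRange is the upper
// bound of the membership function's universe of discourse (e.g. 1.0).
public class Normalizer {
    static double normalize(double value, double min, double max, double mfMaxRange) {
        return (value - min) / (max - min) * mfMaxRange;
    }

    public static void main(String[] args) {
        // A host offering 2000 MIPS, when hosts span 1000..3000 MIPS,
        // maps to the middle of a [0, 1] membership-function range.
        System.out.println(normalize(2000, 1000, 3000, 1.0)); // 0.5
    }
}
```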
3.3.9. Tasks scheduler FRBS
Similarly to the VMs, a package named “tasks” is created inside the package
“org.workflowsim.fuzzy”. The class WorkflowSimTasksFUZZY also communicates with
Matlab and stores the received rule base in a FuzzyEntity object, decoding the JSON data
first. In this case, 5 parameters are considered as antecedents:
MIPS of the VM.
Power consumed by the VM's host.
Length of the task.
Time spent on processing.
Energy consumed in the total execution.
Likewise, the class GeneralParametersTasksFuzzy stores the values needed to
normalize these antecedents:
MIPS
MinMIPS of all VMs.
MaxMIPS of all VMs.
Power
MinPower of all the hosts.
MaxPower of all the hosts.
Length
MinDaxLength (minimum length from all tasks within the DAG file).
MaxDaxLength.
Time
Minimum time = MinDaxLength / MaxMIPS.
Maximum time = MaxDaxLength / MinMIPS.
Energy
Minimum energy = MinPower · MinTime.
Maximum energy = MaxPower · MaxTime.
The same formula is used to normalize the parameters as in the VMs scheduler.
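The derived bounds listed above follow from the primary ones by pairing the most and least favourable MIPS/power values, as this sketch shows (the class name and the numbers in `main` are illustrative):

```java
// Sketch of the derived normalization bounds for the task scheduler:
// time and energy extremes come from combining the extreme MIPS, power
// and task-length values. Numbers below are illustrative.
public class TaskBounds {
    final double minTime, maxTime, minEnergy, maxEnergy;

    TaskBounds(double minMips, double maxMips, double minPower, double maxPower,
               double minDaxLength, double maxDaxLength) {
        minTime = minDaxLength / maxMips;   // shortest task on the fastest VM
        maxTime = maxDaxLength / minMips;   // longest task on the slowest VM
        minEnergy = minPower * minTime;
        maxEnergy = maxPower * maxTime;
    }

    public static void main(String[] args) {
        TaskBounds b = new TaskBounds(1000, 2000, 50, 100, 10000, 40000);
        System.out.println(b.minTime + " " + b.maxTime); // 5.0 40.0
    }
}
```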
3.3.10. Power model analytical
The initial environment found in the basic simulation class provides a homogeneous
scenario: all hosts are identical, with the same capabilities, and all VMs are identical too,
with the same amount of MIPS. This initial environment is not a good scenario for
scheduling. When deciding which VM will execute a certain task, there is little to choose
between identical VMs: no matter which VM the algorithm picks, the task will take the same
time to finish. Likewise, no matter which host holds the VM, since all hosts are the same,
the power and energy consumption will be identical. This problem therefore has to be solved.
Going in reverse order, we begin with the second stage, task scheduling. In a real
scenario, a group of VMs with different MIPS performance will be found. Modifying this is
rather simple, as the amount of MIPS a VM will have is configurable by the user. A simple
algorithm gives us a group of different VMs. The following pseudocode explains how it
works.
begin
set maxMips;
set minMips;
set vmNum;
set downScale = (maxMips - minMips) / vmNum;
for (i = 0; i < vmNum; i++)
    set mips[i] = maxMips - (vmNum - i) * downScale;
end for
end
With this, the VM group spans a range of MIPS from a minimum to a maximum
value, adapting to the number of VMs available.
Done this, we now have a heterogeneous scenario in which to schedule our tasks: we
can choose which VM will process each one, with different processing times per VM, since
a task of fixed length takes a different time to execute on VMs with different MIPS
performance.
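The pseudocode above translates directly into runnable code. One detail worth noting (visible when it is executed) is that with this formula the first VM gets exactly minMips while the last one gets maxMips − downScale, so the maximum itself is approached but never assigned:

```java
// Direct, runnable translation of the MIPS-spread pseudocode above.
public class VmMipsSpread {
    static double[] spread(double maxMips, double minMips, int vmNum) {
        double downScale = (maxMips - minMips) / vmNum;
        double[] mips = new double[vmNum];
        for (int i = 0; i < vmNum; i++) {
            mips[i] = maxMips - (vmNum - i) * downScale;
        }
        return mips;
    }

    public static void main(String[] args) {
        double[] m = spread(1500, 500, 4);
        System.out.println(java.util.Arrays.toString(m)); // [500.0, 750.0, 1000.0, 1250.0]
    }
}
```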
However, moving on to the first stage, we have the problem of the identical hosts.
The first scheduler encounters a similar problem, but with a more difficult solution.
The power model from WorkflowSim models a single type of processor, with two arrays of
power values for null and full utilization, as explained. With a single processor type, the
power consumption cannot be varied, hence the homogeneous scenario of identical hosts
with the same power consumption. Obtaining a group of hosts with different power requires
another, more configurable power model.
Therefore, we have designed a power model named “PowerModelAnalytical”, placed
in the “org.cloudbus.cloudsim.power.model” package. As its name indicates, it uses an
analytical model: multiple parameters are used to calculate two basic values, the power
consumption and the MIPS performance of the host. For this purpose, we use two main
formulas: the dynamic power equation already explained, and another equation to calculate
the MIPS performance of the host. Below we show them and indicate which parameters are
needed for each calculation.
Pd = kd · C · V² · f ( 9 )
This calculates the maximum dynamic power consumed by the processor. Its
parameters express maximum values.
kd: dynamic constant, related to the number of active gates of the processor.
C: capacitance of the processor.
V: maximum voltage of the processor.
f: maximum frequency of the processor.
P = Pd / (1 − ks) ( 10 )
This gets the total power consumed, the sum of the dynamic and static components.
ks: static constant, the fraction of the total power not consumed by the processor.
MIPS = f · IPC / 10⁶ ( 11 )
This obtains the performance of the processor from the frequency and the Instructions
per Cycle (IPC). The IPC depends on the processor, and by indicating it we can correctly
model the MIPS performance of the processor.
f: maximum frequency of the processor.
IPC: Instructions per Cycle.
An additional array named percentages, taken from the main file, is passed in. This
array contains the different frequency multipliers, expressed as percentages of the maximum
frequency. Using it, we obtain the power consumption and MIPS for each frequency value.
This adapts the power model to the simulator while keeping the rest of it working with the
null and full power arrays, so nothing outside this file is modified.
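Equations (9)–(11) applied at each multiplier can be sketched as below. The class layout is illustrative (not the simulator's PowerModelAnalytical code); the constants and the host-1 parameters are the ones quoted later in section 4.2.1, under which the model conveniently yields 1500 MIPS, the value used in the DVFS scenario.

```java
// Sketch of the PowerModelAnalytical calculations: equations (9)-(11)
// evaluated at each frequency multiplier. Class layout is illustrative.
public class AnalyticalPower {
    /** Eq. (9): dynamic power Pd = kd * C * V^2 * f. */
    static double dynamicPower(double kd, double c, double v, double f) {
        return kd * c * v * v * f;
    }

    /** Eq. (10): total power P = Pd / (1 - ks). */
    static double totalPower(double pd, double ks) {
        return pd / (1.0 - ks);
    }

    /** Eq. (11): MIPS = f * IPC / 1e6. */
    static double mips(double f, double ipc) {
        return f * ipc / 1e6;
    }

    public static void main(String[] args) {
        double kd = 0.5, ks = 0.7, c = 0.5e-8, ipc = 0.5; // host 1, section 4.2.1
        double fMax = 3e9, vMax = 3.0;
        for (double pct : new double[]{59.925, 69.93, 79.89, 89.89, 100.0}) {
            double f = fMax * pct / 100.0, v = vMax * pct / 100.0;
            double p = totalPower(dynamicPower(kd, c, v, f), ks);
            System.out.printf("%.3f%% -> %.1f W, %.0f MIPS%n", pct, p, mips(f, ipc));
        }
    }
}
```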
Now we are able to obtain a group of hosts with different maximum MIPS and power
consumption, values that were fixed in the previous power model. With this, we have a
heterogeneous scenario for both VM and task scheduling, and we are able to test our fuzzy
schedulers.
3.3.11. Integration with Matlab
The next step is to use Matlab to obtain the rule base for each scheduler. To integrate
the simulator with Matlab, it is exported into a jar file. This file is used from Matlab to
evaluate the rules produced by the Pittsburgh algorithm [ 41 ]. They communicate by
encoding the rule base in JSON format and passing it to the simulator jar file as a
command-line argument. The main class of the simulator then receives the arguments and
uses the fuzzy classes explained earlier to decode and store the rules.
The Pittsburgh approach is widely known and used. Here we use pseudocode to
explain the important parameters involved and how the algorithm uses them to obtain the
final rule base. The antecedents have three membership functions each, while the consequent
has five of them to provide a more precise output. Note that each member of the population
is a complete rule base, and the evaluation consists in running the simulator with each of
them. The typical Pittsburgh algorithm proceeds as follows:
initialize population;
evaluate population, get fitness;
sort by fitness;
loop
    1) select parents for crossover; avoid pure elitism by
       making the selection probability depend on fitness;
    2) cross the parents to obtain children rule bases;
    3) mutate the children to improve exploration;
    4) evaluate the children, join them with the parents
       and sort by fitness;
end loop
At the end of each loop iteration, the joint array of parents and children is sorted, so at
the end of the execution the first position of the array contains the rule base that achieves the
lowest energy consumption.
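A compact, runnable skeleton of this loop is sketched below. It is not the thesis' Matlab implementation: each individual is reduced to a flat array of membership-function indices, the simulator run is replaced by a fitness lambda, and for brevity it uses a steady-state variant (each generation a child replaces the worst individual) rather than a full generational join.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Skeleton of the Pittsburgh loop above. Each individual is a whole rule
// base (here just an int[] of MF indices, 0..4 for the five consequent
// MFs); the fitness lambda stands in for a simulator run returning energy.
public class Pittsburgh {
    static int[][] evolve(int popSize, int ruleLen, int generations,
                          ToDoubleFunction<int[]> energy, Random rnd) {
        // initialize and evaluate the population
        int[][] pop = new int[popSize][ruleLen];
        for (int[] ind : pop)
            for (int g = 0; g < ruleLen; g++) ind[g] = rnd.nextInt(5);
        sortByFitness(pop, energy);
        for (int gen = 0; gen < generations; gen++) {
            // 1) rank-biased parent selection (better ranks favoured, never only the elite)
            int[] p1 = pop[rnd.nextInt(popSize / 2)];
            int[] p2 = pop[rnd.nextInt(popSize)];
            // 2) one-point crossover
            int cut = 1 + rnd.nextInt(ruleLen - 1);
            int[] child = new int[ruleLen];
            System.arraycopy(p1, 0, child, 0, cut);
            System.arraycopy(p2, cut, child, cut, ruleLen - cut);
            // 3) mutation, to improve exploration
            child[rnd.nextInt(ruleLen)] = rnd.nextInt(5);
            // 4) evaluate: reinsert replacing the worst, then re-sort
            pop[popSize - 1] = child;
            sortByFitness(pop, energy);
        }
        return pop; // pop[0] is the best rule base found
    }

    static void sortByFitness(int[][] pop, ToDoubleFunction<int[]> energy) {
        Arrays.sort(pop, Comparator.comparingDouble(energy::applyAsDouble));
    }

    public static void main(String[] args) {
        // Toy fitness: "energy" is the sum of genes, so the optimum is all zeros.
        ToDoubleFunction<int[]> energy = ind -> Arrays.stream(ind).sum();
        int[][] pop = evolve(10, 8, 500, energy, new Random(42));
        System.out.println(energy.applyAsDouble(pop[0]));
    }
}
```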
Below we introduce the Matlab files that run the Pittsburgh algorithm, obtain the rule
base and send it to the simulator.
- Pittsburgh algorithm: the main file to execute. There are two of them, one for VMs and
another for tasks.
- WorkflowSimDVFS: called by the Pittsburgh algorithm when evaluating the population.
This file indicates the jar file containing the simulator, so the path has to be adapted to
the actual path on the computer. It builds the command line that calls the simulator,
sending the rule base as an argument.
- encode: takes the rule base matrix as an input parameter and encodes it in JSON format.
The output string is used in the command line built by the previous file.
- executeWorkflowSimDVFS: takes the command line and executes it as a system
command. The simulator is called here and its result is returned.
- decode: takes the result from the simulator, which contains the resulting execution time
and energy consumption separated by a semicolon (;). The main file containing the
Pittsburgh algorithm uses the second value, as the fitness considered is the energy
consumption.
- orderSelection: to avoid elitism in the Pittsburgh algorithm, the selection probability of
the parents is based on their fitness. This function is used after the parents have been
selected and decides the order of crossing between them, following a linear order.
These files are divided into two folders, separating the task and VM algorithms. Using
both Matlab programs, we obtain learning algorithms for both schedulers. These algorithms
are used to find an optimum rule set for each scheduler, allowing the simulator to reach an
optimum result in terms of energy saving.
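To illustrate the hand-off, the following sketch serializes a rule matrix (one row per rule, one column per variable's MF index) into a JSON string that could be passed to the simulator jar as an argument. The JSON field names and the command shown in the comment are assumptions for illustration; the thesis does not specify the exact message format.

```java
// Illustrative sketch of the Matlab->simulator hand-off: a rule matrix is
// serialized to JSON and passed as a command-line argument to the jar.
// Field names here are assumptions, not the thesis' actual format.
public class RuleBaseJson {
    static String encode(int[][] rules) {
        StringBuilder sb = new StringBuilder("{\"rules\":[");
        for (int r = 0; r < rules.length; r++) {
            if (r > 0) sb.append(',');
            sb.append('[');
            for (int c = 0; c < rules[r].length; c++) {
                if (c > 0) sb.append(',');
                sb.append(rules[r][c]);
            }
            sb.append(']');
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        String json = encode(new int[][]{{0, 2, 1}, {1, 1, 4}});
        System.out.println(json); // {"rules":[[0,2,1],[1,1,4]]}
        // The Matlab side would then run something like (hypothetical command):
        //   java -jar WorkflowSimDVFS.jar '<json>'
        // and parse the "time;energy" pair back from the process output.
    }
}
```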
4. Results and discussion
In this section we show all the results obtained in the two stages of this project. To
understand the full procedure behind these results, please read section 3 of this document.
We also analyze the results and compare the different techniques that produce them.
4.1. DVFS results
This section shows the results obtained from the experiments run in the simulator using
the DVFS algorithm. We divide these results in two parts. The first part shows the savings
in time, power and energy obtained by using this algorithm. The second part analyzes the
evolution of three other parameters of the simulator considered in the DVFS algorithm,
namely utilization, multiplier and power.
4.1.1. DVFS savings
Here we show the results obtained from applying different governors with the DVFS
technique. The topology used in the basic main file comprises 20 hosts, each consisting of 1
PE of 1500 MIPS. On these machines, 20 VMs are created, each with a performance of 1000
MIPS. The scenario is left as the default one, with all machines and VMs having the same
performance. This is not a suitable scenario for optimizing the schedulers, as there is little
choice between machines with identical resources. However, as the aim here is to show the
functioning of the DVFS algorithm, the only important parameters are the maximum MIPS
and the different frequency multiplier values. It is important to note that the maximum MIPS
performance cannot be set to an arbitrary value if we want to observe the algorithm's
behavior. With a value too high, even with the multiplier scaled down to its minimum the
utilization would never surpass the threshold, and the system would simply remain at the
lowest frequency saving energy, hiding the DVFS behavior. If, on the contrary, the
maximum MIPS is set too low with respect to the MIPS of the VM, the multiplier could
never be scaled down, as the utilization would be too close to the threshold even at the
maximum multiplier. With the values indicated above, we obtain a balance of these
parameters that allows us to check the DVFS algorithm's behavior.
The simulator has been tested using twelve different workflows of three types, named
Montage, Sipht and Inspiral. There are four workflows of each type, with an increasing
number of tasks to compute. Each of the twelve scenarios is simulated
using the governors Performance, PowerSave, OnDemand and Conservative, leaving their
parameters at default values. The UserSpace governor has not been included in the
comparison, as its behavior depends on the specified frequency and on the number of
multipliers available in each CPU. For each governor, four parameters have been taken: time,
overall power consumption, average power consumption and energy consumption. For each
of these parameters, we have computed the saving percentage of the PowerSave, OnDemand
and Conservative governors with respect to the Performance governor, which represents the
maximum consumption without delaying the total simulation time.
Simulation times are shown in Table 9. As expected, the time needed for each
workflow type rises with the number of tasks. As can be seen, the OnDemand and
Conservative governors do not delay the total simulation time, as they scale the frequency
up to its maximum when the up_threshold is surpassed. PowerSave, however, reaches 100%
of the CPU's utilization at the base frequency and remains there, computing less data per
unit time than at a higher frequency. This causes a delay of tasks that would not occur if the
frequency were scaled up.
Time summary (s)
Governors % savings
DAX Perform PowSv OnDem Cons PowSv OnDem Cons
Mont_25 57,72 63,03 57,72 57,73 -9,20 0,00 -0,02
Mont_50 83,14 90,68 83,14 83,15 -9,07 0,00 -0,01
Mont_100 125,39 137,05 125,39 125,40 -9,30 0,00 -0,01
Mont_1000 1079,27 1182,19 1079,27 1079,27 -9,54 0,00 0,00
Sipht_30 4448,43 4950,44 4448,43 4448,44 -11,29 0,00 0,00
Sipht_60 4681,28 5209,60 4681,28 4681,28 -11,29 0,00 0,00
Sipht_100 4519,47 5029,53 4519,47 4519,47 -11,29 0,00 0,00
Sipht_1000 11363,74 12648,20 11363,74 11363,74 -11,30 0,00 0,00
Inspiral_30 1344,36 1496,35 1344,36 1344,36 -11,31 0,00 0,00
Inspiral_50 1420,24 1580,85 1420,24 1420,24 -11,31 0,00 0,00
Inspiral_100 1592,32 1771,44 1592,32 1592,32 -11,25 0,00 0,00
Inspiral_1000 11888,48 13229,81 11888,48 11888,48 -11,28 0,00 0,00
Table 9: Time summary (s)
Figure 12 shows more clearly than the table how the time needed tends to be higher
with the PowerSave governor. This difference grows with the time needed, so it is larger for
Sipht_1000 and Inspiral_1000.
Figure 12: Time summary (s)
Figure 13 shows this difference better, expressed as a negative time saving. The more
complex workflows, Sipht and Inspiral, have a bigger time difference and hence a larger
negative saving for this governor.
Figure 13: Time savings (%)
In Table 10 we can see the overall power needed for the Datacenter to process all of
the workflow's tasks.
Overall power summary (W)
Governors % savings
DAX Perform PowSv OnDem Cons PowSv OnDem Cons
Mont_25 1,11E+05 1,12E+05 1,10E+05 1,10E+05 -0,53 1,24 1,22
Mont_50 1,60E+05 1,61E+05 1,59E+05 1,58E+05 -0,42 0,75 1,22
Mont_100 2,42E+05 2,43E+05 2,38E+05 2,39E+05 -0,63 1,36 1,22
Mont_1000 2,08E+06 2,10E+06 2,06E+06 2,06E+06 -0,85 1,06 1,23
Sipht_30 8,58E+06 8,79E+06 8,47E+06 8,47E+06 -2,46 1,28 1,23
Sipht_60 9,03E+06 9,25E+06 8,91E+06 8,92E+06 -2,46 1,29 1,23
Sipht_100 8,72E+06 8,93E+06 8,60E+06 8,61E+06 -2,46 1,28 1,23
Sipht_1000 2,19E+07 2,25E+07 2,16E+07 2,16E+07 -2,47 1,28 1,23
Inspiral_30 2,59E+06 2,66E+06 2,56E+06 2,56E+06 -2,48 1,29 1,23
Inspiral_50 2,74E+06 2,81E+06 2,70E+06 2,70E+06 -2,48 1,28 1,23
Inspiral_100 3,07E+06 3,14E+06 3,03E+06 3,03E+06 -2,42 1,28 1,23
Inspiral_1000 2,29E+07 2,35E+07 2,26E+07 2,26E+07 -2,46 1,29 1,23
Table 10: Overall power summary (W)
Again, as in the time case, Figure 14 shows this comparison with a similar shape. The
overall power needed grows in more complex scenarios with more tasks to compute. It also
shows that, as the overall power accumulates over the whole simulation, the longer the
simulation, the larger the power consumption.
Figure 14: Overall power summary (W)
Figure 15 shows a saving similar to that of time, since this overall power depends on
time. This means that the PowerSave governor, as it takes longer to finish the full execution,
needs more overall power.
Figure 15: Overall power savings (%)
If we calculate the average power needed for the execution, normalizing by time, then
Table 11 confirms this dependence of overall power on time. Once the power is normalized
by the time needed, there are no major changes in these values.
Avg power summary (W)
Governors % savings
DAX Perform PowSv OnDem Cons PowSv OnDem Cons
Mont_25 19,28 17,75 19,04 19,05 7,93 1,24 1,23
Mont_50 19,28 17,75 19,14 19,05 7,93 0,75 1,23
Mont_100 19,28 17,75 19,02 19,05 7,93 1,36 1,23
Mont_1000 19,28 17,75 19,08 19,05 7,93 1,06 1,23
Sipht_30 19,28 17,75 19,04 19,05 7,93 1,28 1,23
Sipht_60 19,28 17,75 19,04 19,05 7,93 1,29 1,23
Sipht_100 19,28 17,75 19,04 19,05 7,93 1,28 1,23
Sipht_1000 19,28 17,75 19,04 19,05 7,93 1,28 1,23
Inspiral_30 19,28 17,75 19,03 19,05 7,93 1,29 1,23
Inspiral_50 19,28 17,75 19,04 19,05 7,93 1,28 1,23
Inspiral_100 19,28 17,75 19,04 19,05 7,93 1,28 1,23
Inspiral_1000 19,28 17,75 19,04 19,05 7,93 1,29 1,23
Table 11: Avg power summary (W)
Looking at Figure 16, the governor that needs the least average power is
PowerSave.
Figure 16: Avg power summary (W)
Figure 17 also shows that the largest savings correspond to this governor. While the
OnDemand and Conservative governors only achieve average power savings of about 1.2%,
PowerSave reaches around 8%.
Figure 17: Avg power savings (%)
However, average power does not account for time. When we calculate the energy
consumed in Table 12, knowing that E = P · t, a longer time means a larger energy.
Energy summary (Wh)
Governors % savings
DAX Perform PowSv OnDem Cons PowSv OnDem Cons
Mont_25 30,92 31,08 30,53 30,54 -0,53 1,24 1,22
Mont_50 44,53 44,72 44,20 43,99 -0,42 0,75 1,22
Mont_100 67,17 67,59 66,25 66,34 -0,63 1,36 1,22
Mont_1000 578,10 583,01 571,95 570,98 -0,85 1,06 1,23
Sipht_30 2382,79 2441,35 2352,31 2353,45 -2,46 1,28 1,23
Sipht_60 2507,52 2569,16 2475,29 2476,63 -2,46 1,29 1,23
Sipht_100 2420,84 2480,36 2389,79 2391,02 -2,46 1,28 1,23
Sipht_1000 6086,96 6237,57 6008,76 6011,99 -2,47 1,28 1,23
Inspiral_30 720,10 737,94 710,80 711,23 -2,48 1,29 1,23
Inspiral_50 760,75 779,61 751,04 751,37 -2,48 1,28 1,23
Inspiral_100 852,92 873,60 842,04 842,41 -2,42 1,28 1,23
Inspiral_1000 6368,04 6524,40 6286,17 6289,60 -2,46 1,29 1,23
Table 12: Energy summary (Wh)
Figure 18 once more shows a shape similar to that of Figures 12 and 13.
Figure 18: Energy summary (Wh)
This can be seen more easily in Figure 19: using the PowerSave governor means a
greater energy consumption than using either of the two threshold-dependent governors.
Figure 19: Energy savings (%)
After this comparison, we can see that using fixed-frequency governors is not the best
option. Either of the two threshold-dependent governors behaves dynamically: it adjusts the
frequency to a higher value when needed, avoiding any delay of the total time, but scales it
down when possible to save energy. At first sight, the PowerSave governor could seem the
best option for energy saving, as it restricts the frequency to the lowest multiplier. It is indeed
the governor that achieves the lowest average power, but all experiments take a certain
amount of time, which means that energy matters more than power. As PowerSave delays
the execution, its product of power and time exceeds the energy consumption of the dynamic
governors.
4.1.2. DVFS parameters evolution
We now analyze the evolution of three parameters of the simulator: the utilization of
the machine, the value of the frequency multiplier and the power consumption, all three as
a function of time. The purpose of showing these parameters is to justify the proper
functioning of the DVFS algorithm. To keep the amount of data in the graphs manageable,
we have used the basic Montage_25 workflow. A static governor makes no adjustments to
the frequency multiplier, so the power would only vary with the utilization of the machine;
to show the frequency adjustment we have to use a dynamic governor. In this example, the
simulation uses the on_demand governor with its two parameters at the default
configuration: the threshold set to 95% and the sampling down factor set to 100 iterations.
The algorithm works as follows. There is a single threshold, which acts as both
up_threshold and down_threshold; however, to act as a down_threshold it must first count
a number of iterations equal to the configured sampling down factor. Whenever the
utilization of the system rises above the threshold (in this case, above 0.95), since the system
is near 100% utilization and that may delay the processing of tasks, the multiplier is scaled
up to its highest value, setting the performance to its maximum. Once this happens, the
algorithm counts a number of iterations equal to the sampling down factor and then checks
the utilization. If the utilization is below the threshold, the multiplier descends one level.
This is repeated until the utilization rises above the threshold again, at which point the
multiplier returns to its maximum and the cycle begins anew.
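The logic just described can be sketched as a small state machine. This is an illustrative reconstruction of the governor's behavior as explained above, not the simulator's actual DVFS code; the parameter values in `main` are the defaults quoted in the text.

```java
// Sketch of the on_demand logic described above: one threshold acts as
// both up- and down-trigger, but stepping down requires samplingDownFactor
// consecutive iterations below the threshold. Illustrative, not the
// simulator's actual code.
public class OnDemandGovernor {
    final double threshold;         // e.g. 0.95
    final int samplingDownFactor;   // e.g. 100 iterations
    final int maxMultiplier;        // e.g. 4
    int multiplier;
    int iterationsBelow = 0;

    OnDemandGovernor(double threshold, int samplingDownFactor, int maxMultiplier) {
        this.threshold = threshold;
        this.samplingDownFactor = samplingDownFactor;
        this.maxMultiplier = maxMultiplier;
        this.multiplier = maxMultiplier;
    }

    /** Called once per simulation step with the current CPU utilization. */
    int step(double utilization) {
        if (utilization > threshold) {
            multiplier = maxMultiplier;        // jump straight to full speed
            iterationsBelow = 0;
        } else if (++iterationsBelow >= samplingDownFactor) {
            if (multiplier > 1) multiplier--;  // step down one level
            iterationsBelow = 0;
        }
        return multiplier;
    }

    public static void main(String[] args) {
        OnDemandGovernor g = new OnDemandGovernor(0.95, 100, 4);
        for (int i = 0; i < 100; i++) g.step(0.60);   // 100 idle iterations
        System.out.println(g.multiplier);             // 3 (stepped down once)
        g.step(0.99);                                 // utilization spike
        System.out.println(g.multiplier);             // 4 (back to maximum)
    }
}
```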
In this example, we can see how at the beginning the multiplier is set to its maximum
of 4. The current task load produces a utilization of 60% and a power near 95 W. After 100
iterations, the utilization is checked; as its value is just 60%, lower than 95%, the multiplier
is reduced to 3. The process repeats until the multiplier is reduced to 1, when the utilization
rises above 0.95 and the DVFS algorithm sets the multiplier back to 4. Note that, during the
period in which the multiplier is below 4, the power consumption is reduced. This does not
occur if the selected governor is Performance, which leaves the multiplier at its maximum
and wastes power.
Figure 20 shows the utilization of one machine, assuming the same load on the
machine during the full simulation. Since a lower multiplier means fewer MIPS, the
utilization increases each time the multiplier is scaled down. However, the power
consumption decreases, so lowering the multiplier when the system is idle is a way of
reducing the overall power consumption of the simulation.
Figure 20: Utilization evolution
Figure 21 shows the evolution of the frequency multiplier. As explained, it is set to its
maximum whenever the utilization exceeds the threshold, and scaled down to save power
when it drops below, taking the sampling down factor into account. Scaling the consumption
and performance of the system in this way is a double-edged sword. A high multiplier means
more MIPS, which guarantees no unnecessary delay in task execution, but the power
consumption is higher and may be wasted if the utilization is not high enough. A lower
multiplier saves energy, but the system must make sure the utilization does not reach 100%
at a low multiplier, as that would delay the tasks unnecessarily.
Figure 21: Multiplier evolution
Figure 22 shows the different values of power each time the multiplier changes. Again,
the objective is to keep the overall energy as low as possible while never allowing
unnecessary delays, as explained.
Figure 22: Power evolution
With this merged simulator we obtain a base for future research on optimizing energy
consumption while considering real workflows and remaining power-aware. We can also
choose different types of DVFS governors and compare the simulator's behavior when
optimizing, knowing the advantages of the dynamic governors.
4.2. FRBS results
As stated, the second stage of this project comprises the development of two schedulers,
both based on an expert system. As in the DVFS results section, we divide this section into
two parts. The first part shows the configuration of the simulator with respect to the
simulation scenario, with the characteristics of the hosts, VMs and tasks to be executed. In
the second part, we show the results of the experiments obtained using these FRBS
schedulers in combination with other scheduling algorithms, and analyze their efficiency.
4.2.1. FRBS simulation scenario
In this section we show the configuration of the elements of the simulator, including
characteristics of physical hosts, VMs and tasks to execute.
The scenario of these experiments contains 20 physical hosts, as in the DVFS scenario.
However, since we are now considering scheduling and the default power model of the base
simulator does not allow modeling different types of processors, with the default power
model all physical machines would have the same characteristics, offering no real choice of
which physical machine should host each VM to be created. To solve this, an alternative
power model has been added to the simulator, as explained in section 3.3.10. This power
model requires configuring multiple parameters, but allows having physical machines with
different resources and performance, providing the experimental scenario that we need.
Table 13 displays the parameters that need to be configured for each physical host.
In addition to those parameters, three others are required, which are set to the same
value in all hosts in this scenario. These additional parameters are the dynamicConstant,
set to 0.5 and required in equation (3) to calculate the dynamic component of the power
consumed by each host; the staticConstant, set to 0.7 and required to calculate the static
component and the total power; and the array of percentages, which contains the
multipliers expressed as percentages of the maximum values. This array has been set to
{59.925, 69.93, 79.89, 89.89, 100.0} in all hosts. Applying these percentages to
maxFrequency and maxVoltage yields the frequency and voltage values used when the
processor multiplier is scaled down from the maximum.
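The multiplier arithmetic above can be sketched as follows. This is an illustrative Python fragment only (the simulator itself is Java-based), and the names used here are not simulator identifiers:

```python
# Per-multiplier frequency and voltage values, derived from the percentages
# array that is common to all hosts in this scenario.
PERCENTAGES = [59.925, 69.93, 79.89, 89.89, 100.0]

def scaled_values(max_value, percentages=PERCENTAGES):
    """Apply each multiplier percentage to a maximum value."""
    return [max_value * p / 100.0 for p in percentages]

# Host 1 of Table 13: maxFrequency = 3 GHz, maxVoltage = 3 V
frequencies_ghz = scaled_values(3.0)  # lowest DVFS step: 1.79775 GHz
voltages_v = scaled_values(3.0)
```

The same helper applied to each host's maxFrequency and maxVoltage produces the per-host frequency and voltage arrays listed among the calculated parameters below.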
Values shown in Table 13 are:
1. Maximum value of frequency for the processor.
2. Maximum value of voltage of the power supply.
3. Capacitance of the processor.
4. Instructions per cycle, used to get the value of MIPS performance.
Host maxFreq (GHz) maxVolt (V) Capt (×10⁻⁸ F) IPC (Inst / Cycle)
1 3 3 0.5 0.5
2 2.9 2.95 0.625 0.575
3 2.8 2.9 0.75 0.65
4 2.7 2.85 0.875 0.725
5 2.6 2.8 1 0.8
6 2.5 2.75 1.125 0.875
7 2.4 2.7 1.25 0.95
8 2.3 2.65 1.375 1.025
9 2.2 2.6 1.5 1.1
10 2.1 2.55 1.625 1.175
11 2 2.5 1.75 1.25
12 1.9 2.45 1.875 1.325
13 1.8 2.4 2 1.4
14 1.7 2.35 2.125 1.475
15 1.6 2.3 2.25 1.55
16 1.5 2.25 2.375 1.625
17 1.4 2.2 2.5 1.7
18 1.3 2.15 2.625 1.775
19 1.2 2.1 2.75 1.85
20 1.1 2.05 2.875 1.925
Table 13: physical hosts’ configuration input parameters
The previous values are the input parameters, the values that need to be
configured in the simulator for each host. From these values, the power model derives
another set of parameters through the equations shown in sections 3.1.1 and 3.3.10. The
calculated values are:
1. Frequencies array, containing the frequency corresponding to each multiplier.
2. Voltages array, containing the different values of the power supply depending on
the multiplier.
3. DynamicPower array, indicating the dynamic component of the power consumed
by the host, the part corresponding to the processor, applied to the different values
of the multiplier.
4. StaticPower, the component of the power that does not depend on the processor.
5. idlePower array, containing the different values of the static power depending on
the multiplier.
6. maxPower, the maximum total power that the host can consume.
7. fullPower array, with the different values of the max power applying the different
multipliers.
8. maxMIPS, indicating the maximum performance that the host can achieve.
9. Mips array, considering the different multipliers applied to the maxMIPS.
To avoid listing the full arrays for each host, Table 14 shows the maximum values of
dynamicPower, staticPower, totalPower and MIPS. The calculated parameters cover a
wide range of values, producing a heterogeneous scenario suitable for scheduling
experiments.
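The derivation of these parameters can be sketched as follows. The formulas are inferred from the equations referenced in sections 3.1.1 and 3.3.10 and reproduce the first rows of Table 14; Python is used here for illustration only, and the function name is not part of the simulator:

```python
# Common scenario constants (see the configuration description above).
DYNAMIC_CONSTANT = 0.5
STATIC_CONSTANT = 0.7

def derived_params(max_freq_ghz, max_volt, capacitance, ipc):
    """Derive a host's maximum power components and MIPS from Table 13 inputs."""
    freq_hz = max_freq_ghz * 1e9
    # Dynamic component: a * C * V^2 * f, as in equation (3)
    dyn_power = DYNAMIC_CONSTANT * capacitance * max_volt ** 2 * freq_hz
    # The static component is a fixed fraction (staticConstant) of the total
    total_power = dyn_power / (1.0 - STATIC_CONSTANT)
    static_power = STATIC_CONSTANT * total_power
    # Performance: instructions per cycle times the frequency in MHz
    max_mips = ipc * max_freq_ghz * 1000
    return dyn_power, static_power, total_power, max_mips

# Host 1 of Table 13: 3 GHz, 3 V, 0.5e-8 F, 0.5 IPC
dyn, stc, tot, mips = derived_params(3.0, 3.0, 0.5e-8, 0.5)
# -> 67.5 W dynamic, 157.5 W static, 225 W total, 1500 MIPS (first row of Table 14)
```
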
Host dynPow (W) stcPow (W) totPow (W) maxMIPS
1 67.5 157.5 225 1500
2 78.87 184 262.89 1667
3 88.3 206 294.35 1820
4 95.94 223.87 319.82 1957
5 101.92 237.81 339.73 2080
6 106.35 248.14 354.5 2187
7 109.35 255.15 364.5 2280
8 111.04 259.1 370.14 2357
9 111.54 260.26 371.8 2420
10 110.95 258.88 369.83 2467
11 109.37 255.21 364.58 2500
12 106.92 249.48 356.4 2517
13 103.68 241.92 345.6 2520
14 99.75 232.75 332.5 2507
15 95.22 222.18 317.4 2480
16 90.17 210.41 300.58 2437
17 84.7 197.63 282.33 2380
18 78.87 184.03 262.9 2307
19 72.76 169.78 242.55 2220
20 66.45 155.05 221.51 2117
Table 14: physical hosts’ configuration calculated parameters
For the VMs, we again create 20 of them, allocating one per physical host. In this
case, there is no problem in selecting the MIPS performance of each VM, as a virtual
machine is a software program running on the host and is fully configurable.
However, for the DVFS algorithm to run properly on the different machines, we
cannot select the values at random. Instead, each VM's MIPS value is set to
97.5% of the minimum MIPS of its host.
For example, the first host has 1500 MIPS, as Table 14 shows; applying the first
multiplier of 59.925% we get the minimum MIPS of the host, which amounts to 898.87
MIPS. Taking 97.5% of that gives a final value of 876.40 MIPS. This way, if VM number 1
is configured with this value and allocated to host number 1, when the DVFS algorithm
scales the multiplier down to its minimum value, the utilization of the machine rises to
97.5%, surpassing the up_threshold and triggering the upscaling. Thus the DVFS
algorithm keeps working at all times, instead of leaving the multiplier at the minimum
and effectively disabling the algorithm.
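The worked example above can be written as a one-line calculation (an illustrative sketch; the names are not simulator identifiers):

```python
# 97.5% of the host's minimum-multiplier MIPS, as in the example above.
MIN_MULTIPLIER = 0.59925   # first entry of the percentages array
VM_FRACTION = 0.975        # fraction of the minimum MIPS assigned to the VM

def vm_mips(host_max_mips):
    """MIPS given to a VM so its host's lowest DVFS step runs at ~97.5% load."""
    return host_max_mips * MIN_MULTIPLIER * VM_FRACTION

# Host 1: 1500 MIPS -> minimum of 898.875 MIPS -> VM configured with 876.40 MIPS
```
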
However, even though each VM's performance is calculated from a host's
MIPS, this does not mean that the VMs will be allocated in that order. The VM scheduler
takes care of this stage; we simply configure the VM performance values to be similar
to the hosts' performance, taking into account that only one VM is created on each
host. In doing so, we try to have the DVFS algorithm executed on as many hosts as
possible, although in the end that depends on the scheduler. Table 15 shows
these values.
VM 1 2 3 4 5
MIPS 525 583 637 685 728
VM 6 7 8 9 10
MIPS 765 798 825 847 863
VM 11 12 13 14 15
MIPS 875 881 882 877 868
VM 16 17 18 19 20
MIPS 853 833 807 777 741
Table 15: VM MIPS in FRBS scenario
Once all VMs have been scheduled onto the physical hosts, task scheduling takes
place. In all the experiments run, whose results are shown in the next section, the
workflow used has been Montage_25, as in the DVFS experiments of the first stage
of this project. The task scheduler follows the order established in the
workflow, reproducing its process as if it were a real execution in a real
environment.
4.2.2. Rules generated for both FRBS schedulers
Here we show a sample of the rules that were optimized using the
Pittsburgh approach in Matlab. Antecedents use 3 Membership Functions (MF),
{low, normal, high}, and the consequent is configured with 5,
{very_low, low, normal, high, very_high}, to allow a wider range of outputs. Both
Matlab functions based on the Pittsburgh approach produced a total of 28 rules for
each scheduler.
This scheduling stage proceeds as follows. First, the scheduler holds the
group of VMs that need to be created on a group of physical hosts. Then, for each VM,
the scheduler evaluates the selection score against each host; the host with the highest
score is the one that allocates the VM. The process is repeated until all VMs are created.
The antecedents considered are introduced in section 3.3.8: vmMIPS is the MIPS value
requested by the current VM, while hostMIPS, utilization and power are parameters of
the host. Beginning with the VM scheduler, we show a sample of 5 rules.
1. If (vmMIPS IS low) AND (hostMIPS IS normal) AND (utilization IS low)
AND (power IS normal) THEN selection IS very_high
2. If (vmMIPS IS high) AND (hostMIPS IS high) AND (utilization IS normal)
AND (power IS low) THEN selection IS very_high
3. If (vmMIPS IS normal) AND (hostMIPS IS normal) AND (utilization IS
low) AND (power IS high) THEN selection IS very_low
4. If (vmMIPS IS high) AND (hostMIPS IS normal) AND (utilization IS high)
AND (power IS normal) THEN selection IS low
5. If (vmMIPS IS low) AND (hostMIPS IS normal) AND (utilization IS normal)
AND (power IS normal) THEN selection IS normal
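The greedy placement loop described above can be sketched as follows. The fuzzy inference itself is implemented with jFuzzyLogic in the Java simulator; here it is passed in as a stand-in function so that only the control flow is shown, and none of these names are taken from the project code:

```python
# Greedy VM placement: each VM goes to the host with the highest selection score.
def place_vms(vms, hosts, fuzzy_selection):
    """Assign every VM to the host whose fuzzy selection score is highest."""
    placement = {}
    for vm in vms:
        scores = {host: fuzzy_selection(vm, host) for host in hosts}
        placement[vm] = max(scores, key=scores.get)  # best-scoring host wins
    return placement
```

For instance, with a stub scorer that simply prefers the host with the larger capacity, every VM ends up on that host; the FRBS replaces the stub with the rule-base inference over vmMIPS, hostMIPS, utilization and power.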
Rules 1 and 2 yield a very high selection, one with low utilization and the other with
low power. Rule 3 shows that a high power consumption leads to a very low selection.
In Rule 4, the MIPS of the VM to create are high compared with the MIPS of the host.
Rule 5 shows a rule with a normal consequent.
The next five rules are a sample of the rule base of the task scheduler.
The antecedents are introduced in section 3.3.9: MIPS is a parameter of the currently
selected VM and length represents the number of Millions of Instructions (MI) of the task.
The power, time and energy parameters consider both the current task and the VM.
1. If (MIPS IS high) AND (power IS normal) AND (length IS low) AND (time
IS low) AND (energy IS low) THEN selection IS very_high
2. If (MIPS IS high) AND (power IS normal) AND (length IS high) AND (time
IS low) AND (energy IS low) THEN selection IS high
3. If (MIPS IS normal) AND (power IS high) AND (length IS low) AND (time
IS high) AND (energy IS high) THEN selection IS very_low
4. If (MIPS IS normal) AND (power IS low) AND (length IS high) AND (time
IS normal) AND (energy IS low) THEN selection IS normal
5. If (MIPS IS low) AND (power IS low) AND (length IS low) AND (time IS
high) AND (energy IS normal) THEN selection IS normal
Rules 1 and 2 show how low values of time and energy lead to a high selection of the
VM to process the task. Rule 3 shows the opposite: high values of power, time and energy
produce a very low selection score for that VM. Rules 4 and 5 show
combinations of high and low values, which result in a normal score in the selection of the
current VM.
4.2.3. FRBS savings
After both fuzzy schedulers were programmed and added to the simulator, we
tested their results and compared them to the other algorithms explained in this
document. First we list the algorithms tested for both the VM and task schedulers and
the combinations run.
VMs
VmWattsPerMetricMips.
VmFuzzy.
Tasks
MinMin.
MaxMin.
PowerMinMin.
PowerMaxMin.
PowerFuzzySeq.
PowerFuzzyMinMax.
PowerFuzzyMaxMax.
All possible combinations have been tested, resulting in a total of 14 simulations, all
of them following the task order established in the Montage_25 DAG. Four
parameters are obtained from each experiment: execution time, total power consumption,
average power consumption and energy consumption, the same parameters considered in
the first-stage results on the effects of the DVFS algorithm in the simulator. Each
experiment is named as VMScheduler / TasksScheduler. Since the default scenario in
the WorkflowSim simulator is VmWattsPerMetricMips / MinMin, results include a percentage
of savings with respect to this configuration. As in section 4.1, we show graphs
to facilitate the understanding of the results presented in the tables. Also, as the overall
power values are much higher than the rest, we do not include them in the graphs, as
they would make the other results harder to read.
First, we show the results obtained by running the experiments with the VM
scheduler already implemented in WorkflowSim, based on a Watts-per-MIPS metric. Here we
use the task schedulers MinMin and MaxMin, together with the power-aware variants
implemented. All the experiments consider the heterogeneous scenario
explained in section 3.3.10, instead of the default homogeneous scenario where all hosts
and VMs were identical.
The results of these 4 initial experiments, run before testing any fuzzy scheduler,
are shown in Table 16. Here we can see that, as explained, MaxMin solves the problem of
non-optimum scheduling, where the best resources are wasted on rather short tasks. For this
reason, the time and energy values are lower using MaxMin than MinMin. Contrary to
expectations, the power-aware variants achieve much worse results than their time-based counterparts.
VmWattsPerMipsMetric
Results Savings (%)
Time (s) Total (W) Avg (W) Energy (Wh) Time Total Avg Energy
MinMin 65.11 81820.60 12.57 22.73 - - - -
MaxMin 64.19 81390.27 12.68 22.61 1.41 0.53 -0.90 0.53
PowerMinMin 90.90 116921.40 12.86 32.48 -39.61 -42.90 -2.36 -42.90
PowerMaxMin 89.53 112911.19 12.61 31.36 -37.50 -38.00 -0.36 -38.00
Table 16: basic experiments
Figure 23 shows the results of these first 4 experiments. As displayed in the table,
the lowest energy values are obtained using the MaxMin algorithm.
Figure 23: VmWattsPerMipsMetric result values
When displayed as percentages of savings with respect to the default
WattsPerMips / MinMin configuration, Figure 24 also attributes the highest savings to the
MaxMin algorithm.
Figure 24: VmWattsPerMipsMetric result savings
In the second set of experiments, we introduce the fuzzy task scheduler along with
the default VM scheduler. In this case, we find no major improvements in the task
scheduling with respect to the classic MinMin scenario, which proves to be a robust
algorithm able to optimize different scenarios. Table 17 shows that PowerFuzzyMaxMax
only achieves a 0.71% reduction in energy consumption, while PowerFuzzyMinMax gets
even worse results.
VmWattsPerMipsMetric
Results Savings (%)
Time (s) Total (W) Avg (W) Energy (Wh) Time Total Avg Energy
PowerFuzzySeq 65.11 81822.62 12.57 22.73 0.00 0.00 0.00 0.00
FuzzyMinMax 64.74 82077.07 12.68 22.80 0.57 -0.31 -0.89 -0.31
FuzzyMaxMax 64.07 81240.43 12.68 22.57 1.60 0.71 -0.90 0.71
Table 17: experiments with fuzzy task scheduler
Figure 25 shows these results, but it is somewhat difficult to distinguish which
values are higher.
Figure 25: VmWattsPerMipsMetric result values (2)
However, Figure 26 shows in finer detail the better results obtained by
PowerFuzzyMaxMax in both time and energy. In this case, the power consumed by
this algorithm is the worst of the fuzzy schedulers, but the larger savings in both time and
energy make up for this power worsening thanks to the time / power balance.
Figure 26: VmWattsPerMipsMetric result savings (2)
In this third round of experiments, the VM fuzzy scheduler proves to achieve much better
results than the default Watts-per-MIPS-metric VM scheduler. Using it in combination
with MaxMin achieves a 7.07% reduction in energy consumption, which is not a
negligible improvement.
VmsFuzzy
Results Savings (%)
Time (s) Total (W) Avg (W) Energy (Wh) Time Total Avg Energy
MinMin 65.11 77127.06 11.85 21.42 0.00 5.74 5.74 5.74
MaxMin 64.19 76037.23 11.85 21.12 1.41 7.07 5.74 7.07
PowerMinMin 73.18 86693.08 11.85 24.08 -12.40 -5.96 5.74 -5.96
PowerMaxMin 75.10 88967.51 11.85 24.71 -15.35 -8.73 5.74 -8.74
Table 18: experiments with fuzzy VM scheduler
In Figure 27 we can see how the power-aware alternatives to MinMin and MaxMin
produce higher values than their classic counterparts.
Figure 27: VmsFuzzy result values
Figure 28 shows that the difference is larger in time than in energy, but still
favors the classic scheduling algorithms.
Figure 28: VmsFuzzy result savings
In this last step, both fuzzy schedulers are used. As the PowerFuzzyMaxMax
scheduler was expected to give the best results, it is the one that has been further
tested. As Table 19 shows, the combination of the VmsFuzzy scheduler and
PowerFuzzyMaxMax achieves a 7.23% reduction in energy consumption. This result
and the rules obtained by the two Matlab functions come from an iterative process,
repeated until the improvements were too small to continue. First, a rule base for
tasks was obtained. Then, using that result, vmPittsburgh was run to obtain a rule
base for the VM scheduler. This process was repeated, keeping the rule base from the
previous step, until there were no significant improvements.
VmsFuzzy
Results Savings (%)
Time (s) Total (W) Avg (W) Energy (Wh) Time Total Avg Energy
FuzzySeq 64.27 76132.00 11.85 21.15 1.29 6.95 5.74 6.95
FuzzyMinMax 66.78 79105.33 11.85 21.97 -2.56 3.32 5.74 3.32
FuzzyMaxMax 64.08 75906.93 11.85 21.09 1.58 7.23 5.74 7.23
Table 19: experiments with both fuzzy schedulers
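The iterative process described above can be outlined as follows. The actual rule bases are evolved by the Matlab Pittsburgh functions and evaluated in the Java simulator; here both optimizers and the evaluator are placeholders, so only the alternating structure is illustrated, and none of these names come from the project code:

```python
# Alternating optimization of the task and VM rule bases: each round fixes one
# rule base, re-optimizes the other, and stops when the energy improvement is
# below a tolerance (or a round limit is reached).
def co_optimize(optimize_tasks, optimize_vms, evaluate, tol=1e-3, max_iters=10):
    """Alternate task/VM rule-base optimization until improvement falls below tol."""
    task_rules, vm_rules = None, None
    best = float("inf")
    for _ in range(max_iters):
        task_rules = optimize_tasks(vm_rules)   # fix VM rules, evolve task rules
        vm_rules = optimize_vms(task_rules)     # fix task rules, evolve VM rules
        energy = evaluate(task_rules, vm_rules) # e.g. simulated energy (Wh)
        if best - energy < tol:                 # improvement too small: stop
            break
        best = energy
    return task_rules, vm_rules
```
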
In Figure 29, as in Figure 25, it is not easy to distinguish the differences between
the results obtained with each of the three fuzzy schedulers.
Figure 29: VmsFuzzy result values (2)
Once more, the savings in Figure 30 make it easier to see the better result of
PowerFuzzyMaxMax over the other two alternatives.
Figure 30: VmsFuzzy result savings (2)
Results show that the combination of both expert systems accomplishes the best results
compared to the other combinations. These results depend to a large extent on the rules
optimized with the Matlab functions running the Pittsburgh algorithm. A different set of
rules might be found using other algorithms, such as KASIA [ 42 ], and the results might
then differ.
5. Conclusions
After running and analyzing all the experiments, we can conclude that the
algorithms developed achieve a lower energy consumption, as intended. Although
these algorithms and results are based on experiments executed in a simulator and we
cannot claim that the results are exactly what a real system would produce, the
characteristics implemented in the simulator make them close to a real scenario. Not
including the DVFS algorithm would not make the fuzzy schedulers save more or less
energy, but the results obtained with it included are closer to a real scenario, where
the processors running on the servers do execute the DVFS algorithm internally.
Additionally, the inclusion of the DAG reinforces the validity of the results and of
the energy reduction obtained with these fuzzy schedulers, as the order of the tasks
executed in these experiments follows a real precedence pattern and their lengths are
not randomly generated.
Future work on this project will include a two-sided optimization, using both time
and energy as fitness parameters in these algorithms. As expected, the results in Table
19 show a good energy reduction but cannot achieve a large reduction in time. This is
because all the algorithms and the Pittsburgh approach use energy as the sole fitness
measure, not taking times into consideration; configurations that obtain lower execution
times also incur higher energy consumption. A two-sided optimization would strike a
balance between these two parameters, achieving values that satisfy both the low-time
and low-energy requirements of Cloud Computing systems.
As for the merged simulator, being open source it offers researchers all over the
world the opportunity to test different algorithms and check the results in time, power
and energy, knowing that they can execute real traces, reproducing the traffic workloads
of real datacenters. This can save considerable funds in the early stages of research,
when new algorithms are being tested. Once an algorithm obtains good results in the
simulation environment, it can be tested in a real scenario to confirm its effectiveness.
In any case, this open source simulator can continue growing in features through any
researcher who wishes to contribute to the project.
Bibliography
[ 1 ] J. G. Koomey, “Estimating total power consumption by servers in the US and the
world,” Oakland, CA: Analytics Press, February 15, 2007.
[ 2 ] A. Gara, M. A. Blumrich, D. Chen, G. L.-T. Chiu, P. Coteus, M. Giampapa, R. A.
Haring, P. Heidelberger, D. Hoenicke, G. V. Kopcsay, T. A. Liebsch, M. Ohmacht, B. D.
Steinmacher-Burow, T. Takken, and P. Vranas, “Overview of the blue gene/l system
architecture,” IBM Journal of Research and Development, vol. 49, no. 2-3, pp. 195–212,
2005.
[ 3 ] K. Li, “Performance analysis of power-aware task scheduling algorithms on
multiprocessor computers with dynamic voltage and speed,” IEEE Trans. Parallel Distrib.
Syst., vol. 19, no. 11, pp. 1484–1497, 2008.
[ 4 ] W. Forrest, “How to cut data centre carbon emissions,” Website, December 2008.
[Online]. Available: http://www.computerweekly.com/Articles/2008/12/05/233748/how-
to-cut-data-centre-carbon-emissions.htm
[ 5 ] Vuong, P. T., Madni, A. M., & Vuong, J. B. (2006, July). VHDL implementation for a
fuzzy logic controller. In Automation Congress, 2006. WAC'06. World (pp. 1-8). IEEE.
[ 6 ] X. Wang and M. Chen, “Cluster-level feedback power control for performance
optimization,” in Proc. IEEE 14th Int. Symp. High Performance Computer Architecture
HPCA 2008, 2008, pp. 101–110.
[ 7 ] X. Wang and Y. Wang, “Coordinating power control and performance management for
virtualized server clusters,” IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 2, pp. 245–259,
2011.
[ 8 ] S. Mittal, “A survey of architectural techniques for improving cache power efficiency,”
Sustainable Computing: Informatics and Systems, 2013.
[ 9 ] Q. Wu, P. Juang, M. Martonosi, L.-S. Peh, and D. W. Clark, “Formal control techniques
for power-performance management,” IEEE Micro, vol. 25, no. 5, pp. 52–62, 2005.
[ 10 ] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A. Warfield,
“Live migration of virtual machines,” in Proc. 2nd conference on Symposium on Networked
Systems Design & Implementation - Volume 2, ser. NSDI’05. Berkeley, CA, USA: USENIX
Association, 2005, pp. 273–286.
[ 11 ] A. Beloglazov and R. Buyya, “Adaptive threshold-based approach for energy-efficient
consolidation of virtual machines in cloud data centers,” in Proc. 8th International Workshop
on Middleware for Grids, Clouds and e-Science, ser. MGC ’10. New York, NY, USA: ACM,
2010, pp. 4:1–4:6.
[ 12 ] A. Beloglazov and R. Buyya, “Energy efficient allocation of virtual machines in cloud
data centers,” in Proc. 10th IEEE/ACM Int Cluster, Cloud and Grid Computing (CCGrid)
Conf, 2010, pp. 577–578.
[ 13 ] V. Anagnostopoulou, S. Biswas, A. Savage, R. Bianchini, T. Yang, and F. T. Chong,
“Energy conservation in datacenters through cluster memory management and barely-alive
memory servers,” in Proceedings of the 2009 Workshop on Energy Efficient Design, 2009.
[ 14 ] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, “GreenCloud:
a new architecture for green data center,” in 6th international conference industry session on
Autonomic computing and communications industry session. ACM, 2009, pp. 29–38.
[ 15 ] J. Leverich, M. Monchiero, V. Talwar, P. Ranganathan, and C. Kozyrakis, “Power
management of datacenter workloads using per-core power gating,” Computer Architecture
Letters, vol. 8, no. 2, pp. 48–51, 2009.
[ 16 ] S. Wang, J.-J. Chen, J. Liu, and X. Liu, “Power saving design for servers under
response time constraint,” in Proc. 22nd Euromicro Conf. Real-Time Systems (ECRTS),
2010, pp. 123–132.
[ 17 ] J. D. Moore, J. S. Chase, P. Ranganathan, and R. K. Sharma, “Making scheduling cool:
Temperature-aware workload placement in data centers.” In USENIX annual technical
conference, General Track, 2005, pp. 61–75.
[ 18 ] L. Li, C.-J. M. Liang, J. Liu, S. Nath, A. Terzis, and C. Faloutsos, “Thermocast: a
cyber-physical forecasting model for datacenters,” in 17th ACM SIGKDD international
conference on Knowledge discovery and data mining. ACM, 2011, pp. 1370–1378.
[ 19 ] Patel, C., Sharma, R., Bash, C., and Graupner, S. 2002. Energy Aware Grid: Global
Workload Placement based on Energy Efficiency. Tech. rep., HP Laboratories.
[ 20 ] Wang, L., von Laszewski, G., Dayal, J., and Wang, F. 2010. Towards Energy Aware
Scheduling for Precedence Constrained Parallel Tasks in a Cluster with DVFS. In
Conference on Cluster, Cloud and Grid Computing (CCGrid). 368-377.
[ 21 ] Merkel, A. and Bellosa, F. 2006. Balancing power consumption in multiprocessor
systems. ACM SIGOPS Operating Systems Review 40, 4, 403-414.
[ 22 ] Chen, G., He, W., Liu, J., Nath, S., Rigas, L., Xiao, L., and Zhao, F. 2008. Energy-
aware server provisioning and load dispatching for connection-intensive internet services.
In USENIX Symposium on Networked Systems Design and Implementation (NSDI). 337-
350.
[ 23 ] Gathering Clouds of XaaS! http://www.ibm.com/developer.
[ 24 ] Kernel Based Virtual Machine, www.linux-kvm.org/page/MainPage
[ 25 ] VMWare ESX Server, www.vmware.com/products/esx
[ 26 ] XenSource Inc, Xen, www.xensource.com
[ 27 ] P. Hale, “Acceleration and time to fail,” Quality and Reliability Engineering
International, vol. 2, no. 4, 1986.
[ 28 ] K. H. Kim, R. Buyya, and J. Kim, “Power aware scheduling of bag-of-tasks
applications with deadline constraints on dvs-enabled clusters,” in CCGRID, 2007, pp. 541–
548.
[ 29 ] R. Ge, X. Feng, and K. Cameron, “Performance-constrained distributed dvs scheduling
for scientific applications on power-aware clusters,” in Proceedings of the 2005 ACM/IEEE
conference on Supercomputing. IEEE Computer Society Washington, DC, USA, 2005.
[ 30 ] J. Li and J. F. Martínez, “Dynamic power-performance adaptation of parallel
computation on chip multiprocessors,” in HPCA, 2006, pp. 77–87.
[ 31 ] C. Piguet, C. Schuster, and J. Nagel, “Optimizing architecture activity and logic depth
for static and dynamic power reduction,” in Circuits and Systems, 2004. NEWCAS 2004. The
2nd Annual IEEE Northeast Workshop on, 2004, pp. 41–44.
[ 32 ] Calheiros, R. N., Ranjan, R., Beloglazov, A., De Rose, C. A., & Buyya, R. (2011).
CloudSim: a toolkit for modeling and simulation of cloud computing environments and
evaluation of resource provisioning algorithms. Software: Practice and Experience, 41(1),
23-50.
[ 33 ] Tom Guerout, Thierry Monteil, Georges Da Costa, Rodrigo N. Calheiros, Rajkumar
Buyya, Mihai Alexandru. Energy-aware simulation with DVFS. Simulation Modelling
Practice and Theory, Volume 39, pages 76-91, December 2013.
[ 34 ] Chen, W., & Deelman, E. (2012, October). Workflowsim: A toolkit for simulating
scientific workflows in distributed environments. In E-Science (e-Science), 2012 IEEE 8th
International Conference on (pp. 1-8). IEEE.
[ 35 ] Pegasus, workflow generator. Available at: https://pegasus.isi.edu/
[ 36 ] Thulasiraman, K., & Swamy, M. N. S. (1992). 5.7 Acyclic Directed Graphs. Graphs:
Theory and Algorithms, 118.
[ 37 ] Pegasus montage workflow. Available at:
https://confluence.pegasus.isi.edu/display/pegasus/Montage
[ 38 ] Fei Cao, Michelle M. Zhu, Chase Q. Wu, “Energy-Efficient Resource Management
for Scientific Workflows in Clouds”, in 2014 IEEE 10th World Congress on Services.
[ 39 ] Cingolani, Pablo, and Jesús Alcalá-Fdez. "jFuzzyLogic: a Java Library to Design
Fuzzy Logic Controllers According to the Standard for Fuzzy Control Programming".
[ 40 ] Cingolani, Pablo, and Jesús Alcalá-Fdez. "jFuzzyLogic: a robust and flexible Fuzzy-
Logic inference system language implementation." Fuzzy Systems (FUZZ-IEEE), 2012
IEEE International Conference on. IEEE, 2012.
[ 41 ] Smith S.F.: A learning system based on genetic adaptive algorithms. PhD thesis,
University of Pittsburgh.
[ 42 ] Prado, R. P., Garcia-Galan, S., & Expósito, J. M. (2011, April). KASIA approach vs.
Differential evolution in fuzzy rule-based meta-schedulers for grid computing. In Genetic
and Evolutionary Fuzzy Systems (GEFS), 2011 IEEE 5th International Workshop on (pp.
87-94). IEEE.
[ 43 ] Seddiki, M., de Prado, R. P., Munoz-Expósito, J. E., & García-Galán, S. (2014). Fuzzy
Rule-Based Systems for Optimizing Power Consumption in Data Centers. In Image
Processing and Communications Challenges 5 (pp. 301-308). Springer International
Publishing.
[ 44 ] García-Galán, S., Prado, R. P., & Expósito, J. M. (2015). Rules discovery in fuzzy
classifier systems with PSO for scheduling in grid computational infrastructures. Applied
Soft Computing, 29, 424-435.
[ 45 ] García-Galán, S., Prado, R. P., & Expósito, J. E. M. (2014). Swarm Fuzzy Systems:
Knowledge Acquisition in Fuzzy Systems and Its Applications in Grid Computing. IEEE
Transactions on Knowledge and Data Engineering, 26(7), 1791-1804.
[ 46 ] Prado, R. P., Expósito, J. M., & Yuste, A. J. (2010). Knowledge acquisition in fuzzy-
rule-based systems with particle-swarm optimization. IEEE Transactions on Fuzzy Systems,
18(6), 1083-1097.
[ 47 ] Prado, R. P., García-Galán, S., Yuste, A. J., & Expósito, J. M. (2010). A fuzzy rule-
based meta-scheduler with evolutionary learning for grid computing. Engineering
Applications of Artificial Intelligence, 23(7), 1072-1082.