come4acloud: an end-to-end framework for …...come4acloud: an end-to-end framework for autonomic...

31
HAL Id: hal-01762716 https://hal.archives-ouvertes.fr/hal-01762716 Submitted on 10 Apr 2018 HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés. CoMe4ACloud: An End-to-End Framework for Autonomic Cloud Systems Zakarea Al-Shara, Frederico Alvares, Hugo Bruneliere, Jonathan Lejeune, Charles Prud’Homme, Thomas Ledoux To cite this version: Zakarea Al-Shara, Frederico Alvares, Hugo Bruneliere, Jonathan Lejeune, Charles Prud’Homme, et al.. CoMe4ACloud: An End-to-End Framework for Autonomic Cloud Systems. Future Generation Computer Systems, Elsevier, 2018, 86, pp.339-354. 10.1016/j.future.2018.03.039. hal-01762716

Upload: others

Post on 12-Jul-2020

43 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

HAL Id: hal-01762716https://hal.archives-ouvertes.fr/hal-01762716

Submitted on 10 Apr 2018

HAL is a multi-disciplinary open accessarchive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come fromteaching and research institutions in France orabroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, estdestinée au dépôt et à la diffusion de documentsscientifiques de niveau recherche, publiés ou non,émanant des établissements d’enseignement et derecherche français ou étrangers, des laboratoirespublics ou privés.

CoMe4ACloud: An End-to-End Framework forAutonomic Cloud Systems

Zakarea Al-Shara, Frederico Alvares, Hugo Bruneliere, Jonathan Lejeune,Charles Prud’Homme, Thomas Ledoux

To cite this version:Zakarea Al-Shara, Frederico Alvares, Hugo Bruneliere, Jonathan Lejeune, Charles Prud’Homme, etal.. CoMe4ACloud: An End-to-End Framework for Autonomic Cloud Systems. Future GenerationComputer Systems, Elsevier, 2018, 86, pp.339-354. �10.1016/j.future.2018.03.039�. �hal-01762716�

Page 2: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

CoMe4ACloud: An End-to-end Framework for AutonomicCloud Systems

Zakarea Al-Sharaa, Frederico Alvaresa, Hugo Brunelierea, Jonathan Lejeuneb,Charles Prud’Hommea, Thomas Ledouxa

aIMT Atlantique-Inria-LS2N, 4 rue Alfred Kastler, 44307, Nantes, FrancebSorbonne Universite-Inria-CNRS 4 place Jussieu, 75005 Paris, France

Abstract

Autonomic Computing has largely contributed to the development of self-manageableCloud services. It notably allows freeing Cloud administrators of the burden ofmanually managing varying-demand services, while still enforcing Service-LevelAgreements (SLAs). All Cloud artifacts, regardless of the layer carrying them, sharemany common characteristics. Thus, it should be possible to specify, (re)configureand monitor any XaaS (Anything-as-a-Service) layer in an homogeneous way. Tothis end, the CoMe4ACloud approach proposes a generic model-based architecturefor autonomic management of Cloud systems. We derive a generic unique Autono-mic Manager (AM) capable of managing any Cloud service, regardless of the layer.This AM is based on a constraint solver which aims at finding the optimal configu-ration for the modeled XaaS, i.e. the best balance between costs and revenues whilemeeting the constraints established by the SLA. We evaluate our approach in twodifferent ways. Firstly, we analyze qualitatively the impact of the AM behaviour onthe system configuration when a given series of events occurs. We show that theAM takes decisions in less than 10 seconds for several hundred nodes simulating vir-tual/physical machines. Secondly, we demonstrate the feasibility of the integrationwith real Cloud systems, such as Openstack, while still remaining generic. Finally,we discuss our approach according to the current state-of-the-art.

Keywords: Cloud Computing; Autonomic Computing; Model Driven Engineering;Constraint Programming

1. Introduction

Nowadays, Cloud Computing is becoming a fundamental paradigm which is wi-dely considered by companies when designing and building their systems. The num-ber of applications that are developed for and deployed in the Cloud is constantlyincreasing, even in areas where software was traditionally not seen as the core ele-5

ment (cf. the relatively recent trend on Industrial Internet of Things and Cloud

Email addresses: [email protected] (Zakarea Al-Shara),[email protected] (Frederico Alvares),[email protected] (Hugo Bruneliere), [email protected](Jonathan Lejeune), [email protected] (Charles Prud’Homme),[email protected] (Thomas Ledoux)

Preprint submitted to Future Generation Computer Systems April 10, 2018

Page 3: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Manufacturing [1]). One of the main reasons for this popularity is the Cloud’s pro-visioning model, that allows for the allocation of resources in an on-demand basis.Thanks to this, consumers are able to request/release compute/storage/network re-sources, in a quasi-instantaneous manner, in order to cope with varying demands [2].10

From the provider perspective, a negative consequence of this service-based mo-del is that it may quickly lead the whole system to a level of dynamicity that makesit difficult to manage (e.g., to enforce Service Level Agreements (SLAs) by keepingQuality of Service (QoS) at acceptable levels). From the consumer perspective, thelarge amount and the variety of services available in the Cloud market [3] may turn15

the design, (re)configuration and monitoring into very complex and cumbersometasks. Despite of several recent initiatives intending to provide a more homoge-neous Cloud management support, for instance as part of the OASIS TOSCA [4]initiative or in some European funded projects (e.g., [5][6]), current solutions stillface some significant challenges.20

Heterogeneity. Firstly, the heterogeneity of the Cloud makes it difficult for these ap-proaches to be applied systematically in different possible contexts. Indeed, Cloudsystems may involve many resources potentially having various and varied natures(software and/or physical). In order to achieve well-tuned Cloud services, admi-nistrators need to take into consideration specificities (e.g., runtime properties) of25

several managed systems (to meet SLA guarantees at runtime). Solutions thatcan support in a similar way resources coming from all the different Cloud layers(e.g., IaaS, PaaS, SaaS) are thus required.

Automation. Cloud systems are scalable by definition, meaning that Cloud systemmay be composed of large sets of components and hence complex software structures30

to be handled manually in an efficient way. This concerns not only the base configu-ration and monitoring activities, but also the way Cloud systems should behave atruntime in order to guarantee certain QoS levels and expected SLA contracts. Asa consequence, solutions should provide means for gathering and analyzing sensordata, making decision and re-configuring (to translate taken decisions into actual35

actions on the system) when relevant.

Evolution. Cloud systems are highly dynamic: clients can book and release “elastic”virtual resources at any moment at time, according to given SLA contracts. Thus,solutions need to be able to reflect and support transparently the elastic and evolu-tionary aspects of services. This may be non trivial, especially for systems involving40

many different services.In this context, the CoMe4ACloud collaborative project 1 relies on three main

pillars: Modeling/MDE [7], Constraint Programming [8], and Autonomic Compu-ting [9]. Its primary goal is to provide a generic and extensible solution for theruntime management of Cloud services, independently from the Cloud layer(s) they45

belong to. We claim that Cloud systems, regardless of the layer in the Cloud servicestack, share many common characteristics and goals, which can serve as a basisfor a more homogeneous model. In fact, systems can assume the role of both con-sumer/provider in the Cloud service stack, and the interactions among them aregoverned by SLAs. In general, Anything-as-a-Service (XaaS) objectives are very50

similar when generalizing it to a Service-Oriented Architecture (SOA) model: (i)

1https://come4acloud.github.io/

2

Page 4: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

finding an optimal balance between costs and revenues, i.e., minimizing the costsdue to other purchased services and penalties due to SLA violation, while maxi-mizing revenues related to services provided to customers; (ii) meeting all SLA orinternal constraints (e.g., maximal capacity of resources) related to the concerned55

service.In previous work, we relied on the MAPE-K Autonomic Computing reference

architecture as a means to build generic an Autonomic Manager (AM) capable ofmanaging Cloud systems [10] at any layer. The associated generic model basicallyconsists of graphs and constraints formalizing the relationships between the Cloud60

service providers and their consumers in a SOA fashion. From this model, weautomatically generate a constraint programming model [8], which is then used asa decision-making and planning tool within the AM.

This paper adds on our previous work in that we provide further details on theconstraint programming models and translation schemes. Above all, in this work, we65

show how the generic model layer seamlessly connects to the runtime layer, i.e., howmonitoring data from the running system are reflected to the model and how changesin the model (performed by the AM or human administrators) are propagated tothe running system. We provide some examples showing this connection, notablyover an infrastructure based on the OpenStack [11]. We evaluate experimentally70

the feasibility of our approach by conducting a quantitative study over a simulatedIaaS system. The objective is to analyze the AM behaviour in terms of adaptationdecisions as well as to show how well it scales, considering the generic nature ofthe approach. Concretely, the results show the AM takes decisions in less than10 seconds for several hundred nodes simulating virtual/physical machines, while75

remaining generic.The remainder of the paper is structured as follows. In Section 2, we provide the

background concepts for the good understanding of our work. Section 3 presents anoverview of our approach in terms of architecture and underlying modeling support.In Section 4, we provide a formal description of the AM and explain how we designed80

it based on Constraint Programming. Section 5 gives more details on the actualimplementation of our approach and its connection to real Cloud systems. In Section6, we provide and discuss related performance evaluation data. We describe indetails the available related work in Section 7 and conclude in Section 8 by openingon future work.85

2. Background

The CoMe4Acloud project is mainly based on three complementary domains ofComputer science.

2.1. Autonomic Computing

Autonomic Computing [9] emerged from the necessity to autonomously manage90

complex systems, in which the manual human-like maintenance becomes infeasiblesuch as those in context of Cloud Computing. Autonomic Computing provides a setof principles and reference architecture to help the development of self-manageablesoftware systems. Autonomic systems are defined as a collection of autonomic ele-ments that communicate with each other. An autonomic element consists of a single95

autonomic manager (AM) that controls one or many managed elements. A mana-ged element is a software or hardware resources similar to its counterpart found

3

Page 5: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

in non-autonomic systems, except for the fact that it is adapted with sensors andactuators so as to be controllable by autonomic managers.

An autonomic manager is defined as a software component that, based on high-100

level goals, uses the monitoring data from sensors and the internal knowledge ofthe system to plan and execute actions on the managed element (via actuators) inorder to achieve those goals. It is also known as a MAPE-K loop, as a reference toMonitor, Analyze, Plan, Execute, Knowledge.

As previously stated, the monitoring task is in charge of observing the data105

collected by software or hardware sensors deployed in the managed element. Theanalysis task is in charge of finding a desired state for the managed element bytaking into consideration the monitored data, the current state of the managedelement, and adaptation policies. The planning task takes into consideration thecurrent state and the desired state resulting from the analysis task to produce a set110

of changes to be performed on the managed elements. Those changes are actuallyperformed in the execution task within the desired time with the help of the actu-ators deployed on the managed elements. Last but not least, the knowledge in anautonomic system assemble information about the autonomic element (e.g., systemrepresentation models, information on the managed system’s states, adaptation po-115

licies, and so on) and can be accessed by the four tasks previously described.

2.2. Constraint Programming

In the context of autonomic computing systems to manage the dynamics of cloudsystems, in order to take into consideration goals or utility functions, it is necessaryto implement some methods. In this work, the goals and utility functions are defined120

in terms of constraint satisfaction and optimization problems. To this end we relyon Constraint Programming to model and solve these kind of problems.

Constraint Programming (CP) is a paradigm that aims to solve combinatorialproblems defined by variables, each of them associated with a domain, and con-straints over them [8]. Then a general purpose solver attempts to find a solution,125

that is an assignment of each variable to a value from its domain which meet theconstraints it is involved in. Examples of CP solvers include free open-source li-braries, such as Choco solver [12], Gecode [13] or OR-tools [14] and commercialsoftwares, such as IBM CPLEX CP Optimizer [15] or SICStus Prolog [16].

2.2.1. Modeling a CSP130

In Constraint Programming, a Constraint Satisfaction Problem (CSP) is definedas a tuple 〈X,D,C〉 and consists of a set of n variables X = {X1, X2, . . . , Xn}, theirassociated domains D, and a collection of m constraints C. D refers to a functionthat maps each variable Xi ∈ X to the respective domain D(Xi). A variableXi can be assigned to integer values (i.e., D(Xi) ⊆ Z), a set of discrete values135

(i.e. D(Xi) ⊆ P(Z)) or real values (i.g. D(Xi) ⊂ R). Finally, C correspondsto a set of constraints {C1, C2, . . . , Cm} that restrain the possible values variablescan be assigned to. So, let (v1, v2, . . . , v

jn) be a tuple of possible values for subset

Xj = {Xj1 , X

j2 , . . . , X

jnj} ⊆ X. A constraint Cj is defined as a relation on set Xj

such that (v1, v2, . . . , vjn) ∈ Cj

⋂(D(Xj

1) x D(Xj2) x . . . x D(Xj

nj )).140

2.2.2. Solving a CSP with CPIn CP, the user provides a CSP and a CP solver takes care of solving it. Solving

a CSP 〈X,D,C〉 is about finding a tuple of possible values (v1, v2, . . . , vn) for each

4

Page 6: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

variable Xi ∈ X such that ∀i ∈ [[1..n]], vi ∈ D(Xi) and all the constraints Cj ∈ Care met. In the case of a Constraint Optimization Problem (COP), that is, when145

a optimization criterion have to be maximized or minimized, a solution is the onethat maximizes or minimizes a given objective function f : D(X) 7→ Z.

A constraint model can be achieved in a modular and composable way. Eachconstraint expresses a specific sub-problem, from arithmetical expressions to morecomplex relations such as AllDifferent [17] or Regular [18]. A constraint not only150

defines a semantic (AllDifferent: variables should take distinct value in a solution,Regular : an assignment should respect a pattern given by an automaton) but alsoembeds a filtering algorithm which detects values that cannot be extended to asolution. Modeling a CSP consists hence in combining constraints together, whichoffers both flexibility (the model can be easily adapted to needs) and expressiveness155

(the model is almost human readable). Solving a CSP consists in an alternation ofa propagation algorithm (each constraint removes forbidden values, if any) and aDepth First Search algorithm with backtrack to explore the search space.

Overall, the advantages of adopting CP as decision-making modeling and solvingtool is manifold: no particular knowledge is required to describe the problem, adding160

or removing variables/constraints is easy (and thus useful when code is generated),the general purpose solver can be tweaked easily.

2.3. Model-driven Engineering

Model Driven Engineering (MDE) [19, 20], more generally also referred to asModeling, is a software engineering paradigm relying on the intensive creation, ma-165

nipulation and (re)use of various and varied types of models. In a MDE/Modelingapproach, these models are actually the first-class artifacts within related design,development, maintenance and/or evolution processes concerning software as wellas their environments and data. The main underlying idea is to reason as muchas possible at a higher level of abstraction than the one usually considered in more170

traditional approaches, e.g. which are often source code-based. Thus the focus isstrongly put in modeling, or allowing the modeling of, the knowledge around thetargeted domain or range of problems. One of the principal objectives is to capita-lize on this knowledge/expertise in order to better automate and make more efficientthe targeted processes.175

Since several years already, there is a rich international ecosystem on approa-ches, practices, solutions and concrete use cases in/for Modeling [21]. Among themost frequent applications in the industry, we can mention the (semi-)automateddevelopment of software (notably via code generation techniques), the support forsystem and language interoperability (e.g. via metamodeling and model transforma-180

tion techniques) or the reverse engineering of existing software solutions (via modeldiscovery and understanding techniques). Complementarily, another usage that hasincreased considerably in the past years, both in the academic and industrial world,is the support for developing Domain-Specific Languages (DSLs) [22]. Finally, thegrowing deployment of so-called Cyber-Physical Systems (CPSs), that are becoming185

more and more complex in different industry sectors (thanks to the advent of Cloudand IoT for example), has been creating new requirements in terms of Modeling [23].

3. Approach Overview

This section provides an overview of the CoMe4ACloud approach. First, wedescribe the global architecture of our approach, before presenting the generic me-190

5

Page 7: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

tamodels that can be used by the users (Cloud Experts and Administrators) tomodel Cloud systems. Finally, to help Cloud users to deal with cross-layers andSLA, we provide a service-oriented modeling extension for XaaS layers.

3.1. Architecture

The proposed architecture is depicted in Figure 1 and is based on two main kinds195

of models, which conform to two different but complementary metamodels. On onehand, the topology metamodel is dedicated to the specification of the different to-pologies (i.e., types) of Cloud systems. It is generic because this can be realized in asimilar way for systems concerning any of the possible Cloud layers (e.g., IaaS, PaaSor SaaS). On the other hand, the configuration metamodel is intended to the repre-200

sentation of actual configurations (i.e., instances) of such systems at runtime. Thisis realized by referring to a corresponding (and previously specified) topology. Onceagain, this Configuration metamodel is generic because it is independent from anyparticular Cloud layer and topology/type of Cloud system. These two metamodelsare the core elements of the XaaS modeling language we proposed in CoMe4ACloud205

(cf. next Section 3.2).

parameterize

refers to

CloudExpert

CloudAdministrator

Runtime

Design time

Executor

Cloud System s

Analyzer Planner

MonitorKnowledge(models)

sensors actuators

ConfigurationModel c

represents

ConfigurationMetamodel

TopologyMetamodel

conforms to

TopologyModel t

conforms to

1

2 3

4 7

5 6

Figure 1: Overview of the model-based architecture in CoMe4ACloud.

In order to better grasp the fundamental concepts of our approach, it is impor-tant to reason about the architecture in terms of life-cycle. The life-cycle associatedwith this architecture involves both kinds of models. A topology model t has to bedefined manually by a Cloud expert at design time (step 1). The objective is to210

specify a particular topology of system to be modeled and then handled at runtime,e.g., a given type of IaaS (with Virtual/Physical Machines nodes) or SaaS (withwebserver and database components). The topology model is used as the input of

6

Page 8: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

a specific code generator that parameterizes a generic constraint program that isintegrated into the Analyzer (step 2) of the generic AM.215

The goal of the constraint program is to automatically compute and propose anew suitable system configuration model from an original one. Hence, in the be-ginning, the Cloud Administrator must provide an initial configuration model c0(step 3), which, along with the fore-coming configurations, is stored in the AM’sKnowledge base. The state of the system at a given point in time is gathered by220

the Monitor (step 4) and represented as a (potentially new) configuration modelc0′. It is important to notice that this new configuration model c0′ reflects therunning Cloud system s current state (e.g., a host that went down or a load va-riation), but it does not necessarily respect the constraints defined by the CloudExpert/Administrator, e.g., if a PM crashed, all the VMs hosted by it should be225

reassigned, otherwise the system will be in a inconsistent state. To that effect, theAnalyzer is launched (step 5) whenever a new configuration model exists, whether itresults from modifications that are manually performed by the Cloud Administratoror automatically performed by the Monitor. It takes into account the current (new)configuration model c0′ and related set of constraints encoded in the CP itself. As a230

result, a new configuration model c1 respecting those constraints is produced. ThePlanner produces a set of ordered actions (step 6) that have to be applied in orderto go from the source (c0′) to the target (c1) configuration model. More details onthe decision-making process, including the constraint program is given in Section 4.

Finally, the Executor (step 7) relies on actuators deployed on the real Cloud235

System to apply those actions. This whole process (from steps 4 to 7) can be re-executed as many times as required, according to the runtime conditions and theconstraints imposed by the Cloud Expert/Administrator.

It is important to notice that configuration models are meant to be represen-tations of actual Cloud systems at given points in time. This can be seen with240

configuration model c (stored within the Knowledge base) and Cloud system s inFigure 1, for instance. Thus, the content of these models has to always reflect thecurrent state of the corresponding running Cloud systems. More details on how weensure the required synchronization between the model(s) and the actual systemare given in Section 5.2.245

3.2. Generic Topology and Configuration Modeling

One of the key features of the CoMe4ACloud approach is the high-level languageand tooling support, whose objective is to facilitate the description of autonomicCloud systems with adaptation capabilities. For that purpose, we strongly rely onan MDE approach that is based on two generic metamodels.250

As shown in Figure 2, the Topology metamodel covers 1) the general descriptionof the structure of a given topology and 2) the constraint expressions that can beattached to the specified types of nodes and relationships. Starting by the structuralaspects, each Cloud system’s Topology is named and composed of a set of NodeTypesand corresponding RelationshipTypes that specify how to interconnect them. It can255

also have some global constraints attached to it.Each NodeType has a name, a set of AttributeTypes and can inherit from another

NodeType. It can also have one or several specific Constraints attached to it. Cloudexperts can declare the impact (or “cost”) of enabling/disabling nodes at runtime(e.g., a given type of Physical Machine/PM node takes a certain time to be switched260

on/off).

7

Page 9: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

name : String

Topology

name : StringimpactOfEnabling : IntegerimpactOfDisabling : Integer

NodeType

name : Stringtype : StringimpactOfUpdating : Integer

AttributeType

1

nodeTypes1..*

operator : ComparisonOperator

Constraint

Expression

0..1

expressions

2

value : Integer

IntegerValueExp

AttributeExp

operator : AlgebraicOperator

BinaryExp

operator : AggregationOperatordirection : DirectionKind

AggregationExp

direction : DirectionKind

NbConnectionExp

name : StringimpactOfLinking : IntegerimpactOfUnlinking : Integer

RelationshipType

1

rela

tio

nsh

ipTy

pes

0..*

1

attributeTypes0..*

target

1

0..*

source 1

0..*

0..1

expressions

20

..*

attributeType

1

0..

*

0..

*

inheritedType

0..1

ConstantAttributeType

SumProdMinusDiv

«enumeration»AlgebraicOperator

SumMinMax

«enumeration»AggregationOperator

Predecessor/SourceSuccessor/Target

«enumeration»DirectionKind

CalculatedAttributeType

0..1

con

stra

ints

0..*

0..1

expression1

LesserGreaterEqualLesserOrEqualGreaterOrEqual

«enumeration»ComparisonOperator

0..1

con

stra

ints

0..*

0..

1

constraints

0..*

expression : String

CustomExp

Figure 2: Overview of the Topology metamodel - Design time.

Each AttributeType has a name and value type. It allows indicating the impactof updating related attribute values at runtime. A ConstantAttributeType stores aconstant value at runtime, a CalculatedAttributeType allows setting an Expressionautomatically computing its value at runtime.265

Any given RelationshipType has a name and defines a source and target Node-Type. It also allows specifying the impact of linking/unlinking corresponding nodesvia relationships at runtime (e.g., migrating a given type of Virtual Machine/VMnode from a type of PM node to another one can take several minutes). One orseveral specific Constraints can be attached to a RelationshipType.270

A Constraint relates two Expressions according to a predefined set of comparisonoperators. An Expression can be a single static IntegerValueExpression or an Attri-buteExpression pointing to an AttributeType. It can be a NbConnectionExpressionrepresenting the number of NodeTypes connected to a given NodeType or Relati-onshipType (at runtime) as predecessor/successor or source/target respectively. It275

can also be a AggregationExpression aggregating the values of a AttributeType fromthe predecessors/successors of a given NodeType, according to a predefined set ofaggregation operators. It can be a BinaryExpression between two (sub)Expressions,

8

Page 10: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

according to a predefined set of algebraic operators. Finally, it can be a CustomEx-pression using any available constraint/query language (e.g., OCL, XPath, etc.),280

the full expression simply stored as a string. Tools exploiting corresponding modelsare then in charge of processing such expressions.

As shown in Figure 3, the Configuration part of the language is lighter anddirectly refers to the Topology one. An actually running Cloud system Configurationis composed of a set of Nodes and Relationships between them.285

identifier : String

Configuration

identifier : Stringactivated : Boolean

Node

1

nodes1..*

name : Stringvalue : String

Attribute

identifier : Stringconstant : Boolean

Relationship

1

attributes0..*

1

relationships0..*

name : String

NodeType

type 1

0..*

name : String

Topology

topology 1 0..*

source 1 0..*

target1 0..*

name : String

RelationshipTypetype

1

0..*

Figure 3: Overview of the Configuration metamodel - Runtime.

Each Node has an identifier and is of a given NodeType, as specified by thecorresponding topology. It also comes with a boolean value indicating whether itis actually activated or not in the configuration. This activation can be reflecteddifferently in the real system according to the concerned type of node (e.g., a givenVirtual Machine (VM) is already launched or not). A node contains a set of Atti-290

ributes providing name/value pairs, still following the specifications of the relatedtopology.

Each Relationship also has an identifier and is of a given RelationshipType, asspecified again by the corresponding topology. It simply interconnects two allowedNodes together and indicates if the relationship can be possibly changed (i.e., re-295

moved) over time, i.e., if it is constant or not.

3.3. Service-oriented Topology Model for XaaS layers

The Topology and Configuration metamodels presented in the previous sectionprovide a generic language to model XaaS systems. Thanks to that, we can modelany kind of XaaS system that can be expressed by a Direct Acyclic Graph (DAG)300

with constraints having to hold at runtime. However, this level of abstraction canalso be seen as an obstacle for some Cloud Experts and Administrators to modelelements really specific to Cloud Computing. Thus, in addition to the generic mo-deling language presented before, we also provide in CoMe4ACloud an initial set ofreusable node types which are related to the core Cloud concepts. They constitute305

a base Service-oriented topology model which basically represents XaaS systems interms of their consumers (i.e., the clients that consumes the offered services), theirproviders (i.e., the required resources, also offered as services) and the Service LevelAgreements (SLA) formalizing those relationships. Figure 4 shows an illustrativegraphical representation of an example configuration model using the pre-defined310

node types.

9

Page 11: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

SLAClienttotalCost

SLAClienttotalCost

RootClientSysRev

ServiceClient

ServiceClient

ServiceClient

ServiceClient

SLAProvider

totalCost

RootProvider

SysExp

SLAProvider

totalCost

ServiceProvider

ServiceProvider

ServiceProvider

SLAProvider

totalCost

InternalComponent

InternalComponent

InternalComponent

InternalComponent

InternalComponent

InternalComponent

InternalComponent

InternalComponent

InternalComponent

Pro

vid

ed s

erv

ices

S

ervi

ces

bo

ug

ht

fro

m o

ther

Xaa

SIn

tern

al c

om

po

nen

ts

Legend:

SLAClienttotalCost

Node

Nodetype

Attributetype

RelationShip

Figure 4: Example of configuration model using base Service-oriented node types (illustrativerepresentation).

Root node types. We introduce two types of root nodes: RootProvider andRootClient. In any configuration model, it can only exist one node of each root nodetype. These two nodes do not represent a real component of the system but they canbe rather seen as theoretical nodes. A RootProvider node (resp. RootClient node)315

has no target node (resp. source node) and is considered as the final target (resp.initial source). In other words, a RootProvider node (resp. RootClient node) noderepresents the set of all the providers (resp. the consumers) of the managed system.This allows grouping all features of both provider and consumer layers, especiallythe costs due to operational expenses of services bought from all the providers (re-320

presented by attribute SysExp in a RootProvider node) and revenues thanks toservices sold to all the consumers (represented by attribute SysRev in a RootClientnode).

10

Page 12: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

SLA node types. We also introduce two types of SLA nodes: SLAClient andSLAProvider. In a configuration model, SLA nodes define the prices of each service325

level that can be provided and the amount of penalties for violations. Thus, bothtypes of SLA nodes provide different attributes representing the different prices,penalties and then the current cost or revenue (total cost) induced by current set ofbought services (cf. the Service node types below) associated with it. A SLAClientnode (resp. SLAProvider node) has a unique source (resp. target) which is the330

RootClient node (resp. RootProvider node) in the configuration. Consequently,an attribute SysRev (resp. SysExp) is equal to the sum of all attribute total costof its sources node (resp. target nodes).

Service node types. A SLA defines several Service Level Objectives (SLO) foreach provided service [24]. Thus, we have to provide base Service node types: each335

service provided to a client (resp. received from a provider) is represented by a nodeof type ServiceClient (resp. ServiceProvider). The different SLOs are attributesof the corresponding Service nodes (e.g., configuration requirements, availability,response time, etc.). Since each Service node is linked with a unique SLA node ina configuration model, we define an attribute that designate the SLA node relating340

to a given service node. For a ServiceClient node (resp. ServiceProvider node),this attribute is named sla client (resp. sla prov) and its value is a node ID whichmeans that the node has a unique source (resp. target) corresponding to the SLA.

Internal Component node type. InternalComponent represents any kind ofnode of the XaaS layer that we want to manage with the Generic AM (contrary345

to the previous node types which are theoretical nodes and provided as core Cloudconcepts). Thus, it is kind of a common super-type of node to be extended by usersof the CoMe4ACloud approach within their own topologies (e.g., cf. Listing 1 fromSection 5.1). A node of this type may be used by another InternalComponent nodeor by a ServiceClient node. Conversely, it may require another InternalComponent350

node or a ServiceProvider node to work.

4. Decision Making Model

In this section, we describe how we formally modeled the decision making part(i.e., the Analyzer and Planner) of our generic Autonomic Manager (AM) by relyingon Constraint Programming (CP).355

4.1. Knowledge (Configuration Models)

As previously mentioned, the Knowledge contains models of the current and pastconfigurations of the Cloud system (i.e., managed element). We define formal nota-tions for a configuration at a given instant according to the XaaS model describedin Figure 3.360

4.1.1. The notion of time and configuration consistencyWe first define T , the set of instants t representing the execution time of the

system where t0 is the instant of the first configuration (e.g., the very first configu-ration model initialized by the Cloud Administrator, cf. Figure 1).

The XaaS configuration model at instant t is denoted by ct , organized in a365

Directed Acyclic Graph (DAG), where vertices correspond to nodes and edges torelationships of the configuration metamodel (cf. Figure 3). CSTRct denotes the

11

Page 13: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

set of constraints of configuration ct . Notice that these constraints refer to thosedefined in the topology model (cf. Figure 2).

The property satisfy(cstr, t) is verified at t if and only if the constraint cstr ∈370

CSTRct is met at instant t. The system is satisfied (satisfy(ct)) at instant t, if andonly if ∀cstr ∈ CSTRct , satisfy(cstr, t). Finally, function H(ct) gives the score ofthe configuration c at instant t : the higher the value, the better the configuration(e.g., in terms of balance between costs and revenues).

4.1.2. Nodes and Attributes375

Let nt be a node at instant t. As defined in Section 3.2 it is characterized by:

� a node identifier (idn ∈ IDt), where IDt is the set of existing node identifiersat t and idn is unique ∀t ∈ T ;

� a type (typen ∈ TY PES)

� a set of predecessors (predsnt ∈ P(IDt)) and successors (succsnt ∈ P(IDt))

nodes in the DAG. Note that ∀nta, n

tb ∈ct , idnt

b6= idnt

a

∃idntb∈ succsnt

a⇔ ∃idnt

a∈ predsnt

b

. It is worth noting that the notion of predecessors and successors here is380

implicit in the notion of Relationship of the configuration metamodel.

� a set of constraints CSTRnt specific to the type (cf. Figure 2).

� a set of attributes (attsnt ) defining the node’s internal state.

An attribute attt ∈ attsnt at instant t is defined by:

� name nameatt , which is constant ∀t ∈ T ,385

� a value denoted valattt ∈ R∪IDt (i.e., an attribute value is either a real valueor a node identifier)

4.1.3. Configuration EvolutionThe Knowledge within the AM evolves as configuration models are modified

over the time. In order to model the transition between configuration models, the390

time T is discretized by the application of a transition function f on ct such thatf(ct) = ct+1. A configuration model transition can be triggered in two ways by:

� an internal event (e.g., the Cloud Administrator initializes (add) a softwarecomponent/node, a PM crashes) or an external event (e.g., a new client ar-rival), which in both cases alters the system configuration and thus results395

in a new configuration model (cf. function event in Figure 5). This functionmodels typically the Monitor component of the AM.

� the AM that performs the function control. This function ensures that satisfy(ct+1)is verified, while maximizing H(ct+1) 2 and minimizing the transition cost to

2Since the research of optimal configuration (a configuration where the function H() has themaximum possible value) may be too costly in terms of execution time, it is possible to assumethat the execution time of the control function is limited by a bound set by the administrator.

12

Page 14: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

change the system state between ct and ct+1. This function characterizes the400

execution of the Analyzer, Planner and Executor components of the AM.

Figure 5 illustrates a transition graph among several configurations. It showsthat an event function potentially moves away the current configuration from anoptimal configuration and that a control function tries to get closer an new optimalconfiguration while respecting all the system constraints.405

Optimal Configurations

Satisfiable Configurations

Conf0Conf1

Conf2 Conf3

Conf4

Conf5

Conf6

Contol loop

Event

Controlloop

Control

loop

Event

Event

Set of all possible system configurations

Figure 5: Examples of configuration transition in the set of configurations.

4.2. Analyzer (Constraint Model)

In the AM, the Analyzer component is achieved by a constraint solver. A Con-straint Programming Model [8] needs three elements to find a solution: a staticset of problem variables, a domain function, which associates to each variable itsdomain, and a set of constraints. In our model, the graph corresponding to the410

configuration model can be considered as a composite variable defined in a domain.For the constraint solver, the decision to add a new node in the configuration isimpossible as it implies the adding of new variables to the constraint model duringthe evaluation. We have hence to define a set N t corresponding to an upper boundof the node set ct , i.e., ct ⊆ N t . More precisely, N t is the set of all existing nodes415

at instant t. Every node nt /∈ ct is considered as deactivated and does not take partin the running system at instant t.

Each existing node has consequently a boolean attribute called ”activated” (cf.Node attribute activated in Figure 3). Thanks to this attribute the constraint solvercan decide whether a node has to be enabled (true value) or disabled (false value).420

The property enable(nt) verifies if and only if n is activated at t. This propertyhas an incidence over the two neighbor sets predsnt and succsnt . Indeed, when

enable(nt) is false nt has no neighbor because n does not depend on other node andno node may depend on n. The set N t can only be changed by the Administratoror by the Monitor when it detects for instance a node failure or a new node in the425

running system (managed element), meaning that a node will be removed or addedin N t+1.

13

Page 15: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Figure 6 depicts an example of two configuration transitions. At instant t, thereis a node set N t = {n1, n2, . . . , n8} and ct = {n1, n2, n5, n6, n7}. Each node colorrepresents a given type defined in the topology (cf. Figure 3). The next configuration430

at t + 1, the Monitor component detects that component n6 of a green type hasfailed, leading the managed system to an unsatisfiable configuration. At t + 2, thecontrol function detects the need to activate a deactivated node of the same type inorder to replace n6 by n8. This scenario may match the configuration transitionsfrom conf1 to conf3 in Figure 5.435

n1

n2 n

7

n5

n6

n8

n4

n3

(a) Node set at t

n1

n2 n

7

n5

n8

n4

n3

(b) Node set at t + 1

n1

n2 n

7

n5

n8

n4

n3

(c) Node set at t + 2

Figure 6: Examples of configuration transitions.

4.2.1. Configuration ConstraintsThe Analyzer should not only find a configuration that satisfies the constraints.

It should also consider the objective function H() that is part of the configura-tion constraints. The graph representing the managed element (the running Cloudsystem) has to meet the following constraints:440

1. any deactivated node nt at t ∈ T has no neighbor: nt does not depend onother nodes and there is no node that depends on nt . Formally,

¬enable(nt)⇒(succsnt = ∅ ∧ predsnt = ∅

)2. except for root node types (cf. Section 3.3), any activated node has at least

one predecessor and one successor. Formally,

enable(nt)⇒(| succsnt |> 0 ∧ | predsnt |> 0

)3. if a node nti is enabled at instant ti, then all the constraints associated with

na (link and attribute constraints) will be met in a finite time. Formally,

enable(nti)⇒ ∃tj ≥ ti,∀cstr ∈ CSTRnti

∧cstr ∈ CSTRntj ∧ enable(ntj ) ∧ satisfy(cstr, tj)

4. the functionH() is equal to the balance between the revenues and the expensesof the system (cf. Figure 4). Formally,

H(ct) = atttrev − atttexp

whereatttrev ∈ attsnt

RC∧ atttrev = SysRev

14

Page 16: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

andatttexp ∈ attsnt

RP∧ atttexp = SysExp

4.2.2. Execution of the AnalyzerThe Analyzer needs four inputs to process the next configuration:

a) The current configuration model which may be not satisfiable (e.g., c0′ inSection 3.1);

b) The most recent satisfiable configuration model (e.g., c0 in Section 3.1);445

c) An expected lower bound of the next balance. This value depends on thereason why the AM has been triggered. For instance, we know that if thereason is a new client arrival or if a provider decreases its prices, the expectedbalance must be higher than the previous one. Conversely, if there is a clientdeparture, we can estimate that the lower bound of the next balance will be450

smaller (in this case, the old balance minus the revenue brought by the client);This forces the solver to find a solution with a balance greater than or equalto this input value.

d) A boolean that indicates whether to use the Neighborhood Search Strategy,which is explained bellow.455

1 Analyse (CurrentConf , SatisfiableConf , MinBalance, withNeighborhood)Result: a satisfiable Configuration

2 begin3 solver ← buildConstraintModelFrom(CurrentConf);4 initializer ← buildInitializer(SatisfiableConf , MinBalance, withNeighgorhood);5 while not found solution and not error do6 solver.reset();7 initializer.nextVariableInitialization();8 if variables correctly initialized then9 solver.findsolution();

10 if a solution s found then11 return s;

12 else13 error(”impossible to initialize variables”)

Algorithm 1: Global algorithm of the Analyzer

Algorithm 1 is the global algorithm of the Analyzer which mimics Large Neig-hborhood Search [25]. This strategy consists in two-step loop (lines 5 to 13) whichis executed after the constraint model is instantiated in the solver (line 3) from thecurrent configuration (i.e., variables and constraints are declared).460

First, in line 7, some variables of the solver model are selected to be fixed totheir value in the previous satisfiable configuration ( in our case, the b) parameter).This reduces the number of values of some variables Xs which can be assigned to,D0(Xj) ⊂ D(Xj),∀Xj ∈ Xs.

Variables not selected to be fixed represent a variable area (V A) in the DAG.465

It corresponds to the set of nodes in the graph the solver is able to change theirsuccessors and predecessors links. Such a restriction makes the search space toexplore by the solver smaller which tends to reduce solving time. The way variablesare selected is managed by the initializer (line 4). Note that when the initializer isbuilt, the initial variable area, V Ai where i = 0, contains all the deactivated nodes470

and any nodes whose state has changed since the last optimal configuration (ex :attribute value modification, disappearance/appearance of a neighbour).

15

Page 17: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Then, the solver tries to find a solution for the partially restricted configuration(line 9). If a solution is found, the loop breaks and the new configuration is returned(line 11). Otherwise, the variable area is extended (line 7). A call for a new initi-475

alization at iteration i means that the solver has proved that there is no solutionin iteration i − 1. Consequently, a new initialization leads to relax the previousV Ai−1, Di−1(Xj) ⊆ Di(Xj) ⊆ D(Xj),∀Xj ∈ Xs. At iteration i, V Ai is equal toV Ai−1 plus the sets of successors and predecessors of all nodes in V Ai−1. Finally,if no solution is found and the initializer is not able to relax domains anymore,480

Di(Xj) = D(Xj),∀Xj ∈ Xs, the Analyzer throws an error.This mechanism brings three advantages: (1) it reduces the solving time because

domains cardinality is restrained, (2) it limits the set of actions in the plan, achievingthus one of our objective and (3) it tends to produce configurations that are closeto the previous one in term of activated nodes and links.485

Note that without the Neighborhood Search Strategy, the initial variable areaVA0 is equal to the whole graph leading thus to a single iteration.

4.2.3. Planner (Differencing and Match)The Planner relies on differencing and match algorithms for object-oriented

models [26] to compute the differences between the current configuration and the490

new configuration produced by the Analyze. From a generic point of view it existsfive types of action : enable and disable node; link and unlink two nodes; and updateattribute value.

5. Implementation Details

In this section, we provide some implementation details regarding the modeling495

languages and tooling support used by the users to specify Cloud systems as wellas the mechanisms of synchronization between the models within the AutonomicManager and the actual running Cloud systems.

5.1. A YAML-like Concrete Syntax

We propose a notation to allow Cloud experts quickly specifying their topo-500

logies and initializing related configurations. It also permits sharing such modelsin a simple syntax to be directly read and understood by Cloud administrators.We first built an XML dialect and prototyped an initial version. But we observedthat it was too verbose and complex, especially for newcomers. We also thoughtabout providing a graphical syntax via simple diagrams. While this seems appro-505

priate for visualizing configurations, this makes more time-consuming the topologycreation/edition (writing is usually faster than diagramming for Cloud technicalexperts). Finally, we designed a lightweight textual syntax covering both topologyand configuration specifications.

To provide a syntax that looks familiar to Cloud users, we considered YAML and510

its TOSCA version [27] featuring most of the structural constructs we needed (fortopologies and configurations). We decided to start from this syntax and comple-ment it with the elements specific to our language, notably concerning expressionsand constraints as not supported in YAML (cf. Section 3.2). We also ignored someconstructs from TOSCA YAML that are not required in our language (e.g., related515

to interfaces, requirements or capabilities). Moreover, we can still rely on otherexisting notations. For instance, by translating a configuration definition from our

16

Page 18: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

language to TOSCA, users can benefit from the GUI offered by external toolingsuch as Eclipse Winery [28].

As shown on Listing 1, for each node type the user gives its name and the node520

type it inherits from (if any) (cf. Section 3.3). Then she describes its differentattribute types via the properties field, following the TOSCA YAML terminology.Similarly, for each relationship type the expert gives its name and then indicates itssource and target node types.

As explained before (and not supported in TOSCA YAML), expressions can be525

used to indicate how to compute the initial value of an attribute type. For instance,the variable ClusterCurConsumption of the Cluster node type is initialized at confi-guration level by making a product between the value of other variables. Expressionscan also be used to attach constraints to a given node/relationship type. For exam-ple, in the node type Power, the value of the variable PowerCurConsumption has530

to be lesser or equal to the value of the constant PowerCapacity (at configurationlevel).

As shown on Listing 2, for each configuration the user provides a unique iden-tifier and indicates which topology it is relying on. Then, for each actual node/-relationship, its particular type is explicitly specified by directly referring to the535

corresponding node/relationship type from a defined topology. Each node descri-bes the values of its different attributes (calculated or set manually), while eachrelationship describes its source and target nodes.

5.2. Synchronization with the running system

We follow the principles of Models@Runtime [29], by defining a bidirectional540

causal link between the running system and the model. The idea is to decouple thespecificities of the causal link, w.r.t. the specific running subsystems, while keepingthe Autonomic Manager generic, as sketched in Figure 7. It is important to recallthat the configuration model is a representation of the running system and it can bemodified in three different situations: (i) when the Cloud administrator manually545

changes the model; (ii) when it is the time to update the current configuration withdata coming from the running system, which is done by the Monitor component;and (iii) when the Analyzer decides for a better configuration (e.g., with higherbalance function), in which case the Executor performs the necessary actions onthe running Cloud systems. Therefore, the causal link with the running system is550

defined by two different APIs, which allows to reflect both the changes performedby the generic AM to the actual Cloud systems, and the changes that occur on thesystem at runtime to the generic AM. To that effect, we propose the implementationof an adaptor for each target running system (managed element).

From the Executor component perspective, the objective is to translate generic555

actions, i.e., enable/disable, link/unlink nodes, update attribute values, into con-crete operations (e.g., deploy VM at a given PM) to be invoked over actuators ofthe different running subsystems (e.g., Openstack, AWS, Moodle, etc.). From theMonitor point of view, the adaptors’ role is to gather information from sensors de-ployed at the running subsystems (e.g., a PM failure, a workload variation) and560

translate it into the generic operations to be performed on the configuration mo-del by the Monitor, i.e., add/remove/enable/disable node, link/unlike nodes andupdate attribute value.

It should be noticed that the difference between the two APIs is the possibilityto add and remove nodes to the configuration model. In fact, the resulting configu-565

ration from the Analyzer does not imply the addition or removal of any node, since

17

Page 19: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Listing 1: Topology excerpt.

1 Topology: IaaS2

3 node_types:4

5 InternalComponent:6 ...7

8 PM:9 derived_from: InternalComponent10 properties:11 impactOfEnabling: 4012 impactOfDisabling: 3013 ...14

15 VM:16 derived_from: InternalComponent17 properties:18 ...19

20 Cluster:21 derived_from: InternalComponent22 properties:23 constant ClusterConsOneCPU:24 type: integer25 constant ClusterConsOneRAM:26 type: integer27 constant ClusterConsMinOnePM:28 type: integer29 variable ClusterNbCPUActive:30 type: integer31 equal: Sum(Pred , PM.PmNbCPUAllocated

)32 variable ClusterCurConsumption:33 type: integer34 equal: ClusterConsMinOnePM * NbLink(

Pred) + ClusterNbCPUActive *ClusterConsOneCPU +ClusterConsOneRAM * Sum(Pred ,PM.PmSizeRAMAllocated)

35

36 Power:37 derived_from: ServiceProvider38 properties:39 constant PowerCapacity:40 type: integer41 variable PowerCurConsumption:42 type: integer43 equal: Sum(Pred , Cluster.

ClusterCurConsumption)44 constraints:45 PowerCurConsumption less_or_equal:

PowerCapacity46

47 ...48

49 relationship_types:50

51 VM_To_PM:52 valid_source_types: VM53 valid_target_types: PM54

55 PM_To_Cluster:56 valid_source_types: PM57 valid_target_types: Cluster58

59 Cluster_To_Power:60 valid_source_types: Cluster61 valid_target_types: Power62

63 ...

Listing 2: Configuration excerpt.

1 Configuration:2 identifier: IaaSSystem_03 topology: IaaS4

5 Node Power0:6 type: IaaS.Power7 activated: ’true’8 properties:9 PowerCapacity: 150010 PowerCurConsumption: 011

12 ...13

14 Node Cluster0:15 type: IaaS.Cluster16 activated: ’true’17 properties:18 ClusterCurConsumption:

019 ClusterNbCPUActive: 020 ClusterConsOneCPU: 121 ClusterConsOneRAM: 022 ClusterConsMinOnePM: 523

24 ...25

26 Node PM0:27 type: IaaS.PM28 activated: ’true’29 properties:30 ...31

32 ...33

34 Relationship PM_To_Cluster0:35 type: IaaS.PM_To_Cluster36 constant: true37 source: PM038 target: Cluster039

40 ...

18

Page 20: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

the constraint solver may not add/remove variables during the decision-making pro-cess, as already explained in Section 4.2. The Cloud Administrator and the Monitor,on the contrary may modify the configuration model (that is given as input to theconstraint solver) by removing and adding nodes as a reflect of both the running570

Cloud system (e.g., a PM that crashed) or new business requirements or agreements(e.g., a client that arrives or leaves). Notice also that both adaptors and the Moni-tor component are the entry-points of the running subsystems and the generic AM,respectively. Thus, the adaptors and the Monitor are the entities that actually haveto implement the APIs.575

Executor

Adaptor 1 ...

Analyzer Planner

MonitorConfiguration

Model c

Adaptor 2 Adaptor 3

Modify

sensorsactuators

Generic Autonomic Manager

- add / remove node- enable / disable node - link / unlink nodes- update attribute value

- enable / disable node - link / unlink nodes- update attribute value

Common APIenable/disable, link/unlink nodes,

update value

ActuatorAPI

Monitor APIadd/remove node

Figure 7: Synchronization with real running system.

We rely on a number of libraries (e.g., AWS Java SDK 3, Openstack4j 4) thatease the implementation of adaptors. For example, Listing 3 shows an excerpt ofthe implementation of the enable action for a VM node in Openstack4j. For thefull implementation and for more examples, please see https://gitlab.inria.fr/come4acloud/xaas.580

Listing 3: Excerpt of an IaaS adaptor with OpenStack.

1 OSClientV3 os = OSFactory.builderV3 ()2 .endpoint (...)3 .credentials (...)4 .authenticate ();5 ...6

7 ServerCreate vm = Builders.server ()8 .name(nodeID)9 .flavor(flavorID)10 .image(imageID)11 .availabilityZone(targetCluster + ":" + targetPM)12 .build();13

14 Server vm = os.compute ().servers ().boot(vm);

3https://aws.amazon.com/fr/sdk-for-java/4http://www.openstack4j.com

19

Page 21: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

6. Performance Evaluation

In this section, we present an experimental study of our generic AM implemen-tation that has been applied to an IaaS system. The main objective is to analyzequalitatively the impact of the AM behaviour on the system configuration when a585

given series of events occurs, and notably the time required by the constraint solverto take decisions. Note that the presented simulation focuses on the performanceof the controller. Additionnally, we also experimented with the same scenario on asmaller system but in a real OpenStack IaaS infrastructure5. In a complementarymanner, a more detailed study of the proposed model-based architecture (and no-590

tably its core generic XaaS modeling language) can be found in [30] where we showthe implementation of another use case, this time for a SaaS application6.

6.1. The IaaS system

We relied on the same IaaS system whose models are presented in Listings 1 and2 to evaluate our approach. In the following, we provide more details. For sake of595

simplicity, we consider that the IaaS provides a unique service to their customers:compute resource in the form of VMs. Hence, there exists a node type VMServiceextending the ServiceClient type (cf. Section 3.3). A customer can specify therequired number of CPUs and RAM as attributes of VMService node. The pricesfor a unit of CPU/RAM are defined inside the SLA component, that is, inside600

the SLAVM node type, which extends the SLAClient type of the service-orientedtopology model. Internally, the system has several InternalComponents: VMs(represented by the node type VM) are hosted on PMs (represented by the node typePM), which are themselves grouped into Clusters (represented by the node typeCluster). Each enabled VM has exactly a successor node of type PM and exactly605

a unique predecessor of type VMService. This is represented by a relationship typestating that the predecessors of a PM are the VMs currently hosted by it. The mainconstraint of a VM node is to have the number of CPUs/RAM equal to attributespecified in its predecessor VMService node. The main constraint for a PM is tokeep the sum of allocated resources with VM less or equal than its capacity. A PM610

has a mandatory link to its Cluster, which is also represented by a relationship inthe configuration model. A Cluster needs electrical power in order to operate andhas an attribute representing the current power consumption of all hosted PMs.The PowerService type extends the ServiceProvider type of the service-orientedtopology model, and it corresponds to an electricity meter. A PowerService node615

has an attribute that represents the maximum capacity in terms of kilowatt-hour,which bounds the sum of the current consumption of all Cluster nodes linked tothis node (PowerService). Finally, the SLAPower type extends the SLAProvidertype and represents a signed SLA with an energy provider by defining the price ofa kilowatt-hour.620

6.2. Experimental Testbed

We implemented the Analyzer component of the AM by using the Java-basedconstraint solver Choco [12]. For scalability purposes the experimentation simulatesthe interaction with the real world, i.e., the role of the components Monitor and

5CoMe4ACloud Openstack Demo:http://hyperurl.co/come4acloud_runtime6CoMe4ACloud Moodle Demo: http://hyperurl.co/come4acloud

20

Page 22: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Executor depicted in Figure 1, although we have experimented the same scenario625

with a smaller system (fewer PMs and VMs) in a real OpenStack infrastructure7).The simulation has been conducted on a single processor machine with an Intel Corei5-6200U CPU (2.30GHz) and 6GB of RAM Memory running Linux 4.4.

The system is modeled following the topology defined in Listing 1, i.e., computeservices are offered to clients by means of Virtual Machines (VM) instances. VMs630

are hosted by PM, which in turn are grouped by Clusters of machines. As Clustersrequire electricity in order to operate, they can be linked to different power providers,if necessary (cf. section 6.1). The snapshot of the running IaaS configuration model(the initial as well as the ones associated to each instant t ∈ T ) is described andstored with our configuration DSL (cf. Listing 2). At each simulated event, the file635

is modified to apply the consequences of the event over the configuration. After eachmodification due to an event, we activated the AM to propagate the modificationon the whole system and to ensure that the configuration meets all the imposedconstraints.

The simulated IaaS system is composed of 3 clusters homogeneous PMs. Each640

PM has 32 processors and 64 GB of RAM memory. The system has two powerproviders: a classical power provider, that is, brown energy provider and a greenenergy provider.

The current consumption of a turned on PM is the sum of its idle power consump-tion (10 power units) when no guest VM is hosted with an additional consumption645

due to allocated resources (1 power unit per CPU and per RAM allocated). Inorder to avoid to degrade the analysis performance by considering too much physi-cal resources compared to the number of consumed virtual resources, we limit thenumber of unused PM nodes in the configuration model while ensuring a sufficientamount of available physical resources to host a potential new VM.650

In the experiments, we considered five types of event:

� AddVMService (a): a new customer arrival which requests for x VMService(x ranges from 1 to 5). The required configuration of this request (i.e., thenumber of CPUs and RAM units and the number of VMService) is chosenindependently, with a random uniform law. The number of required CPU655

ranges from 1 to 8, and the number of required RAM units ranges from 1to 16 GB. The direct consequences of such an event is the addition of oneSLAVM , x VMService nodes and x VM nodes in the configuration modelfile. The aim of the AM after this event is to enable the x new VM and tofind the best PM(s) to host them.660

� leavingClient (l): a customer decides to cancel definitively the SLA. Conse-quently, the corresponding SLAVM , VMService and VM nodes are removedfrom the configuration. After such an event the aim of the AM is potentiallyto shut down the concerned PM or to migrate other VMs to this PM in orderto minimize the revenue loss.665

� GreenAvailable (ga): the Green Power Provider decreases significantly theprice of the power unit to a value below the price of the Brown Energy Provi-der. The consequence of that event is the modification of the price attribute

7CoMe4ACloud Openstack Demo: http://hyperurl.co/come4acloud_runtime

21

Page 23: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

of the green SLAPower node. The expected behaviour of the AM is to enablethe green SLAPower node in order to consume a cheaper service.670

� GreenUnAvailable (gu): contrary to the GreenAvailable event, the GreenPower Provider resets its price to the initial value. Consequently, the BrownEnergy Provider becomes again the most interesting provider. The expectedbehaviour of the AM is to disable the green SLAPower node to the benefitof the classical power provider.675

� CrashOnePM (c): a PM crashes. The consequence on the configuration isthe suppression of the corresponding PM node in the configuration model.The goal of the AM is to potentially turn on a new PM and to migrate theVMs which were hosted by the crashed PM.

In our experiments, we consider the following scenario over the both analysis680

strategies without neigborhood and with neigborhood depicted in Section 4.2.2. Ini-tially, the configuration at t0, no VM is requested and the system is turned off. Atthe beginning, the unit price of the green power provider is twice higher than theprice of the other provider (8 against 4). The unit selling price is 50 for a CPUand 10 for a RAM unit. Our scenario consists to repeat the following sequence of685

events: 5 AddVMService, 1 leavingClient, 1 GreenAvailable, 1 CrashOnePM ,5 AddVMService, 1 leavingClient, 1 GreenUnAvailable and 1 CrashOnePM .This allows to show the behaviour of the AM for each event with different system’ssizes. We show the impact of this scenario over the following metrics:

� the amount of power consumption for each provider (Figures 8a and 8c);690

� the amount of VMService and size of the system in terms of number of nodes(Figure 8f);

� the configuration balance (function H()) (Figure 8e).

� the latency of the Choco Solver to take a decision (Figure 8g)

� the number of PMs being turned on (Figure 8d)695

� the size of generated plan, i.e., the number of required actions to produce thenext satisfiable configuration (Figure 8b)

The x-axis in Figure 8 represents the logical time of the experiment in terms ofconfiguration transition. Each colored area in this figure includes two configurationtransitions: the event immediately followed by the control action. The color differs700

according to the type of the fired event. For the sake of readability, the x-axis doesnot begin at the initiation instant but when the number of node reaches 573 andevents are tagged with the initials of the event’s name.

6.3. Analysis and Discussion

First of all, we can see that both strategies have globally the same behaviour705

whatever the received event. Indeed, in both cases a power provider is deactivatedwhen its unit price becomes higher than the second one (Figures 8a and 8c). Thisshows that the AM is capable of adapting the choice of provided service according totheir current price and thus benefit from sales promotions offered by its providers.

22

Page 24: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

Cur

rent

Cla

ssic

al C

onsu

mpt

ion

with_neighborhood without_neighborhood

(a) Classical Power consumption.

10

20

30

40

50

60

70

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

num

ber

of

acti

ons

with_neigborhood without_neigborhood

(b) Number of generated actions in planning phase

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

Cur

rent

Gre

en C

onsu

mpt

ion

with_neighborhood without_neighborhood

(c) Green Power consumption.

30

35

40

45

50

55

60

65

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

Num

ber

of P

M t

urne

d O

n

with_neighborhood without_neighborhood

(d) Turned on PMs.

55000

60000

65000

70000

75000

80000

85000

90000

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

Sys

tem

Bal

ance

(fu

ncti

on H

)

with_neighborhood without_neighborhood

(e) System balance.

560

580

600

620

640

660

680

700

720

740

760

780

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu 220

240

260

280

300

320

Syst

em’s

siz

e

num

ber o

f VM

Serv

ices

nod

es

system’s size VMServices nodes

(f) Total number of nodes (left y axis) and required VMs (right y axis)

1000

10000

100000

1e+06

a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l guc a a a a a l gac a a a a a l gu

solv

ing t

ime

(in m

s)

with_neighborhood without_neighborhood

(g) Latency of the solver to take a decision

Figure 8: Experimental results of the simulation.

23

Page 25: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

When the amount of requests for VMService increases (Figure 8f) in a regular710

basis, the system power consumption increases (Figures 8a and 8c) sufficiently slowlyso that the system balance also increases (Figure 8e). This can be explained by theability of the AM to decide to turn on a new PM in a just-in-time way, that is, theAM tries to allocate the new coming VMs on existing enabled PM. On the other wayaround, when a client leaves the system, as expected, the number of VMService715

nodes decreases but we can see that the number of PMs remains constant during thisevent, leading to a more important decrease of the system balance. Consequently,we can deduce that the AM has decided in this case to privilege the reconfigurationcost criteria at the expense of the system balance criteria. Indeed, we can notice inFigure 8b that the number of planning actions remains limited for the event l.720

However, we can observe some differences on the values between both strategies.The main difference is in the decision to turn on a PM in the events AddVMServiceand CrashOnePM . In the AddVMService event, the neighborhood strategy favorsthe start-up of new PM, contrary to the other strategy which favors the use of PMalready turned on. Consequently, the neighborhood strategy increases the power725

consumption leading to a less interesting balance. This can be explained by thefact that the neighborhood strategy avoids to modify existing nodes which limitsits capacity for actions. Indeed, this is confirmed in Figure 8b where the curve ofthe neighborhood strategy is mostly lower than the other one. However, the solvingtime is worse (Figure 8g) because the minimal required variable area (V A) to find730

a solution needs several iterations.Conversely, in the CrashOnePM event, we note that the number of PM is

mostly the same with the neighborhood strategy while the other one starts up sys-tematically a new PM. This illustrates the fact that, in case of node disappearance,the neighborhood strategy tries to use as much as possible the existing nodes by mo-735

difying it as less as possible. Without neighborhood, the controller is able to modifydirectly all variables of the model. As a result, it is more difficult to find a satisfiableconfiguration, which comes at the expense of a long solving time (Figure 8g)

Finally, in order to keep an acceptable solving time while limiting the number ofplanning actions and maximizing the balance, it is interesting to choose the strategy740

according to the event. Indeed, the neighborhood strategy is efficient to repair nodesdisappearance but the system balance may be lower in case of new client arrival.Although our AM is generic, we could observe that with the appropriate strategy, itcan take decisions in less than 10 seconds for several hundred nodes. In terms of aCSP problem, the considered system’s size corresponds to an order of magnitude of745

1 million variables and 300000 constraints. Moreover, the taken decisions increasesystemically the balance in case of favorable events (new service request from aclient, price drop from a provider, etc.) and limits its degradation in case of adverseevents (component crash, etc.) .

7. Related Work750

In order to discuss the proposed solution, we identified common characteristicswe believe important for autonomic Cloud (modeling) solutions. Table 1 comparesour approach with other existing work regarding different criteria: 1) Genericity- The solution can support all Cloud system layers (e.g., XaaS), or is specific tosome particular and well-identified layers; 2) UI/Language - It can provide a proper755

user interface and/or a modeling language intended to the different Cloud actors; 3)Interoperability - It can interoperate with other existing/external solutions, and/or

24

Page 26: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

is compatible with a Cloud standard (e.g., TOSCA); 4) Runtime support - It candeal with runtime aspects of Cloud systems, e.g., provide support for autonomicloops and/or synchronization.760

In the industrial Cloud community, there are many existing muti-cloud APIs/-libraries 8 9 and DevOps tools 10 11. APIs enable IaaS provider abstraction,therefore easing the control of many different Cloud services, and generally focus onthe IaaS client side. DevOps tools, in turn, provide scripting language and execu-tion platforms for configuration management. They rather provide support for the765

automation of the configuration, deployment and installation of Cloud systems in aprogrammatical/imperative manner.

The Cloudify12 platform overcomes some of these limitations. It relies on a vari-ant of the TOSCA standard [4] to facilitate the definition of Cloud system topologiesand configurations, as well as to automate their deployment and monitoring. In the770

same vein, Apache Brooklyn13 leverages Autonomic Computing [9] to provide sup-port for runtime management (via sensors/actuators allowing for dynamically mo-nitoring and changing the application when needed). However, both Cloudify andBrooklyn focus on the application/client layer and are not easily applicable to allXaaS layers. Moreover, while Brooklyn is very handy for particular types of adap-775

tation (e.g., imperative event-condition-action ones), it may be limited to handleadaptation within larger architectures (i.e., considering many components/servicesand more complex constraints). Our approach, instead, follows a declarative andholistic approach which is more appropriated for this kind of context.

Recently, OCCI (Open Cloud Computing Interface) has become one of the first780

standards in Cloud. The kernel of OCCI is a generic resource-oriented metamo-del [31], which lacks a rigorous and formal specification as well as the concept of(re)configuration. To tackle these issues, the authors of [32] specify the OCCI CoreModel with the Eclipse Modeling Framework (EMF)14, whereas its static semanticsis rigorously defined with the Object Constraint Language (OCL)15. An EMF-based785

OCCI model can ease the description of a XaaS, which is enriched with OCL con-straints and thus verified by a many MDE tools. The approach, however, does notcope with autonomic decisions at runtime that have to be done in order to meetthose OCL invariants.

The European project 4CaaSt proposed the Blueprint Templates abstract lan-790

guage [33] to describe Cloud services over multiple PaaS/IaaS providers. In the samedirection, the Cloud Application Modeling Language [35] studied in the ARTISTEU project [52] suggests using profiled UML to model (and later deploy) Cloud ap-plications regardless of their underlying infrastructure. Similarly, the mOSAIC EUproject proposes an open-source and Cloud vendor-agnostic platform [36]. Finally,795

StratusML [37] provides another language for Cloud applications dealing with diffe-rent layers to address the various Cloud stakeholders concerns. All these approachesfocus on how to enable the deployment of applications (SaaS or PaaS) in different

8Apache jclouds:https://jclouds.apache.org9Deltacloud:https://deltacloud.apache.org

10Puppet:https://puppet.com11Chef:https://www.chef.io/chef/12http://getcloudify.org13https://brooklyn.apache.org14https://eclipse.org/modeling/emf15http://www.omg.org/spec/OCL

25

Page 27: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Generi- UI / Interop- Runtimecity Language erability support

APIs/DevOps 3 3

Cloudify 3 3 3

Brooklyn 3 3 ∼[32] 3 3 3

[33] 3

[34] 3 ∼[35] 3 3

[36] 3 ∼[37] 3 3

[38, 39] 3 ∼[40] 3 ∼[41] ∼ 3 3

[42] 3 3

[43][44][45] 3 3 - -[46] 3 3

[47] 3 3

[48] 3 3 3

[49] 3 3 3

[50] 3 3

[51] 3

CoMe4ACloud 3 3 ∼ ∼

Table 1: Comparison of Cloud (modeling) solutions - 3 for full support, ∼ for partial support

IaaS providers. Thus they are quite layer-specific and do not provide support forautonomic adaptation.800

The MODAClouds EU project [5] introduced some support for runtime mana-gement of multiple Clouds, notably by proposing CloudML as part of the CloudModeling Framework (CloudMF) [38, 39]. As in our approach, CloudMF providesa generic provider-agnostic model that can be used to describe any Cloud provi-der as well as mechanisms for runtime management by relying on Models@Runtime805

techniques [29]. In the PaaSage EU project [6], CAMEL [40] extended CloudMLand integrated other languages such as the Scalability Rule Language (SRL) [41].However, contrary to our generic approach, in these cases the adaptation decisionsare delegated to 3rd-parties tools and tailored to specific problems/constraints [53].The framework Saloon [42] was also developed in this same project, relying on fe-810

ature models to provide support for automatic Cloud configuration and selection.Similarly, [44] proposes the use of ontologies were used to express variability inCloud systems. Finally, Mastelic et al., [45] propose a unified model intended tofacilitate the deployment and monitoring of XaaS systems. These approaches fillthe gap between application requirements and cloud providers configurations but,815

unlike our approach, they focus on the initial configuration (at deploy-time), not onthe run-time (re)configuration.

Recently, the ARCADIA EU project proposed a framework to cope with highly

26

Page 28: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

adaptable distributed applications designed as micro-services [43]. While in a veryearly stage and with a different scope than us, it may be interesting to follow this820

work in the future. Among other existing approaches, we can cite the Descartesmodeling language [46] which is based on high-level metamodels to describe resour-ces, applications, adaptation policies, etc. On top of Descartes, a generic controlloop is proposed to fulfill some requirements for quality-of-service and resource ma-nagement. Quite similarly, Popet al., [47] propose an approach to support the de-825

ployment and autonomic management at runtime on multiple IaaS. However bothapproaches are targeting only Cloud systems structured as a SaaS deployed in aIaaS, whereas our approach allows modeling Cloud systems at any layer.

In [48], the authors extend OCCI in order to support autonomic managementfor Cloud resources, describing the needed elements to make a given Cloud resource830

autonomic regardless of the service level. This extension allows autonomic pro-visioning of Cloud resources, driven by elasticity strategies based on imperativeEvent–Condition–Action rules. The adaptation policies are, however, focused onthe business applications, while our declarative approach, thanks to a constraintsolver, is capable of controlling any target XaaS system so as to keep it close to the835

a consistent and/or optimal configuration.In [49], feature models are used to define the configuration space (along with

user preferences) and game theory is considered as a decision-making tool. Thiswork focuses on features that are selected in a multi-tenant context, whereas ourapproach targets the automated computation of SLA-compliant configurations in a840

cross-layer manner.Several approaches on SLA-based resource provisioning – and based on con-

straint solvers – have been proposed. Like in our approach, the authors of [50] relyon rely on MDE techniques and constraint programming to find consistent configu-rations of VM placement in order to optimize energy consumption. But no modeling845

or high-level language support is provided. Nonetheless, the focus remains on theIaaS infrastructure, so there is no cross-layer support. In [51], the authors proposea new approach to autoscaling that utilizes a stochastic model predictive controltechnique to facilitate resource allocation and releases meeting the SLO of the ap-plication provider while minimizing their cost. They use also a convex optimization850

solver for cost functions but no detail is provided about its implementation. Besides,the approach addresses only the relationship between SaaS and IaaS layers, whilein our approach any XaaS service can be defined.

To the best of our knowledge, there is currently no work in the literature thatfeatures at the same time genericity w.r.t. the Cloud layers, interoperability with855

standards (such as TOSCA), high-level modeling language support and some au-tonomic runtime management capabilities. The proposed model-based architecturedescribed in this paper is an initial step in this direction.

8. Conclusion

The CoMe4ACloud architecture is a generic solution for the autonomous runtime860

management of heterogeneous Cloud systems. It unifies the main characteristics andobjectives of Cloud services. This model enabled us to derive a unique and genericAutonomic Manager (AM) capable of managing any Cloud service, regardless of thelayer. The generic AM is based on a constraint solver which reasons on very abstractconcepts (e.g., nodes, relations, constraints) and tries to find the best balance bet-865

ween costs and revenues while meeting constraints regarding the established Service

27

Page 29: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

Level Agreements and the service itself. From the Cloud administrators and expertspoint of view, this is an interesting contribution because it frees them from the dif-ficult task of conceiving and implementing purpose-specific AMs. Indeed, this taskcan be now simplified by expressing the specific features of the XaaS Cloud system870

with a domain specific language based on the TOSCA standard. Our approach wasevaluated experimentally, with a qualitative study. Results have shown that yetgeneric, our AM is able to find satisfiable configurations within reasonable solvingtimes by taking the established SLA and by limiting the reconfiguration overhead.We also showed how we managed the integration with real Cloud systems like such875

as Openstack, while remaining generic.For future work, we intend to apply CoMe4ACloud to other contexts somehow

related to Cloud Computing. For instance, we plan experiment our approach inthe domain of Internet of Things or Cloud-based Internet of Things, which mayincur challenges regarding the scalability in terms of model size. We also plan to880

investigate how our approach could be used to address self-protection, that is, tobe able to deal with security aspects in an autonomic manner. Last but not least,we believe that the constraint solver may be insufficient to make decisions in adurable way, i.e., by considering the past history or even possible future states of themanaged element. A possible alternative to overcome this limitation is to combine885

our constraint programming based decision making tool with control theoreticalapproaches for computing systems.

References

[1] X. Xu, From Cloud Computing to Cloud Manufacturing, Robotics and Computer-Integrated Ma-nufacturing 28 (1) (2012) 75–86.890

[2] Peter Mell and Tim Grance, The NIST Definition of Cloud Computing (2011).

[3] B. Narasimhan, R. Nichols, State of Cloud Applications and Platforms: The Cloud Adopters’ View,Computer 44 (3) (2011) 24–28.

[4] OASIS, Topology and Orchestration Specification for Cloud Applications (TOSCA) (Apr. 17).URL http://docs.oasis-open.org/tosca/TOSCA/v1.0/TOSCA-v1.0.html895

[5] D. Ardagna, E. D. Nitto, G. Casale, D. Petcu, P. Mohagheghi, S. Mosser, P. Matthews, A. Gericke,C. Ballagny, F. D’Andria, et al., MODAClouds: A Model-driven Approach for the Design andExecution of Applications on Multiple Clouds, in: Proceedings of the 4th international workshopon modeling in software engineering (MISE 2012), IEEE Press, Zurich, Switzerland, 2012, pp. 50–56.

[6] A. Rossini, Cloud Application Modelling and Execution Language (CAMEL) and the PaaSage900

Workflow, in: Advances in Service-Oriented and Cloud ComputingACAUWorkshops of ESOCC2015, Vol. 567, Taormina, Italy, 2015, pp. 437–439.

[7] D. C. Schmidt, Guest editor’s introduction: Model-driven engineering, Computer 39 (2) (2006)0025–31.

[8] F. Rossi, P. van Beek, T. Walsh (Eds.), Handbook of Constraint Programming, Elsevier, 2006.905

[9] J. O. Kephart, D. M. Chess, The Vision of Autonomic Computing, Computer 36 (1) (2003) 41–50.

[10] J. Lejeune, F. Alvares, T. Ledoux, Towards a generic autonomic model to manage cloud services, in:CLOSER 2017 - Proceedings of the 7th International Conference on Cloud Computing and ServicesScience, Porto, Portugal, April 24-26, 2017., 2017, pp. 147–158.

[11] OpenStack Foundation, OpenStack Open Source Cloud Computing Software (Apr. 2017).910

URL https://www.openstack.org

[12] C. Prud’homme, J.-G. Fages, X. Lorca, Choco Documentation, TASC, LS2N CNRS UMR 6241,COSLING S.A.S. (2017).URL http://www.choco-solver.org

28

Page 30: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

[13] The Gecode Team, Gecode: A generic constraint development environment (2006).915

URL http://www.gecode.org

[14] The OR-Tools Team, Google optimization tools (2017).URL https://developers.google.com/optimization/

[15] Cplex cp optimizer, http://www-01.ibm.com/software/integration/optimization/cplex-cp-optimizer/.920

[16] J. Andersson, S. Andersson, K. Boortz, M. Carlsson, H. Nilsson, T. SjAuland, J. Widen, Sicstusprolog user’s manual, release 4, Tech. rep., SICS - Swedish Institute of Computer Science (2007).

[17] J. Regin, Generalized arc consistency for global cardinality constraint, in: Proceedings of the Thir-teenth National Conference on Artificial Intelligence and Eighth Innovative Applications of ArtificialIntelligence Conference, AAAI 96, IAAI 96, Portland, Oregon, August 4-8, 1996, Volume 1., 1996.925

[18] G. Pesant, A regular language membership constraint for finite sequences of variables (2004).

[19] D. C. Schmidt, Model-Driven Engineering, Computer 39 (2) (2006) 25.

[20] J. Bezivin, Model Driven Engineering: An Emerging Technical Space, Lecture Notes in ComputerScience 4143 (2006) 36.

[21] M. Brambilla, J. Cabot, M. Wimmer, Model-Driven Software Engineering in Practice, Synthesis930

Lectures on Software Engineering, Morgan & Claypool Publishers, 2012.

[22] M. Fowler, Domain-Specific Languages, Pearson Education, 2010.

[23] P. Derler, E. A. Lee, A. S. Vincentelli, Modeling Cyber–Physical Systems, Proceedings of the IEEE100 (1) (2012) 13–28.

[24] Y. Kouki, T. Ledoux, Csla: a language for improving cloud sla management, in: Int. Conf. on Cloud935

Computing and Services Science, CLOSER 2012, 2012, pp. 586–591.

[25] P. Shaw, Using constraint programming and local search methods to solve vehicle routing problems,in: Principles and Practice of Constraint Programming - CP98, 4th International Conference, Pisa,Italy, October 26-30, 1998, Proceedings, 1998, pp. 417–431.

[26] Z. Xing, E. Stroulia, Umldiff: An algorithm for object-oriented design differencing, in: Proceedings940

of the 20th IEEE/ACM International Conference on Automated Software Engineering, ASE ’05,ACM, New York, NY, USA, 2005, pp. 54–65.

[27] OASIS, YAML (TOSCA Simple Profile) (Apr. 2017).URL http://docs.oasis-open.org/tosca/TOSCA-Simple-Profile-YAML/v1.1/TOSCA-Simple-Profile-YAML-v1.1.html945

[28] Eclipse Foundation, Winery project (Apr. 2017).URL https://projects.eclipse.org/projects/soa.winery

[29] G. Blair, R. B. France, N. Bencomo, Models@ run.time, Computer 42 (2009) 22–27.

[30] H. Bruneliere, Z. Al-Shara, F. Alvares, J. Lejeune, T. Ledoux, A Model-based Architecture forAutonomic and Heterogeneous Cloud Systems, in: Proc. 8th International Conference on Cloud950

Computing and Services Science (CLOSER 2018), Funchal, Madeira, Portugal, 2018.

[31] R. Nyren, A. Edmonds, A. Papaspyrou, T. Metsch, Open cloud computing interface - core, specifi-cation document, Tech. rep., Open Grid Forum, OCCI-WG (June 2011).

[32] P. Merle, O. Barais, J. Parpaillon, N. Plouzeau, S. Tata, A precise metamodel for open cloudcomputing interface, in: CLOUD 2015, 2015, pp. 852–859.955

[33] D. K. Nguyen, F. Lelli, Y. Taher, M. Parkin, M. P. Papazoglou, W.-J. van den Heuvel, BlueprintTemplate Support for Engineering Cloud-based Services, in: Proceedings of the 4th European Con-ference on Towards a Service-based Internet, ServiceWave’11, Springer-Verlag, Berlin, Heidelberg,2011, pp. 26–37.

[34] F. Galan, A. Sampaio, L. Rodero-Merino, I. Loy, V. Gil, L. M. Vaquero, Service Specification in960

Cloud Environments Based on Extensions to Open Standards, in: Proceedings of the 4th Interna-tional ICST Conference on COMmunication System softWAre and middlewaRE, COMSWARE ’09,ACM, New York, NY, USA, 2009, pp. 19:1–19:12.

[35] A. Bergmayr, J. Troya, P. Neubauer, M. Wimmer, G. Kappel, UML-based Cloud Application Mo-deling with Libraries, Profiles, and Templates, in: Proceedings of the 2nd International Workshop965

on Model-Driven Engineering on and for the Cloud, CloudMDE@MODELS 2014, Valencia, Spain,2014., 2014, pp. 56–65.

29

Page 31: CoMe4ACloud: An End-to-End Framework for …...CoMe4ACloud: An End-to-end Framework for Autonomic Cloud Systems Zakarea Al-Shara a, Frederico Alvares , Hugo Bruneliere , Jonathan Lejeuneb,

[36] C. Sandru, D. Petcu, V. I. Munteanu, Building an Open-Source Platform-as-a-Service with In-telligent Management of Multiple Cloud Resources, in: Proceedings of the 2012 IEEE/ACM 5thInternational Conference on Utility and Cloud Computing, UCC ’12, IEEE Computer Society, Wa-970

shington, DC, USA, 2012, pp. 333–338.

[37] M. Hamdaqa, L. Tahvildari, Stratus ML: A Layered Cloud Modeling Framework, in: 2015 IEEEInternational Conference on Cloud Engineering, 2015, pp. 96–105.

[38] N. Ferry, A. Rossini, F. Chauvel, B. Morin, A. Solberg, Towards Model-Driven Provisioning, De-ployment, Monitoring, and Adaptation of Multi-cloud Systems, in: 2013 IEEE Sixth International975

Conference on Cloud Computing, 2013, pp. 887–894.

[39] N. Ferry, H. Song, A. Rossini, F. Chauvel, A. Solberg, CloudMF: Applying MDE to Tame the Com-plexity of Managing Multi-cloud Applications, in: 2014 IEEE/ACM 7th International Conferenceon Utility and Cloud Computing, 2014, pp. 269–277.

[40] A. P. Achilleos, G. M. Kapitsaki, E. Constantinou, G. Horn, G. A. Papadopoulos, Business-Oriented980

Evaluation of the PaaSage Platform, in: 2015 IEEE/ACM 8th International Conference on Utilityand Cloud Computing (UCC), 2015, pp. 322–326.

[41] J. Domaschka, K. Kritikos, A. Rossini, Towards a Generic Language for Scalability Rules, SpringerInternational Publishing, Manchester, UK, 2015, pp. 206–220.

[42] C. Quinton, D. Romero, L. Duchien, SALOON: a Platform for Selecting and Configuring Cloud985

Environments, Software: Practice and Experience 46 (2016) 55–78.

[43] P. Gouvas, E. Fotopoulou, A. Zafeiropoulos, C. Vassilakis, A Context Model and Policies Manage-ment Framework for Reconfigurable-by-design Distributed Applications, Procedia Computer Science97 (2016) 122 – 125.

[44] A. Dastjerdi, S. Tabatabaei, R. Buyya, An effective architecture for automated appliance manage-990

ment system applying ontology-based cloud discovery, in: CCGrid 2010, 2010, pp. 104–112.

[45] T. Mastelic, I. Brandic, A. Garcia Garcia, Towards uniform management of cloud services by ap-plying model-driven development, in: COMPSAC 2014, 2014, pp. 129–138.

[46] S. Kounev, N. Huber, F. Brosig, X. Zhu, A Model-Based Approach to Designing Self-Aware ITSystems and Infrastructures, Computer 49 (7) (2016) 53–61.995

[47] D. Pop, G. Iuhasz, C. Craciun, S. Panica, Support Services for Applications Execution in Multi-clouds Environments, in: 2016 IEEE International Conference on Autonomic Computing (ICAC),2016, pp. 343–348.

[48] M. Mohamed, M. Amziani, D. Belaid, S. Tata, T. Melliti, An autonomic approach to manageelasticity of business processes in the cloud, FGCS 50 (2015) 49 – 61.1000

[49] J. Garcıa-Galan, L. Pasquale, P. Trinidad, A. Ruiz-Cortes, User-centric Adaptation of Multi-tenantServices: Preference-based Analysis for Service Reconfiguration, in: Proceedings of the 9th Interna-tional Symposium on Software Engineering for Adaptive and Self-Managing Systems, SEAMS 2014,ACM, New York, NY, USA, 2014, pp. 65–74.

[50] B. Dougherty, J. White, D. C. Schmidt, Model-driven auto-scaling of green cloud computing infra-1005

structure, FGCS 28 (2) (2012) 371–378.

[51] H. Ghanbari, B. Simmons, M. Litoiu, C. Barna, G. Iszlai, Optimal autoscaling in a iaas cloud, in:ICAC 2012, ACM, 2012, pp. 173–178.

[52] A. Menychtas, K. Konstanteli, J. Alonso, L. Orue-Echevarria, J. Gorronogoitia, G. Kousiouris,C. Santzaridou, H. Bruneliere, B. Pellens, P. Stuer, O. Strauss, T. Senkova, T. Varvarigou, Soft-1010

ware Modernization and Cloudification Using the ARTIST Migration Methodology and Framework,Scalable Computing : Practice and Experience 15 (2) (2014) 131–152.

[53] M. A. A. d. Silva, D. Ardagna, N. Ferry, J. F. Perez, Model-Driven Design of Cloud Applications withQuality-of-Service Guarantees: The MODAClouds Approach, in: 16th International Symposium onSymbolic and Numeric Algorithms for Scientific Computing, 2014, pp. 3–10.1015

30