
Project Title: Clommunity: A Community networking Cloud in a box

Deliverable Title: Experimental research on community clouds (year 1)

Deliverable number: D3.2

Version 1.0

This project has received funding from the European Union's Seventh Programme for research, technological development and demonstration under grant agreement No 317879


Project Acronym: CLOMMUNITY
Project Full Title: A Community networking Cloud in a box
Type of contract: Collaborative project (STREP)
Contract No: FP7-ICT-317879
Project URL: http://clommunity-project.eu/

Editors: Hooman Peiro Sajjad (KTH), Amin Khan (UPC), Felix Freitag (UPC), Vladimir Vlassov (KTH)

Deliverable nature: Report (R)
Dissemination level: Public (PU)
Contractual Delivery Date: 30/04/2014
Actual Delivery Date: 30/04/2014
Suggested Readers: Project partners
Number of pages: 55
Keywords: WP3, Community clouds, architecture, resource management, self-management, scalable service overlays
Authors: Vladimir Vlassov (KTH), Hooman Peiro Sajjad (KTH), Paris Carbone (KTH), Ying Liu (KTH), Vamis Xhagjika (UPC), Jim Dowling (SICS), Lars Kroll (KTH, SICS), Alexandru-Adrian Ormenisan (KTH, SICS), Amin Khan (UPC), Mennan Selimi (UPC), Navaneeth Rameshan (UPC), Felix Freitag (UPC), Leandro Navarro (UPC), Davide Vega (UPC)

Peer review: Roc Meseguer (UPC)

Abstract

This document presents the work carried out in WP3 during the first reporting period of the CLOMMUNITY project, extending previous work reported in D3.1, to resolve research challenges for community clouds, and provides a discussion and consolidation of the network and service architecture, research on resource allocation in community clouds, self-management in components of community clouds, and work on enabling scalable service overlays.


Executive Summary

In this document, the outcomes of the research on key challenges for community networking clouds are presented.

A hierarchical architecture for clouds in community networks is proposed. The architecture defines the roles of super nodes (SNs) as cloud managers and of ordinary nodes (ONs) as cloud resources. Multiple ONs connect to an SN to build a cloud, and the cloud resources can be further expanded by federating SNs. In addition, a cloud management platform for the community networking cloud is proposed and explained.

As part of the research on infrastructural resource management, a blueprint for an IaaS service selector is provided. Since a community networking cloud infrastructure is contributed by its participants, incentive mechanisms to build such an infrastructure are discussed.

Research on self-management infrastructure services is important because of the unreliable network, storage and computing resources that the community networking cloud is built upon. In this regard, a monitoring system for Clommunity devices is designed to report OS-provided metrics including uptime, CPU utilization, memory utilization, total memory, disk size, available disk space, 1-minute load, and network data sent and received. These metrics can be used by other infrastructural services in the community cloud. The monitoring system consists of daemons running on the Clommunity devices, a central manager that pulls the monitoring data, and a web interface that presents the metrics. Stay-Away is another self-management service, implemented to decrease performance interference between VMs running on Clommunity devices. It takes into account application-level performance metrics to detect quality-of-service violations and, by throttling batch applications, can bring performance-sensitive VMs back to a stable state. Two further self-management systems are designed: ElastMan and BwMan. ElastMan is an elasticity controller for key-value stores that provides an elastic community storage. BwMan is a bandwidth manager that predicts and allocates network bandwidth to decrease SLA violations. Results of experiments on the OpenStack Swift data store are presented.

As a result of the work on scalable service overlays running on the Clommunity cloud, a novel distributed key-value store (CaracalDB) is designed and implemented, which applies dynamic partitioning to provide fast re-replication in case of node failures. The Tahoe-LAFS distributed filesystem is deployed and evaluated in community networks as a promising application for privacy-preserving, secure and fault-tolerant storage. A self-healing mechanism for OpenStack Swift is implemented to operate in a community cloud environment, based on a MAPE (Monitor, Analyse, Plan, and Execute) control loop algorithm designed for self-healing in Swift.


Contents

1 Introduction
  1.1 Contents of the deliverable
  1.2 Relationship to other CLOMMUNITY deliverables
  1.3 Research challenges addressed

2 Architecture
  2.1 Distributed Architecture in Socio-Economic Context
    2.1.1 Cost and Value Relationships in Community Cloud
      2.1.1.1 Costs for Participation
      2.1.1.2 Value Proposition
      2.1.1.3 Comparison with Commercial Services
    2.1.2 Macroeconomic Mechanisms for Architecture
      2.1.2.1 Commons License
      2.1.2.2 Peering Agreements
      2.1.2.3 Ease of Use
      2.1.2.4 Social Capital
      2.1.2.5 Transaction Costs
      2.1.2.6 Locality
      2.1.2.7 Overlay Topology
      2.1.2.8 Entry Barriers
      2.1.2.9 Role of Developers
      2.1.2.10 Service Models
      2.1.2.11 Value Addition and Differentiation
  2.2 Evaluating Scalability of Distributed Architecture
    2.2.1 Experiment Setup
      2.2.1.1 Centralized Cloud
      2.2.1.2 Federated Cloud
      2.2.1.3 Decentralized Cloud
    2.2.2 Results
      2.2.2.1 Resource Utilization
      2.2.2.2 Response Time
      2.2.2.3 Requests Completion Time

3 Research on infrastructural resource management
  3.1 IaaS Service Selection
    3.1.1 User Criteria
    3.1.2 Architecture of the IaaS Service Selector
    3.1.3 Implementation Plan
  3.2 Incentive Mechanisms
    3.2.1 Social aspects of community networks
      3.2.1.1 Members of community networks
      3.2.1.2 Resource sharing in community networks
      3.2.1.3 Ownership of nodes in community networks
      3.2.1.4 Services in community networks
    3.2.2 Design of Incentive Mechanisms
    3.2.3 Prototype Implementation for Incentive Mechanisms
    3.2.4 Simulation Experiments for Incentive Mechanisms

4 Research on self-management
  4.1 Monitoring System
  4.2 Preventive Mitigation of Performance Interference
  4.3 ElastMan
    4.3.1 Controller Design
    4.3.2 Elasticity Control Algorithm of ElastMan
    4.3.3 Evaluation Results
    4.3.4 Future Work
  4.4 BwMan: Bandwidth Manager for Services in the Cloud
    4.4.1 Motivation
    4.4.2 Predictive Models of the Target System
    4.4.3 BwMan: BandWidth Manager
    4.4.4 Evaluation Results
    4.4.5 Conclusion

5 Research on Scalable Service Overlays and Clommunity Services
  5.1 CaracalDB
    5.1.1 Introduction
    5.1.2 Scalable Fast Re-replication
  5.2 Distributed Search Service
    5.2.1 Index Entries
    5.2.2 Search Service Architecture
  5.3 Local service allocation
    5.3.1 Experimental evaluation
    5.3.2 Lessons learnt
  5.4 Tahoe-LAFS
  5.5 XtreemFS
  5.6 OpenStack Swift
    5.6.1 Evaluation Methodology
    5.6.2 Evaluation Setup
    5.6.3 Swift Throughput versus Network Latencies
    5.6.4 The Self-Healing of Swift in a Community Network
    5.6.5 Conclusion

6 Conclusions and Outlook

Licence


List of Figures

2.1 Architecture of the community cloud management system
2.2 Overview of different layers
2.3 Role of social and economic context enablers
2.4 Relationship between cost and value in evolution of community cloud
2.5 Super and ordinary nodes in federated community cloud
2.6 Average response time as number of nodes increases
2.7 Average response time as the number of users increases
2.8 Finish time of requests for the three scenarios with 400 nodes
3.1 High level overview of the Cloud service selector
3.2 Details of the VM request operation by an ON
4.1 Average CPU Usage
4.2 Treemap representation of resource usages
4.3 Stay-Away Conceptual Architecture
4.4 Colocated Execution of VLC with Batch applications
4.5 Multi-Tier Web 2.0 Application with Elasticity Controller Deployed in a Cloud Environment
4.6 ElastMan Feedforward Control
4.7 ElastMan Feedback Control
4.8 Binary Classifier for One Server
4.9 Performance of Voldemort without ElastMan with fixed number of servers (18 servers) under gradual diurnal workload (0-900 min) and under workload with spikes (900-1500 min)
4.10 Performance of Voldemort with ElastMan under gradual diurnal workload (0-900 min) and under workload with spikes (900-1500 min)
4.11 Regression Model for System Throughput vs. Available Bandwidth
4.12 Regression Model for Recovery Speed vs. Available Bandwidth
4.13 BwMan Control Workflow
4.14 Throughput of Swift without BwMan
4.15 Throughput of Swift with BwMan
5.1 CLOMMUNITY-Supported Search Architecture
5.2 Decentralized Search Architecture
5.3 Guifi.net zones partition graph
5.4 Obtained service diameter (max, min and average) and number of aggregated zones (average) for different optimal service overlay orders
5.5 CDF of base-graph nodes at 1 hop to optimal solutions components
5.6 Read Performance of Swift under Network Latencies
5.7 Write Performance of Swift under Network Latencies
5.8 Control Flow of the Self-healing Algorithm
5.9 Self-healing Validation under the Control System


List of Tables

2.1 Scenarios for providing infrastructure service with n nodes
2.2 Characteristics of nodes in data centres
2.3 Characteristics of VMs requested by users
2.4 Percentage of resources utilized
5.1 Re-replication performance analysis for r = 100 and 1TB of data


1 Introduction

1.1 Contents of the deliverable

This deliverable reports the work carried out in WP3 during the first reporting period of the CLOMMUNITY project to resolve research challenges for community clouds, extending D3.1 reported at M06. Through the work carried out in tasks T3.1 - T3.4, we discuss and consolidate the network and service architecture, and report research on resource allocation in community clouds, on self-management in components of community clouds, and on enabling scalable service overlays.

1.2 Relationship to other CLOMMUNITY deliverables

The deliverable D3.2 builds upon D3.1, delivered in M06. D3.1 contains the requirements for the network and service architecture and already presented a first version of the architecture itself. Building upon D3.1, we discuss and extend the initial network and service architecture, bringing in new insights we have obtained from the results of WP4 and WP2. The final version of the network and service architecture will be reported in D3.4 (M30).

Deliverable D3.2 furthermore contains a description of the research work carried out in tasks T3.2 - T3.4, which feeds the development of the community cloud system developed in WP2 and reported in D2.2.

D3.2 has received input from WP4 task T4.1 on the cases for pilots to be deployed in the second reporting period of the project.

1.3 Research challenges addressed

While community networks successfully achieve a community-owned and managed infrastructure at the networking level, the presence of services and applications for end users within community networks is very low, and even fewer of these applications are cloud-based. Clouds in community networks did not exist. The scenarios for community clouds needed to be envisioned and identified taking into account the socio-technical conditions of community clouds. The architecture of the community cloud needed to integrate the specific extensions that are required to fit clouds into community networks. We argue in our research contribution for an economic and social extension of cloud computing architectures for community clouds, in order to be able to regulate the provision and usage of resources [1, 2, 3]. The architecture needs to incorporate economic mechanisms to direct resource contribution and collaborative sharing by the community members [4]; see Section 2.1 for more details. The architecture, with distributed components and decentralised management, has to scale well to provide good performance as the adoption of cloud services grows in the community [5]; see the discussion in Section 2.2.


Research work on infrastructural resource management in community clouds, given the community cloud scenarios we developed, investigated cloud federations, since in community clouds we envision a large set of micro-cloud providers that build their cloud infrastructure with heterogeneous hardware and manage it with different cloud management platforms, in contrast to the commercial cloud world, where we rather see a smaller number of big players operating clouds with skilled professional cloud management personnel. Furthermore, incentive mechanisms were investigated as part of a cloud ecosystem that sustains the community clouds.

Research on self-management addressed solutions for particular components, such as an elastic cloud storage, needed to support the functionality of the community cloud system. A monitoring system for community cloud devices is part of the regulation system for community clouds, which contributes to being able to extract users' contribution to and usage of cloud resources. Performance interference is another relevant aspect for community clouds, since it ultimately influences the user experience [6]. User experience itself is of great importance for the community cloud system, since only motivated users will join the community cloud.

In order to improve the utilisation of cloud resources used by an elastic cloud-based service, such as a cloud storage, there is a need for an elasticity controller that automatically resizes (scales out or shrinks) an elastic cloud service in response to changes in workload, in order to meet SLOs at a reduced cost, i.e. with a minimal amount of resources (VMs, storage capacity and service instances). To automate elasticity of a cloud storage, we have developed an elasticity controller, ElastMan, that controls the size of the storage based on the workload.

Another autonomic manager that we have developed for a cloud-based storage is BwMan, a network bandwidth manager that arbitrates the bandwidth allocation among individual services and different service activities in a cloud service sharing the same cloud infrastructure. Specifically, BwMan arbitrates the available inbound and outbound bandwidth of the servers of a cloud storage between the user-centric workload and the system-centric workload (e.g. related to self-healing). Insufficient bandwidth allocation for the user-centric workload might lead to the violation of SLOs, whereas the system may fail when insufficient bandwidth is allocated for data re-balancing and failure recovery, i.e. system-centric activities. We have implemented and evaluated BwMan for the OpenStack Swift store. Our evaluation has shown that by using BwMan, we can ensure that the network bandwidth is effectively arbitrated and allocated between user-centric and system-centric workloads according to specified SLOs and policies. Our experiments show the effectiveness of BwMan, which allows SLO violations to be reduced at least by a factor of two.

In order to address the need for scalable and consistent storage of (meta-)data for cloud services, we have developed CaracalDB, with a focus on fast, highly parallel recovery from failures on point-to-point networks. CaracalDB uses automatic partitioning and load balancing of data to allow information from failed nodes to be recovered utilising a large portion of the remaining cluster members.

Finally, the ability to discover content from cloud services is crucial to the effective use of a community network. To this end we developed a distributed search service that is robust and scalable while delivering consistent results to user queries. The service uses parallel search in multiple categories in order to provide fast and relevant results from user-generated data, like shared movies, for example.
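The following sketch only illustrates the kind of bandwidth arbitration described for BwMan; it is not the project's implementation. The capacity value, the SLO targets and the set_rate_limit helper are hypothetical, and a real manager would program the operating system's traffic control instead of printing.

```python
# Minimal illustration of BwMan-style bandwidth arbitration (not the
# project's code). Capacities, SLO targets and set_rate_limit() are
# hypothetical placeholders.

LINK_CAPACITY_MBPS = 100.0   # total outbound capacity of one storage server
USER_SLO_MBPS = 60.0         # bandwidth needed to keep user-facing SLOs
SYSTEM_MIN_MBPS = 10.0       # floor for self-healing / re-balancing traffic


def arbitrate(user_demand_mbps: float, system_demand_mbps: float) -> tuple[float, float]:
    """Split capacity between user-centric and system-centric workloads.

    User traffic is served first, up to its SLO target; system traffic gets
    the remainder but never less than its minimum share.
    """
    user_share = min(user_demand_mbps, USER_SLO_MBPS)
    remaining = LINK_CAPACITY_MBPS - user_share
    system_share = max(min(system_demand_mbps, remaining), SYSTEM_MIN_MBPS)
    # If the system floor ate into the user share, shrink user traffic accordingly.
    user_share = min(user_share, LINK_CAPACITY_MBPS - system_share)
    return user_share, system_share


def set_rate_limit(workload: str, mbps: float) -> None:
    # Placeholder: a real manager would program tc/HTB classes or similar here.
    print(f"limit {workload} traffic to {mbps:.1f} Mbit/s")


if __name__ == "__main__":
    user, system = arbitrate(user_demand_mbps=75.0, system_demand_mbps=40.0)
    set_rate_limit("user-centric", user)
    set_rate_limit("system-centric", system)
```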


2 Architecture

Realising a community cloud involves many challenges, both technological and socio-economic, but it also promises an interesting value proposition for communities in terms of local services and applications. We focus here on a distributed architecture for community clouds, which integrates into the cloud the computation and storage hardware contributed by the community network members to the community network, but also the socio-economic contribution these community network members donate to the collective effort in the form of knowledge, time and help.

A community network is managed and owned by the community, where nodes are managed independently by their owners. The capacity, availability and connectivity vary widely among the nodes. Nodes that form the backbone, i.e. super nodes (SNs), are usually intended to be stable, with permanent connectivity. Ordinary nodes (ONs) change their connectivity status more frequently. An architecture for the community cloud system that manages such an infrastructure needs to be robust, self-managing and efficient at handling the heterogeneity among the nodes.

The option for enabling a community cloud on which we focus here is to deploy a cloud management platform, tailored to community networks, on the nodes attached to the network. There are a few cloud management systems available to manage public and private clouds, for example OpenNebula1 [7], OpenStack2, CloudStack3, Eucalyptus4, and Nimbus5. Such cloud management systems can be tailored for community networks by extending their existing functionality to address the particular conditions of community networks. For example, incentive mechanisms inspired by the social nature of community networks can be built into the resource regulation component to encourage users to contribute resources [1, 3, 2].

1 http://www.opennebula.org
2 http://www.openstack.org
3 http://cloudstack.apache.org
4 http://www.eucalyptus.com
5 http://www.nimbusproject.org

Figure 2.1: Architecture of the community cloud management system


Figure 2.2: Overview of different layers of the community cloud management system

Figure 2.3: Role of social and economic context enablers in the community cloud management system

The conceptual overview of the cloud management system6 that we propose for community networks is shown in Figure 2.1. The nodes, along with the communication infrastructure of the community network, form the hardware layer of the cloud architecture. The core layer, residing in the SN, contains the software for managing and monitoring the virtual machines (VMs) on the ONs. The front-end layer provides the interface of the infrastructure service (Infrastructure-as-a-Service, IaaS). The components cloud coordinator, economic engine and social engine provide additional services for customising the cloud infrastructure to community networks.
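As a rough illustration of this hierarchy, the sketch below models SNs as cloud managers that keep track of the ONs attached to them and can federate with peer SNs. The class and attribute names are ours, invented for the example, and do not correspond to the CLOMMUNITY codebase.

```python
# Illustrative model of the SN/ON hierarchy described above; names are
# invented for the example, not CLOMMUNITY code.
from dataclasses import dataclass, field


@dataclass
class OrdinaryNode:
    name: str
    vm_slots: int          # how many VMs this ON can host
    running_vms: int = 0


@dataclass
class SuperNode:
    name: str
    ordinary_nodes: list[OrdinaryNode] = field(default_factory=list)
    peers: list["SuperNode"] = field(default_factory=list)   # federated SNs

    def attach(self, on: OrdinaryNode) -> None:
        self.ordinary_nodes.append(on)

    def federate(self, other: "SuperNode") -> None:
        self.peers.append(other)
        other.peers.append(self)

    def _place_local(self) -> bool:
        for on in self.ordinary_nodes:
            if on.running_vms < on.vm_slots:
                on.running_vms += 1
                return True
        return False

    def place_vm(self) -> bool:
        """Place one VM locally if possible, otherwise try directly federated SNs."""
        if self._place_local():
            return True
        return any(peer._place_local() for peer in self.peers)
```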

2.1 Distributed Architecture in Socio-Economic Context

We explore here the macroeconomic mechanisms [4] that can help the adoption and growth of the community cloud model [8, 9]. The proposed architecture, see Figure 2.2, needs to take advantage of the socio-economic context of community networks to ensure the success of the community cloud model (Figure 2.3). We first discuss a cost-value proposition describing the conditions under which community clouds should emerge. Secondly, we propose a set of macroeconomic policies that, if put in place in community networks, should accelerate the uptake and help the sustainability of community clouds.

6 Refer to deliverable D3.1 "Requirements for a holistic network and service architecture" for detailed requirements and discussion about different components of the architecture.


Figure 2.4: Relationship between cost and value in evolution of community cloud

2.1.1 Cost and Value Relationships in Community Cloud

Figure 2.4 shows the desired relationship between the cost and the value proposition as the community cloud evolves and gets adopted by a wider audience. In the nascent stage, the community cloud will not be able to provide much value until a critical mass of users is using the system. Even beyond that threshold, the relative cost to achieve a little utility will remain significant, which means that the early adopters of the system must stay highly motivated and committed to the success of the community cloud and continue to contribute resources even though they receive little value from the system in return. But once a significant proportion of the overall population has joined the community cloud, the relative cost to obtain value from the system tumbles, and in the longer run the system is able to sustain itself with contributions that may be small in size but are made by a large number of users. The objective of the economic mechanisms and the social and psychological incentives is to let the system transition from inception through early adoption to, finally, ubiquitous usage.

2.1.1.1 Costs for Participation

The initial costs for setting up nodes in the community cloud involve hardware costs, including the price of the computing and networking equipment, and installation costs, including the manual labour needed. The continuous operation of a cloud node requires additional costs: network costs, given by donating network bandwidth and any other subscription fees; energy costs, to pay the electricity bills for running the computer equipment as well as cooling apparatus; maintenance costs, to fund technical support and replacement parts; and hosting costs, to provide storage space for the equipment. Besides these costs at the individual level, there are also the transaction costs [10], or management overheads, to direct the group coordination and collaborative production efforts necessary for the operation of the community cloud.

2.1.1.2 Value Proposition

The individuals in the community cloud act as private enterprises that offer services to generate revenue. The revenue for the community cloud users includes tangible benefits, like the services and applications that they will be able to consume, and intangible benefits, like the sense of belonging to the community and the personal satisfaction from their contributions. The services can range from infrastructure to platform to software services, meeting a spectrum of different needs of the users. Once the community cloud gets adopted by a critical mass, the community may also generate revenue by offering computing resources to commercial enterprises, similar to selling excess power capacity in the case of the Smart Grid. For example, the community can enter into partnership agreements with ICT providers, where the community buys network bandwidth in return for providing access to the computing resources of the community cloud.


2.1.1.3 Comparison with Commercial Services

We discuss the community cloud cost and value in comparison with two popular commercial services that are also based in part on the idea of reciprocal sharing, Spotify7 and Skype8. Spotify is a subscription-based music streaming service which reduces its infrastructure and bandwidth costs by serving cached content from users' devices as well as from its own servers. Skype is a communication service which uses caches on users' devices for storing and processing information required for managing the underlying infrastructure. Both Spotify and Skype offer free as well as paid services. Why do users agree to contribute resources, even when they are paying for the service?

One argument is that the costs for users are minimal. Both services mostly consume storage, computation time, power and bandwidth on the users' devices. Since these resources are not very expensive and the services' usage remains relatively low, the users do not mind this arrangement or do not even notice it. But even more important, these services are designed so intuitively that most users do not even realise that they are donating resources, and even when they do, the value these services provide is a sufficient incentive.

The success of such services implies that, for the community cloud as well, users should be able to join with zero or very little cost. The value proposition of the community cloud services should be strong enough to attract early adopters and keep them committed. The economic mechanisms in place for encouraging reciprocal sharing and ensuring overall system health and stability should be either invisible for non-technical users or very simple to understand and work with.

2.1.2 Macroeconomic Mechanisms for Architecture

We discuss in this section the macroeconomic policies we propose for community clouds, addressing relevant issues of the technical, social, economic and legal aspects of the community cloud system. We approach the problem by having explored some of these mechanisms previously in simulations [2] and also by developing a prototype implementation, which is currently deployed in the Guifi community network [11] and which will allow us to get users involved and participating in a real-world scenario.

2.1.2.1 Commons License

The agreement and license to join a community cloud should encourage, and help enforce, the reciprocal sharing needed for community clouds to work. The Wireless Commons License9 or the Pico Peering Agreement10 is adopted by many community networks to regulate network sharing. This agreement could serve as a good base for drafting an extension that lays out the rules for community clouds.

2.1.2.2 Peering Agreements

When different community clouds federate together, agreements should ensure fairness for all the parties. Agreements between different communities should describe the rules for peering between clouds. Within such agreements, local currency exchanges could be extended to address cases of imbalance in contribution across different zones [12].

7 http://www.spotify.com
8 http://www.skype.com
9 http://guifi.net/es/ProcomunXOLN
10 http://www.picopeer.net


2.1.2.3 Ease of Use

The easier it is for users to join, participate and manage their resources in the community cloud, the more the community cloud model will be adopted. This requires lowering the startup costs and entry barriers for participation. To this end, as an institutional policy, we have developed a Linux-based distribution11 to be used in the Guifi.net community cloud [11]. It will make the process of joining and consuming cloud services almost automated, with little user intervention, which will make the community cloud appealing to non-technical users.

2.1.2.4 Social Capital

Community clouds need to appeal to the social instincts of the community instead of solely providing economic rewards. This requires maximising both bonding social capital [13] within local community clouds in order to increase the amount of resources and commitment of the users, and bridging social capital in order to ensure strong cooperation between partners in federated community clouds. Research on social cloud computing [14] has already shown how to take advantage of the trust relationships between members of social networks to motivate contribution towards a cloud storage service.

2.1.2.5 Transaction Costs

The community cloud, especially in its initial stages, will require strong coordination and collaboration between early adopters as well as developers of cloud applications and services, so we need to lower the transaction costs of group coordination [10]. This can take advantage of the existing Guifi.net mailing list12, but also of the regular social meetings and other social and software collaboration tools. It also requires finding the right balance between a strong central authority and a decentralised, autonomous mode of participation for community members and software developers.

2.1.2.6 Locality

Since the performance and quality of cloud applications in community networks can depend strongly on locality, applications need to be network and location aware. This also requires that providers of resources honour their commitment to the local community cloud, implying that most requests are fulfilled within the local zone instead of being forwarded to other zones. We have explored the implications of this earlier when studying the relationships between federating community clouds [2, 3].

2.1.2.7 Overlay Topology

Community networks are an example of scale-free small-world networks [15], and the community cloud that results from joining community network users is expected to follow the same topology and inherit characteristics similar to scale-free networks. As the overlay between nodes in the community cloud gets created dynamically [16], the community cloud may evolve along different directions as users of the underlying community network join the system.

11 http://repo.clommunity-project.eu
12 http://guifi.net/en/forum


As the applications in the community cloud will most likely be location and network aware, to make the most efficient use of the limited and variable resources in the network, the overlay-steered concentration and distribution of consumers and providers of services directs the state and health of the community cloud.

2.1.2.8 Entry Barriers

In order to control the growth of the community cloud and provide a reasonable quality of experience for early adopters and permanent users, different approaches can be considered, for example, a community cloud open to everyone, by invitation only, or one that requires a minimum prior contribution.

2.1.2.9 Role of Developers

The developers of cloud applications are expected to play an important intermediary role between providers of resources and consumers of services, for example adding value to the raw resources and selling them to consumers at a premium. End users can take both roles, as raw resource providers and as consumers who find the value of the cloud in the provided applications.

2.1.2.10 Service Models

Cloud computing offers different service levels: infrastructure, platform and software-as-a-service (SaaS). Similar to the three economic sectors for provisioning goods, the third level, SaaS, is the one that reaches the end users. In order to provide value from the beginning, we propose to prioritise provisioning SaaS at the early stage of the community cloud.

2.1.2.11 Value Addition and Differentiation

The community cloud requires services that provide value for users. In addition, these services need to compete and differentiate themselves from the generic cloud services available over the Internet. In this line, FreedomBox13 services focus on ensuring privacy, while FI-WARE CloudEdge14 and ownCloud15 let cloud applications consume resources locally.

2.2 Evaluating Scalability of Distributed Architecture

Community networks have the potential to offer applications to users through open and neutral cloud services hosted on community-owned resources. We evaluate here the scalability issues of a community cloud-based infrastructure service [5]. The community cloud scenarios help in obtaining, through simulations, a preliminary characterisation of the behaviour of an infrastructure service in community clouds. It is observed that for community clouds the distribution and capability of resources, which are less powerful than those in commercial centralised clouds, will impact the response time and the resource assignment. Network-aware cloud services, however, seem to have some potential to improve the performance of the infrastructure service by reducing its dependency on the conditions of the community network.

13 http://freedomboxfoundation.org
14 http://catalogue.fi-ware.eu/enablers/cloud-edge
15 http://owncloud.org


Achieving a reasonable quality of user experience with community clouds will be needed to sufficiently motivate the members of the community network to extend the current collective management at the network level to cloud services. Once this level of technical performance is assured, community clouds may outperform commercial clouds in their social aspects, by offering open and neutral cloud services provided from within the community network. With such simulations we expect to obtain a better understanding of the potential and of the design of network-aware infrastructure services for community clouds.

For our simulation experiments, we have used the CloudSim simulation toolkit [17], which supports analysing the behaviour of cloud computing environments under various experimental conditions [18]. CloudSim is an event-based simulator that models cloud infrastructures as data centres characterised by the number of physical nodes and the scheduling policy for assigning users' requests to the nodes. The nodes in CloudSim are defined by their processing capacity, given in millions of instructions per second (MIPS), memory, storage, bandwidth, the number of VMs and the policy for distributing resources among VMs. The requests from users are termed cloudlets in CloudSim, and the dynamic behaviour of applications with respect to resource requirements can be programmed by extending the code for cloudlets. Both VMs and cloudlets have attributes like processing capacity, memory, storage and bandwidth, and these attributes are used by the broker process, which manages the instantiation and allocation of VMs, to map cloudlets to VMs. CloudSim also supports federated cloud architectures by providing a cloud coordinator entity which distributes the users' requests among multiple data centres.

We have modelled commercial centralised clouds as a single data centre and community clouds as a cloud coordinator with multiple data centres in CloudSim. The SN in a community network is assumed to be the broker in a data centre, while the ONs in a community network are the nodes hosting VMs. Our goal is to analyse the behaviour of community clouds and compare their profile with commercial clouds. In our preliminary experiments, we have focused on the resource utilisation in terms of processing capacity (CPU), memory (RAM) and network bandwidth, and on the quality of service in terms of the average response time for users' requests.

2.2.1 Experiment Setup

We simulate three scenarios in our experiments, with different distributions of nodes among data centres. Table 2.1 shows the number of data centres and the number of nodes at each data centre for different values of n, which is the total number of available nodes across all data centres.

2.2.1.1 Centralized Cloud

Centralized Cloud represents commercial clouds where all the nodes are present in a single data centre.

2.2.1.2 Federated Cloud

Federated Cloud depicts the common scenario in community networks where a super node provides the cloud management system and manages the VMs on the ordinary nodes linked to it. In this case, the nodes are evenly distributed between the data centres, and the number of data centres equals the number of nodes at each data centre. For example, the federated cloud in Figure 2.5 has 9 ONs in total, so n = 9, and we can view it as consisting of 3 data centres where each data centre has 3 nodes.


Figure 2.5: Super and ordinary nodes in federated community cloud

Table 2.1: Scenarios for providing infrastructure service with n nodes

Scenario              Data Centres    Nodes per Data Centre
Centralized Cloud     1               n
Federated Cloud       ≈√n             ≈√n
Decentralized Cloud   n               1
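To make the three layouts of Table 2.1 concrete, the short Python sketch below (ours, not part of the CloudSim setup) computes the data-centre sizes for a given total n, following the centralized, federated (≈√n data centres of ≈√n nodes) and decentralized splits. The exact federated split used in the experiments (e.g. 30 data centres of 30-35 nodes for n = 1000) may differ slightly from this rounding.

```python
# Illustrative computation of the three scenario layouts from Table 2.1;
# helper names are ours, not part of the experiment code.
import math


def scenario_layout(scenario: str, n: int) -> list[int]:
    """Return the list of data-centre sizes (nodes per data centre) for n nodes."""
    if scenario == "centralized":
        return [n]                          # one data centre holding all nodes
    if scenario == "decentralized":
        return [1] * n                      # every node is its own data centre
    if scenario == "federated":
        per_dc = round(math.sqrt(n))        # roughly sqrt(n) nodes per data centre
        sizes, remaining = [], n
        while remaining > 0:
            sizes.append(min(per_dc, remaining))
            remaining -= sizes[-1]
        return sizes
    raise ValueError(f"unknown scenario: {scenario}")


if __name__ == "__main__":
    for total in (100, 400, 1000):
        layout = scenario_layout("federated", total)
        print(total, "nodes ->", len(layout), "data centres of sizes", set(layout))
```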

2.2.1.3 Decentralized Cloud

Decentralized Cloud depicts the scenario where all the nodes act as independent service providers. This is taken as an extreme case of the federated community cloud in which there is no hierarchy among nodes and all the nodes act as individual local clouds, albeit consisting of just one node. This is modelled as multiple data centres, each with only a single node, in CloudSim.

The hardware profile of the nodes, as shown in Table 2.2, is the same across all three scenarios and is comparable to the configurations used in other recent experiments involving CloudSim [18]. The profile of the VMs requested by users, as shown in Table 2.3, is also identical in all three scenarios. We have not considered the effect of network topology in these experiments, so the bandwidth between all data centres is assumed to be identical.

2.2.2 Results

We present here the results from our simulation experiments. We have run each experiment ten times, and plotted the average values in the following graphs.

2.2.2.1 Resource Utilization

We first consider the overall resource utilization in our experiments, as shown in Table 2.4. Our first observation is that the values are similar across all the scenarios. This is because the number and capability of the nodes, and the number of requests made by the users, are identical in the three scenarios. In addition, we are not considering the network delays and the state of the links between different nodes. In reality, the hardware of the nodes in a community network will differ in capacity and the network bandwidth will also be limited, so we need to extend our simulations to take this heterogeneity into account for the community cloud. The other point is that utilization is almost 50% in all the cases, which shows that there are not enough requests to keep the available resources in the data centres busy all the time, and so the nodes are idle almost half of the time while the experiment is running.


Table 2.2: Characteristics of nodes in data centres

Attribute           Value
Architecture        x86
Operating System    Linux
Hypervisor          Xen
CPU                 2,400 MIPS per VM
RAM                 8 GB
Storage             80 GB
Bandwidth           100 Mbps
Hosted VMs          4
VM Scheduling       Time Shared
VM Migration        Not Allowed

Table 2.3: Characteristics of VMs requested by users

Attribute            Value
CPU Time             1,000 MI
Number of Cores      1
RAM                  512 MB
Bandwidth            100 Mbps
VM Image Size        1 GB
Scheduling Policy    Dynamic Workload
Number of Requests   50 requests per minute

In future work, we need to evaluate the behaviour of the system, including the level of resource utilization, under different load conditions.

2.2.2.2 Response Time

We consider the average response time in order to analyse the quality of service provided by the infrastructure service. This is the difference between the time when a request is submitted to the system and the time when a VM is allocated for that request. Figure 2.6 shows the average response time across all requests as the number of nodes in the system increases, for the three scenarios. We find that for a limited number of nodes, the centralized cloud provides better service because resources are consolidated at one data centre, whereas in the federated and decentralized scenarios resources are distributed between multiple data centres and are not sufficient to meet the requests forwarded to the data centres.


Table 2.4: Percentage of resources utilized

Nodes   Data Centres   Nodes/Centre   CPU     RAM     Bandwidth
100     1              100            48.51   49.3    49.08
100     10             10             49.13   49.43   49.09
100     100            1              49.27   49.52   49.35
400     1              400            50.22   48.83   48.56
400     20             20             49.46   49.27   49.72
400     400            1              49.36   48.29   50.31
1000    1              1000           49.50   49.43   49.50
1000    30             30-35          49.41   49.36   49.57
1000    1000           1              50.12   48.94   48.42


Figure 2.6: Average response time as number of nodes increases

However, as the number of nodes increases, along with the rise in the volume of requests, the overheads for the centralized data centre become significant, since requests remain in the queue relatively longer while waiting for resources to become available.

We also studied how a rise in demand affects the quality of service. Figure 2.7 shows the response time as the number of users in the system increases, for the three scenarios with 100 and 400 nodes. We find that for the federated scenario in the case of 100 nodes, where only 10 ONs are assigned to each SN, the system is not able to cope well with the increase in the number of users, and the performance degrades sharply compared to the other scenarios. We note that, in addition to the availability and distribution of resources, another factor that could affect the performance is the implementation of the cloud coordinator and broker processes, as they may act as a bottleneck under very high load; we therefore also plan to explore their impact on the system in future work.

2.2.2.3 Requests Completion Time

In addition to the response time, Figure 2.8 shows the finish time, i.e. when a request is completed and released back to the system. We find that the behaviour is similar in all three scenarios, because the total number of resources available in the system remains the same in each scenario.


Figure 2.7: Average response time as the number of users increases


Figure 2.8: Finish time of requests for the three scenarios with 400 nodes

Moreover, as the duration of the requests is relatively short and does not vary much, there are no significant delays in the system at any point. It would be interesting to model requests that have different resource requirements and to see how they affect the overall quality of service of the system.


3 Research on infrastructural resource management

3.1 IaaS Service Selection

In an environment such as Clommunity, which consists of multiple small clouds providing IaaS platforms with diverse capacities and qualities of service, located in different geographical areas, there is a need for a service that selects the best cloud provider according to the needs of users. In this section we define the specification and design of an IaaS Service Selector (ISS) to satisfy this requirement in Clommunity.

3.1.1 User Criteria

The user gives a set of specifications for the required resources as an XML input to the ISS. This set of specifications constitutes the user criteria, based on which the ISS finds the best provider. The user criteria include the number of virtual machines, with their number of virtual CPUs, memory size, network bandwidth and persistent disk. There are also application-specific criteria which need to be considered for some applications. For example, for an application such as a video-on-demand service, the geographical location of the service affects the QoS. Furthermore, some distributed applications have multiple components, such as a three-tier web application containing a web server and a backend database; the communication between those components is important for the QoS of the service. Therefore, we also consider application-specific criteria such as geographical location, network topology and average latency between multiple VMs.

3.1.2 Architecture of the IaaS Service Selector

We defined the ISS architecture as a centralised RESTful web service. This provides remote access to the ISS, and it can also become part of a more general system, such as a cloud broker system.

An ISS user must be authenticated before being able to use the service. The ISS user's credentials are their Clommunity username and password. The authentication phase avoids unauthorised access to cloud information, allows a resource quota to be applied per user, and enables the extension of the ISS to provision the actual resources on behalf of the user.

The user criteria can be defined as an XML file with a predefined schema which specifies a valid structure for the user criteria description. After defining the criteria as XML content, the user can send a request including credentials and the criteria description through a POST request to the ISS.

The ISS first tries to authenticate the user by sending an authentication request to the Clommunity Identity Manager. After a successful authentication, it parses the XML-formatted criteria and queries the Clommunity clouds for their available resources. If the user has any application-specific criteria, such as geographical location or average network latency, the ISS collects the related information from the Cloud Monitoring service as well. Based on the criteria and the gathered data, the ISS sorts all the possible cloud providers that can potentially host the user's VMs. A high-level overview of the ISS is shown in Figure 3.1.


Figure 3.1: High level overview of the Cloud service selector
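As an illustration of this request flow, the sketch below shows how such a selection endpoint could look in Python with Flask. The endpoint path, the XML fields and the helper functions (authenticate, query_clouds, rank_providers) are our assumptions for the example, not the actual ISS interface.

```python
# Illustrative sketch of an ISS-style selection endpoint (Flask). The URL,
# XML fields and helper functions are assumptions, not the real ISS API.
import xml.etree.ElementTree as ET
from flask import Flask, request, jsonify, abort

app = Flask(__name__)


def authenticate(username: str, password: str) -> bool:
    # Placeholder: would call the Clommunity Identity Manager here.
    return bool(username and password)


def query_clouds() -> list[dict]:
    # Placeholder: would query each Clommunity cloud for its free resources.
    return [{"cloud": "cloudA", "free_vms": 8, "avg_latency_ms": 20},
            {"cloud": "cloudB", "free_vms": 3, "avg_latency_ms": 5}]


def rank_providers(criteria: dict, clouds: list[dict]) -> list[dict]:
    # Simple comparison policy: keep clouds with enough free VMs,
    # then prefer lower average latency.
    fitting = [c for c in clouds if c["free_vms"] >= criteria["vms"]]
    return sorted(fitting, key=lambda c: c["avg_latency_ms"])


@app.route("/select", methods=["POST"])
def select():
    auth = request.authorization
    if not authenticate(auth.username if auth else "", auth.password if auth else ""):
        abort(401)
    # Example criteria document: <criteria><vms>2</vms><ram>512</ram></criteria>
    root = ET.fromstring(request.data)
    criteria = {"vms": int(root.findtext("vms", "1")),
                "ram_mb": int(root.findtext("ram", "512"))}
    ranked = rank_providers(criteria, query_clouds())
    return jsonify(ranked)
```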

3.1.3 Implementation Plan

We defined the blueprint of the ISS, and the implementation steps are as follows.

In the initial implementation of the ISS, a RESTful web service will be implemented in the Python programming language which accepts user requests and returns a sorted list of clouds that best fit the user's criteria. The schema for defining the quota in XML format will be specified. The internal structure of the web service allows cloud selection policies to be pluggable. As a service selection policy, it uses a simple comparison between the user criteria and the cloud information to find suitable clouds. This release will only support OpenStack clouds and will use the OpenStack Python APIs to leverage the full functionality of the OpenStack services.

In the second release, we will implement more efficient cloud selection policies which can avoid possible issues with the first release, such as a "first come, better service" effect (for instance, high demand on cloud providers with a better quality of service, or on providers located in a highly demanded geographical location, because of their free resources). Due to the scarcity of resources, we need a better cloud selection policy to avoid this problem. We will implement a dynamic pricing method in the ISS, taking into account Clommunity users' credits, to solve it.
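The notion of pluggable selection policies could be realised along the lines of the following sketch. It is only an illustration: the class names and the credit-based weighting are our assumptions, not the planned ISS internals.

```python
# Illustrative sketch of pluggable cloud selection policies; class names and
# the credit-based weighting are assumptions, not the planned ISS internals.
from abc import ABC, abstractmethod


class SelectionPolicy(ABC):
    @abstractmethod
    def rank(self, criteria: dict, clouds: list[dict]) -> list[dict]:
        """Return the clouds ordered from best to worst match."""


class SimpleComparisonPolicy(SelectionPolicy):
    """First release: keep clouds whose free resources cover the request."""

    def rank(self, criteria, clouds):
        fitting = [c for c in clouds if c["free_vms"] >= criteria["vms"]]
        return sorted(fitting, key=lambda c: -c["free_vms"])


class DynamicPricingPolicy(SelectionPolicy):
    """Second release idea: weight popular clouds by a price derived from demand,
    so that a user's credits steer requests away from overloaded providers."""

    def rank(self, criteria, clouds):
        def price(c):
            demand = c.get("pending_requests", 0)
            return demand / max(c["free_vms"], 1)   # busier and smaller -> pricier
        affordable = [c for c in clouds
                      if c["free_vms"] >= criteria["vms"]
                      and price(c) <= criteria.get("credits", float("inf"))]
        return sorted(affordable, key=price)
```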

3.2 Incentive Mechanisms in Community Clouds for Resource Regulation

Community networks are an ecosystem that is able to regulate and maintain itself; some community networks have existed for more than a decade. Participants of a community network not only contribute infrastructure to the network, but also their knowledge, time and effort for the successful operation of the network. We anticipate that cloud infrastructures for community networks will need additional incentive mechanisms in order to achieve sustainability.


We have studied incentive mechanisms for clouds in community networks, keeping in view the key characteristics of community networks and the scenarios we foresee for community clouds. Our approach is to do the evaluation with a prototype [3], which allows us to derive additional conclusions regarding its feasibility for implementation and deployment on a wider scale. Since our community cloud aims to be used in real community networks, our architecture, design, implementation and deployment must fit these conditions and scenarios. We focus our analysis on the Guifi.net community network, which is considered the largest community network worldwide and is where we have also deployed our prototype.

3.2.1 Social aspects of community networks

Personal and social relationships play an important role in the deployment of a community network. The deployment of new nodes needs collaboration among people: if a new node is deployed, the owners of the neighbouring nodes need to connect with it, so there has to be interaction among the people involved. Two types of social networks can be observed from Guifi.net's mailing lists1. One is at the global level of the whole Guifi.net network. In this list, technical issues are discussed; people from any part of the Guifi.net community participate, and even external people who are interested can take part. The second type is the local social network, between node owners within a zone and between neighbouring zones. They use local mailing lists, and some local groups also hold weekly meetings. Guifi.net is organized into zones. A zone can be a village, a small city, a region, or a district of a larger city. The organization of the group within a zone is of many types; mostly the interests, available time and education of the people drive what happens in the zone. We note that while the allocation of IP addresses and layer 3 networking is agreed among all Guifi.net zones, as it is needed to make the IP network work, detailed technical support is rather given within the local community of the zone. Therefore, we identify the zone as having the highest social strength within the community network.

3.2.1.1 Members of community networks

Participants of community networks are principally consumers and producers of the network. Most of them, as producers, contribute infrastructure and time to the network, while as consumers they use the available services the network offers. The community network, however, is not maintained solely based on the contribution of infrastructure: some users must also contribute their time and knowledge. Time is needed, for instance, for maintenance tasks, which may or may not require technical knowledge. Technical knowledge is required because the network is an IP network, which needs to be managed and configured.

3.2.1.2 Resource sharing in community networks

Community networks are a successful case of resource sharing among a collective. The resources shared are networking hardware, but also the time that community network participants donate, to different extents, for maintaining the network. While the community network infrastructure is the sum of the individual contributions of wireless equipment, the network operation is achieved through the contribution of time and knowledge of the participants. This is because, even under the decentralized management of the equipment, the owner of a device ultimately has full access to and control of that network device.

1 http://guifi.net/en/forum


Reciprocal resource sharing is, in fact, part of the membership rules or peering agreements of many community networks. The Wireless Commons License2 (WCL) of many community networks states that network participants who extend the network, e.g. contribute new nodes, will extend the network under the same WCL terms and conditions, allowing traffic of other members to transit over their own network segments. Therefore, resource sharing in community networks from the equipment perspective refers in practice to the sharing of the nodes' bandwidth. This sharing, done in a reciprocal manner, enables traffic from other nodes to be routed over the nodes of different node owners and allows community networks to operate successfully as IP networks. We observe that in most community networks the focus at the moment is on bandwidth sharing alone; there is not much awareness about sharing other computing resources, such as storage or CPU time, inside community networks.

3.2.1.3 Ownership of nodes in community networks

Community networks grow organically. Typically, a new member who wants to connect to the community network contributes the hardware required to connect to other nodes. A node of a community network therefore belongs to the member who is its sole owner, and such a node is normally located in the member's premises. Although less typical, a few nodes in Guifi.net have also been successfully crowd-funded when such a node was needed by several people. Crowd-funding of a node happened when an infrastructure improvement was necessary for a group of people; for example, an isolated zone of Guifi.net established a super node to connect to other zones. In such a case, the node has been purchased with the contributions of many people. The location of such a node follows strategic considerations, trying to maximise the positive effects on performance achieved by the addition of the new infrastructure. We can see that both options, individual ownership and crowd-funding of resources, occur in practice and could be considered for community clouds.

3.2.1.4 Services in community networks

Services and applications offered in community networks usually run on the machines that a member connects to the network, and these machines are used exclusively by that member. The usage of the community network's services among its members, beyond access to the Internet, is however not very strong.

3.2.2 Design of Incentive Mechanisms

The participants in a community network are mainly volunteers, so it is necessary that the community cloud has incentive mechanisms in place that encourage members to contribute their hardware, effort and time. When designing such mechanisms, one has to take into account the heterogeneity of the network, nodes and communication links, since each member brings a widely varying set of resources and physical capacity to the system. As detailed in [1, 2], we propose an incentive mechanism which applies reciprocity-based resource allocation. This is inspired by the Parecon economic model [19, 20], which focuses on social welfare by considering the inequality between nodes. In this model, a node's reward is calculated based on its effort, which is a function of its capacity as well as its contribution to the system.

2 http://guifi.net/es/ProcomunXOLN


The criteria that a SN uses to evaluate requests from ONs are the following. When an ON asks a SN for a resource, which in this case is a number of VMs and the duration for which they are needed, the SN first checks whether the ON's credit is sufficient to cover the cost of the transaction. This cost is proportional to the number of VMs requested and the duration for which they are occupied. If the ON does not have sufficient credit, the request is rejected. However, the SN sometimes allows requests from ONs with zero or negative credit so as to encourage them to participate in the system and earn credit by contributing more VMs. If an ON has enough credit, the SN searches for VMs provided by the ONs in its zone. If the demand cannot be met locally, the SN forwards the request to other SN zones. For each ON which provides VMs, the SN calculates the transaction cost and adds it to that ON's credits, while the cost is deducted from the consumer ON's credits. Once the operation is completed, the effort of each ON involved in the transaction is recalculated. The effort of a node expresses its relative contribution to the system, since the mechanism considers the capacity Ci of a node as well. The significance of this is that a node with low capacity has put in more effort than a node with higher capacity, even if both of them donated an equal number of VMs.
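The following sketch summarises the credit and effort bookkeeping described above. The cost constant, the even split of the cost over providers and the exact effort formula are simplifying assumptions made for illustration; the mechanism in [1, 2] defines these in more detail.

    # Sketch of the credit and effort bookkeeping performed by a SN (illustrative).
    COST_PER_VM_HOUR = 1.0   # assumed pricing unit

    def transaction_cost(num_vms, duration_hours):
        # Cost is proportional to the number of VMs and how long they are occupied.
        return COST_PER_VM_HOUR * num_vms * duration_hours

    def process_request(consumer, providers, num_vms, duration_hours):
        cost = transaction_cost(num_vms, duration_hours)
        if consumer["credit"] < cost and not consumer.get("grace", False):
            return False                          # insufficient credit: request rejected
        consumer["credit"] -= cost
        share = cost / len(providers)             # simplification: cost split evenly
        for p in providers:
            p["credit"] += share
            p["contributed_vms"] += num_vms / len(providers)
            # Effort relates contribution to capacity C_i, so a low-capacity node
            # donating the same number of VMs has put in more effort.
            p["effort"] = p["contributed_vms"] / p["capacity"]
        return True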

3.2.3 Prototype Implementation for Incentive Mechanisms

We have implemented a prototype of the incentive-based regulation mechanism that was proposed in [1, 2]. We implemented the components in the Python programming language and used CouchDB3 as the database. We chose Python because the current host operating system installed on ONs is OpenWRT4, which supports Python but does not support many other languages such as Java. We selected CouchDB because, among its advantages, it is lock-free, schema-less and provides a REST interface, and it is also part of other components of the SN's cloud management software being developed. On the SNs, the Debian operating system is installed.

ONs use the remote procedure call (RPC) mechanism to connect to the SN. First of all, an ON assigns itself to a parent SN with a register message which includes metadata of that ON, such as its IP address, total capacity and number of VMs shared. This registration information is stored in the ON-List database of the parent SN by creating an entry for the corresponding ON. After that, the ON is ready to send request messages to its parent SN. Figure 3.2 shows the request processing algorithm followed by the SN [2]. When an ON requests VMs from its parent SN, it specifies the duration for which it needs to use the VMs. This request is evaluated by applying the incentive and decision mechanisms explained above. If a request cannot be met locally, the corresponding parent SN checks its SN-List database to find another zone with available resources. The interactions between SNs are also made through the RPC mechanism. In the SN controller software, a separate process regularly checks the database for any updates. If a consumer ON's resource request duration has expired, it frees the VMs, makes them available again for the provider ON, and updates the metadata entries of the corresponding ONs in the ON-List database. The current implementation keeps track of the number of VMs contributed and consumed by each ON. The system copes with ONs connecting to and disconnecting from the SN at any time, since ONs periodically send heartbeat messages to the SN. The design allows us to include, in addition, values of metrics like CPU, memory and bandwidth usage, which in the future could be used for fine-grained decisions on resource assignments.
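A minimal sketch of the ON-side interaction, assuming Python's standard xmlrpc library and hypothetical method names exposed by the SN (the actual prototype may use a different RPC framework and message format):

    # Hypothetical ON-side client: register with the parent SN, then request VMs.
    import xmlrpc.client

    sn = xmlrpc.client.ServerProxy("http://10.0.0.1:8000")  # parent SN address (placeholder)

    # Register message with the ON's metadata, stored by the SN in its ON-List database.
    sn.register({"ip": "10.0.0.42", "capacity": 6, "shared_vms": 4})

    # Request two VMs for three hours; the SN applies the incentive mechanism.
    reply = sn.request_vms({"ip": "10.0.0.42", "num_vms": 2, "duration_hours": 3})
    print(reply)

    # Periodic heartbeat so the SN knows the ON is still connected.
    sn.heartbeat({"ip": "10.0.0.42"})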

3 http://couchdb.apache.org
4 http://openwrt.org


Figure 3.2: Details of the VM request operation by an ON

3.2.4 Simulation Experiments for Incentive Mechanisms

We have studied incentive mechanisms for resource regulation within a single SN zone, which corresponds to the local community cloud scenario [1]. We have also extended our simulator to study resource regulation across multiple SN zones, covering both the local and the federated community cloud scenarios [2]. Even though we also implemented and deployed a prototype of the regulation component of the Cloud Coordinator on nodes of a real community network [3], only a handful of nodes are currently available, so analysing our proposed system at a larger scale using the real prototype is too limited. Therefore, we focus on results from the simulation experiments, where our scenario could be extended to a community cloud consisting of 1,000 nodes.

We simulate a community network comprising 1,000 nodes, divided into 100 zones, where each zone has one super node and nine ordinary nodes. The zones are distributed in a small-world topology where each zone is a neighbour of 10 other zones. This approximation holds well for real-world community networks; for example, topology analysis of Guifi.net [15] shows that the ratio of super nodes to ordinary nodes is approximately 1 to 10. Each ordinary node in the simulation can host a number of VM instances that allows users' applications to run in isolation. Nodes in a zone have two main attributes: one is capacity, which is the number of available VM instances, and the other is sharing behaviour, which is how many instances are shared with other nodes. In the configuration used in our experiments, nodes with low, medium and high capacity host 3, 6 and 9 VM instances, respectively,


and they exhibit selfish, normal or altruistic behaviour, sharing one-third, two-thirds or all of their VM instances, respectively. When the experiment runs, nodes make requests for resources proportional to their capacity, asking for two-thirds of their capacity; for instance, nodes with a capacity of 3, 6 and 9 VM instances request 2, 4 and 6 instances, respectively. Nodes request instances for a fixed duration and, after a transaction is complete, wait briefly before making further requests. Our results indicate the impact of the incentive mechanisms on the efficiency of the system and on regulating the resource assignments. The understanding gained from the different experimental results helps in the design of the policies that such an incentive mechanism could follow in a future prototype of a real community cloud system.
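The node population used in the simulation can be summarised by the following sketch, which reproduces the capacity, sharing and request parameters given above; the data structures are illustrative and not the actual simulator code.

    # Illustrative reconstruction of the simulated node population.
    import random

    CAPACITIES = {"low": 3, "medium": 6, "high": 9}               # VM instances hosted
    SHARING = {"selfish": 1/3, "normal": 2/3, "altruistic": 1.0}

    def make_node(capacity_class, behaviour):
        capacity = CAPACITIES[capacity_class]
        return {
            "capacity": capacity,
            "shared": round(capacity * SHARING[behaviour]),       # instances offered to others
            "request": round(capacity * 2 / 3),                   # instances requested per round
        }

    # 100 zones, each with 1 super node and 9 ordinary nodes (1,000 nodes in total).
    zones = [[make_node(random.choice(list(CAPACITIES)), random.choice(list(SHARING)))
              for _ in range(9)]
             for _ in range(100)]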


4 Research on self-management mechanisms in Clommunity

4.1 Monitoring System

The monitoring system [6] is designed specifically to monitor activity on the Clommunity devices. Monitoring the devices presents specific challenges in the form of a large volume of infrequently used data. The monitoring system should support active measurements that provide insight into the functioning of the devices without revealing too much information about what is running on them, and should be flexible enough to add new metrics without hampering its functionality. Monitoring logs should never lose precision, and the system should also support passively measured data such as the last time ssh succeeded, the number of ports in use, and resource hogs (which services are using the most CPU, memory, bandwidth and ports).

At a high level, the monitoring system consists of a monitoring daemon running on each research device, a centralized data gathering and processing infrastructure, and a display facility. The daemon running on the research device provides node-centric data, including service-specific information, and collects measurements periodically (e.g. every sixty seconds). It accepts HTTP requests and responds with HTTP responses, so that it can be accessed from web browsers in addition to being used by automated systems. The response is provided in JSON format to allow researchers to query and use the monitored data. The daemon stores the monitored information in a local file until the data gathering service has seen it. This ensures that no monitored information is lost during a network partition and helps researchers diagnose any problem that may have happened during this period.
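A minimal sketch of such a daemon, built only on the Python standard library, could expose a subset of the collected metrics as JSON over HTTP roughly as follows. The port and metric names are placeholders, and the real daemon additionally buffers samples to a local file until they have been collected.

    # Minimal monitoring daemon sketch: serve a few OS metrics as JSON over HTTP.
    import json, os, time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def collect_metrics():
        load1, _, _ = os.getloadavg()
        stat = os.statvfs("/")
        return {
            "timestamp": int(time.time()),
            "uptime_s": float(open("/proc/uptime").read().split()[0]),
            "load_1min": load1,
            "disk_free_bytes": stat.f_bavail * stat.f_frsize,
        }

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = json.dumps(collect_metrics()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), MetricsHandler).serve_forever()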

While the daemons operate on the research devices, the data gathering and processing runs on a properly provisioned machine. Data is collected from the daemons using a pull model and is fetched every 5 minutes; all fetches are performed in parallel to reduce latency. Global information is generated by analyzing the node-centric logs and is stored in the database. Additionally, information is aggregated and summaries are provided at a granularity necessary to draw meaningful inferences from the data. The precision of the monitored information is never lost, and the system supports data offloading, which is then provided as an open data set.

Monitored information is reported via a web interface that supports sorting and shows graphs of historical data. The reporting currently covers OS-provided metrics and metrics synthesized from other sources on the node. The system reports the following OS-provided metrics: uptime, CPU utilization, memory utilization, total memory, disk size, disk space available, 1-minute load, and network data sent and received. Figure 4.1 shows the CPU usage over time for a research device. Synthesized data includes the last time the monitoring daemon on the research device was seen, open ports, ping status and slice-centric information. The system maintains only a manageable set of metrics that help researchers gain insight into any strange behaviour of a given node. To facilitate selecting the best nodes to run new services, the web interface provides a Treemap view of all the nodes based on the (customizable) historical trend of resource usage. Figure 4.2 shows a Treemap view of the devices based on the resources consumed.


Figure 4.1: Average CPU Usage

Figure 4.2: Treemap representation of resource usages

4.2 Preventive Mitigation of Performance Interference

Typically, cloud services run on virtual machines for consolidation and to ensure operational isolation. Since the resources of a physical host are shared, virtualization does not guarantee isolation in performance and can critically degrade the performance of sensitive services. While co-locating virtual machines improves utilization in resource-shared environments, the resulting performance interference between VMs is difficult to model or predict. The common practice of overprovisioning resources helps to avoid performance interference and guarantee QoS, but leads to low machine utilization. Thus, assuring QoS for sensitive services when allowing co-location is a challenging problem. We present Stay-Away, a generic mechanism that periodically monitors the resource usage metrics of every virtual machine in the host and maps them onto a two-dimensional space while maintaining the topological properties (relative distances) of the higher-dimensional metric space. The monitored resource metrics are multidimensional, as they capture relevant resource metrics such as CPU, memory, disk I/O and network metrics for every VM in the physical machine. If any VM has experienced a violation of its performance metric, the corresponding combination of monitored metrics is recorded as a violated state. The system relies on the application-reported performance metric to detect a QoS violation. Once violated states are captured, the system detects any transition towards them and takes preventive actions to steer performance-sensitive VMs away from degradation.


Figure 4.3: Stay-Away Conceptual Architecture

Figure 4.4: Colocated Execution of VLC with Batch applications

This is achieved by a continuous spatial analysis of the resource usage metrics to identify transitions, their rate and their direction. Upon detection of a transition towards a violated state, the system is tuned and contention is removed by throttling batch applications until the system progresses to a stable state. Additionally, the state representation for a performance-sensitive application is independent of the specific batch applications running on the co-located virtual machines, since the captured states are representative of usage and contention at the resource level. As a result, the captured states for a performance-sensitive application also serve as a template that can be reused for future executions alongside a different set of co-located applications. Figure 4.3 shows the conceptual architecture of Stay-Away.
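The essence of the detection step can be sketched as follows: project the current resource-usage vector into the low-dimensional map, measure its distance to previously recorded violated states, and throttle batch VMs when the trajectory approaches a violation. The projection, the recorded states and the threshold below are placeholders for illustration; Stay-Away's actual mapping preserves the relative distances of the full metric space.

    # Illustrative transition-detection step of a Stay-Away-like controller.
    import math

    violated_states = [(0.8, 0.9), (0.7, 0.95)]   # recorded 2-D positions of QoS violations
    THRESHOLD = 0.15                               # illustrative preventive-action distance

    def distance(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def control_step(current_state, previous_state, throttle, release):
        """current_state/previous_state are 2-D projections of the resource metrics."""
        nearest_now = min(distance(current_state, v) for v in violated_states)
        nearest_before = min(distance(previous_state, v) for v in violated_states)
        if nearest_now < THRESHOLD and nearest_now < nearest_before:
            throttle()   # e.g. reduce the CPU shares of batch containers
        else:
            release()    # restore batch containers once the trajectory is stable again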

We implemented a prototype for LXC containers and experimented with a VLC streaming service as the sensitive application co-located with different batch applications. Our work presents a new way to visualise co-located application behaviour, and the results indicate that with Stay-Away it is possible to guarantee a high degree of QoS while also improving machine utilization. Figure 4.4 shows the visualisation of the co-located execution of the VLC streaming server with other batch applications; red regions indicate violation and blue regions indicate the normal state of execution.


Figure 4.5: Multi-Tier Web 2.0 Application with Elasticity Controller Deployed in a Cloud Environment

4.3 ElastMan: Elasticity Manager for Elastic Key-Value Stores in the Cloud

An elasticity controller is needed in order to improve the utilization of the Cloud resources used by an elastic Cloud-based service. The elasticity controller makes it possible to use the minimal amount of resources needed to provide the required (or acceptable) quality of service and, as a consequence, to avoid over-provisioning of resources under a low workload. The elasticity controller automatically resizes an elastic service in response to changes in workload, in order to meet Service Level Objectives (SLOs) at a reduced cost. However, the variable performance of Cloud virtual machines and the nonlinearities in Cloud services complicate the controller design.

In this research, we target multi-tier Web 2.0 applications (the left side of Figure 4.5) that can be provided, in particular, in a community network. Web 2.0 applications, such as social networks, wikis, and blogs, are data-centric with frequent data access. We focus on managing the data tier because of its major effect on the performance of Web 2.0 applications, which are mostly data-centric. For the data tier, we assume horizontally scalable key-value stores, due to their popularity in many large-scale Web 2.0 applications such as Facebook and LinkedIn. A typical key-value store provides a simple put/get interface. This simplicity enables efficient partitioning of the data among multiple servers and thus allows the store to scale well to a large number of servers. In CLOMMUNITY, an elastic key-value store with the elasticity controller can be provided as community storage or used as part of a Cloud-based (Cloud-assisted) application deployed on a multi-cloud infrastructure in a community network.

The strict performance requirements posed on the data tier of a multi-tier Web 2.0 application, together with the variable performance of Cloud VMs and the dynamic workload, make it challenging to automate elasticity.

We present the design and evaluation of ElastMan [21][22], an elasticity controller for Cloud-based elastic key-value stores. A more detailed description of ElastMan can be found in [22].

ElastMan is an autonomic elasticity manager that automatically resizes (scales out or shrinks) an elastic service in response to changes in workload, in order to meet SLOs at a reduced cost. In order to achieve this in an efficient and effective way, ElastMan combines feedforward and feedback control.

The feedforward controller of ElastMan is used to quickly respond to sudden large changes (spikes) in the workload.


Figure 4.6: ElastMan Feedforward Control

Figure 4.7: ElastMan Feedback Control

It monitors the workload and uses a logistic regression model of the service to predict whether the workload will cause the service to violate the SLOs, and acts accordingly. When the workload grows rather fast, the controller computes the number of servers to be added in order to avoid a possible SLO violation. If the workload quickly drops, the controller requests the removal of a number of servers in order to reduce the service cost while still meeting the required SLO.

The feedback controller is used to correct errors in the model used by the feedforward controller and to handle gradual (e.g., diurnal) changes in workload. It monitors the service performance and reacts based on the amount of deviation from the desired performance specified in the SLO. We have implemented and evaluated ElastMan using the Voldemort key-value store running in an OpenStack Cloud. The Voldemort key-value store is used in production in many applications such as LinkedIn. Our evaluation results presented below show the feasibility and effectiveness of our approach to the automation of Cloud service elasticity.

4.3.1 Controller Design

The objective of ElastMan is to regulate the performance of key-value stores according to a predefined SLO expressed as the 99th percentile of read operation latency over a fixed period of time (R99p hereafter). To address the challenges of controlling a noisy signal and the variable performance of VMs, ElastMan consists of two components, a feedforward controller (Figure 4.6) and a feedback controller (Figure 4.7). The actuator in both controllers uses the Cloud API to request/release resources, the elasticity API to add/remove servers, and the rebalance API to redistribute the data among servers. ElastMan relies on the feedforward controller to handle rapid large changes in the workload (e.g., spikes). This enables ElastMan to smooth the noisy 99th percentile signal and use the PI feedback controller to correct errors in the feedforward system model in order to accurately bring the 99th percentile of read operations to the desired SLO value. In other words, the feedforward control is used to quickly bring the performance of the system near the desired value, and then the feedback control is used to fine-tune the performance.

Due to the nonlinearities in elastic Cloud services, resulting from the diminishing reward of adding a service instance (VM) as the scale increases, we propose a scale-independent model used to design the feedback controller. This enables the feedback controller to operate at various scales of the service without the need for techniques such as gain scheduling. To achieve this, our design leverages the near-linear scalability of an elastic service.

In the design of the feedback controller, we propose to model the target store using the average throughput per server as the control input.


Figure 4.8: Binary Classifier for One Server

Although we cannot control the total throughput on the system, we can indirectly control the average throughput of a server by adding or removing servers: adding servers reduces the average throughput per server under the same load, whereas removing servers increases it. Thus, the controller decisions become independent of the number of service instances. The major advantage of our proposed approach to modelling the store is that the model remains valid as we scale the store, since it does not depend on the number of servers.

For the model in the feedforward controller we use a binary classifier built using logistic regression. The model is trained offline by varying the average intensity and the ratio of read/write operations per server, as shown in Figure 4.8. The classifier splits the workload plane into two regions: in the region on and below the model line, the SLO is met; in the region above the line, the SLO is violated. Ideally, the average measured throughput should be on the line, which means that the SLO is met with the minimal number of servers.
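A sketch of how such a binary classifier could be trained offline is given below, assuming scikit-learn is available and that each training sample is the per-server read and write throughput labelled with whether the SLO was violated; the sample numbers are made up for illustration.

    # Train a binary SLO classifier on offline measurements (illustrative data).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Each row: [read ops/s per server, write ops/s per server]; label 1 = SLO violated.
    X = np.array([[500, 100], [800, 200], [1200, 300], [1500, 500], [2000, 600]])
    y = np.array([0, 0, 0, 1, 1])

    model = LogisticRegression().fit(X, y)

    def workload_violates_slo(read_tp, write_tp):
        return bool(model.predict([[read_tp, write_tp]])[0])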

4.3.2 Elasticity Control Algorithm of ElastMan

ElastMan combines the feedforward and feedback controllers, which complement each other. The feedforward controller relies on the feedback controller to correct errors in the feedforward model, while the feedback controller relies on the feedforward controller to respond quickly to spikes, so that the noisy R99p signal that drives the feedback controller can be smoothed. ElastMan starts by measuring the 99th percentile of read latency (R99p) and the average throughput (tp) per server. The R99p signal is smoothed using a smoothing filter, resulting in a smoothed signal (fR99p). The controller then calculates the error, which is the difference between the setpoint, in our case the SLO value of R99p, and the measured system output. If the error is within the deadzone defined by a threshold around the desired R99p value, the controller takes no action. Otherwise, the controller compares the current tp with the value from the previous round. A significant change in the throughput (workload) indicates a spike; the elasticity controller then uses the feedforward controller to calculate the new average throughput per server needed to handle the current load.


Figure 4.9: Performance of Voldemort without ElastMan with a fixed number of servers (18 servers) under gradual diurnal workload (0-900 min) and under workload with spikes (900-1500 min)

On the other hand, if the change in the workload is relatively small, the elasticity controller uses the feedback controller, which calculates the new average throughput per server based on the current error. In both cases the actuator uses the current total throughput and the new average throughput per server to calculate the new number of servers. During the rebalance operation performed when adding or removing servers, both controllers are disabled in order not to be misled by the rebalancing activities.
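Putting the two controllers together, one iteration of the ElastMan-style control loop can be summarised by the following sketch; the thresholds, the gain and the helper functions are illustrative placeholders and not ElastMan's actual parameters.

    # One control round of an ElastMan-style elasticity controller (illustrative).
    SLO_R99P = 5.0        # ms, desired 99th percentile of read latency
    DEADZONE = 0.5        # ms, no action while |error| is below this threshold
    SPIKE_RATIO = 0.3     # relative throughput change treated as a spike
    KP = 50.0             # proportional gain of the feedback controller (ops/s per ms)

    def control_round(fr99p, tp_total, tp_per_server, prev_tp_total,
                      feedforward_model, current_servers):
        error = SLO_R99P - fr99p
        if abs(error) < DEADZONE:
            return current_servers                    # inside the deadzone: do nothing
        if prev_tp_total and abs(tp_total - prev_tp_total) > SPIKE_RATIO * prev_tp_total:
            # Spike: ask the feedforward model for the per-server throughput
            # that keeps the workload just below the SLO boundary.
            new_tp_per_server = feedforward_model(tp_total)
        else:
            # Gradual change: feedback correction of the per-server throughput.
            new_tp_per_server = tp_per_server + KP * error
        return max(1, round(tp_total / new_tp_per_server))   # new number of servers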

4.3.3 Evaluation Results

We have implemented ElastMan in order to evaluate our proposed approach to the automation of Cloud service elasticity. For the evaluation we have chosen the Voldemort (version 0.91) key-value store [4], which is used in production in many applications, for example at LinkedIn. We run our experiments in the OpenStack-based cloud infrastructure at KTH, provided on a cluster of 11 servers, each with two Intel Xeon X5660 processors (24 hardware threads) and 44 GB of memory. The cluster runs Ubuntu 11.10, and we set up the private Cloud at KTH using the OpenStack Diablo release. We have tested the ElastMan controller with both a gradual diurnal workload and sudden changes (spikes) in the workload. The goal of the ElastMan controller is to keep R99p at the value specified in the service SLO; in our experiments we choose this value to be 5 ms over a 1-minute period. Figure 4.9 depicts the performance of Voldemort without ElastMan, i.e., with a fixed number of servers. The results show that the store cannot meet the required SLO most of the time. Figure 4.10 depicts the performance of Voldemort with ElastMan. The results show that ElastMan is able to keep R99p within the desired region most of the time, under a gradual workload (0-900 min) as well as under a workload with spikes (900-1500 min).

4.3.4 Future Work

In our future work on elasticity automation, we plan to investigate the elasticity controllers needed to control other (if not all) tiers of a Web 2.0 application and the orchestration of the controllers in order to correctly achieve their goals.


Figure 4.10: Performance of Voldemort with ElastMan under gradual diurnal workload (0-900 min) and under workload with spikes (900-1500 min)

We intend to deploy ElastMan with an elastic key-value store in the CLOMMUNITY cloud on the guifi.net community network, and we plan to integrate ElastMan with other services developed in the CLOMMUNITY project.

4.4 BwMan: Bandwidth Manager for Services in the Cloud

The flexibility of Cloud computing allows elastic services to adapt to changes in the workload in order to achieve the desired Service Level Objectives (SLOs) at a reduced cost. Typically, a service adapts to workload changes by adding or removing service instances (VMs), which for stateful services requires moving data among instances. The SLOs of a Cloud-based service are sensitive to the amount of available network bandwidth, which is usually shared by various service activities without being explicitly allocated and managed as a resource. We present the design and evaluation of BwMan, a network bandwidth manager for services in the Cloud. BwMan predicts and performs the bandwidth allocation and the tradeoffs between multiple service activities in order to meet SLOs. To make management decisions, BwMan uses statistical machine learning (SML) to build predictive models that allow it to arbitrate the available bandwidth among different activities so as to satisfy the specified SLOs. We have implemented and evaluated BwMan for the OpenStack Swift object store. Our experiments show that using BwMan reduces SLO violations in Swift by at least a factor of two.

4.4.1 Motivation

In order to improve user experience with a Cloud-based or Cloud-assisted service at a minimal cost, it is essential to automate resource provisioning so that the service can be resized in response to workload changes at runtime without violating SLOs and without over-provisioning Cloud resources.


Issues to be considered when building systems that automatically scale in terms of server capabilities, CPU and memory are fairly well understood by the research community. However, efficient and effective network resource management is still an open issue. In most recent Cloud-based systems, services and applications, network bandwidth is usually not explicitly allocated and managed as a shared resource. Network bandwidth should be considered a first-class managed resource of a Cloud infrastructure deployed in a community network, where network bandwidth is the major resource. In this work, we demonstrate the necessity of managing the network bandwidth shared by services running on the same platform, especially when the services are bandwidth intensive. Network bandwidth can be shared among multiple individual applications or among multiple services of a single application deployed on the same platform. In essence, both cases can be solved using the same bandwidth management approach; the difference is that the bandwidth allocation is done at different granularities, for example at the level of VMs, applications or threads. In our work, we have implemented the finest bandwidth control granularity, i.e., the network port level, which can easily be adapted to the usage scenarios of VMs, applications, or services. Specifically, our approach is able to distinguish bandwidth allocations to different ports used by different services within the same application. In fact, this fine-grained control is needed in many distributed applications, where multiple concurrent threads create workloads that compete for bandwidth resources. As a use case we consider a Cloud storage service, namely the OpenStack Swift object store. There are two kinds of workload in a storage service. First, the system handles the dynamic workload generated by the clients, which we call the user-centric workload. Second, the system handles the workload related to system maintenance, including load rebalancing, data migration, failure recovery, and dynamic reconfiguration (e.g., elasticity); we call this the system-centric workload. From our experimental observations, in a distributed storage system both the user-centric and the system-centric workloads are network bandwidth intensive. Arbitrating the allocation of bandwidth between these two kinds of workload is challenging: insufficient bandwidth allocation for the user-centric workload might lead to SLO violations, whereas the system may fail when insufficient bandwidth is allocated for data rebalancing and failure recovery, i.e., the system-centric activities. We propose the design of BwMan, a network bandwidth manager for elastic Cloud services. BwMan arbitrates the bandwidth allocation among individual services and among different service activities sharing the same Cloud infrastructure. Specifically, BwMan arbitrates the available inbound and outbound bandwidth of servers, i.e., the bandwidth at the network edges, among multiple hosted services, whereas the bandwidth allocation of particular network flows in switches is not under BwMan's control. In most deployments, services may not be able to control the bandwidth allocation inside the network anyway.

4.4.2 Predictive Models of the Target System

The BwMan bandwidth manager uses easily computable predictive models to foresee system performance under a given workload in correlation with the bandwidth allocation. As there are two types of workload in the system, namely user-centric and system-centric, we show how to build two predictive models. The first model defines the correlation between the user-oriented performance metrics under user-centric workload and the available bandwidth. The second model defines the correlation between the system-oriented performance metrics under system-centric workload and the available bandwidth. First, we analyze the read/write (user-centric) performance of the system under a given network bandwidth allocation. In order to make decisions on bandwidth allocation against read/write performance, BwMan uses a regression model of performance as a function of the available bandwidth.


Figure 4.11: Regression Model for System Throughput vs. Available Bandwidth

Figure 4.12: Regression Model for Recovery Speed vs. Available Bandwidth

The model can be built either off-line, by conducting experiments over a rather wide (if not complete) operational region, or on-line, by measuring performance at runtime. In this work, we present the model trained off-line for the OpenStack Swift store by varying the bandwidth allocation and measuring the system throughput, as shown in Figure 4.11. The model is set up in each individual storage node. Based on the monitoring of the incoming workload, each storage node is assigned the demanded bandwidth by BwMan in one control loop. The simplest computable model that fits the gathered data is a linear regression of the following form:

Throughput [op/s] = α ∗ Bandwidth + β

Next, we analyse the correlation between system-centric performance and available bandwidth, namely the data recovery speed under a given network bandwidth allocation. By analogy to the first model, the second model was trained off-line by varying the bandwidth allocation and measuring the recovery speed under a fixed failure rate. The difference is that the prediction is conducted centrally based on the monitored system data integrity, and bandwidth is allocated homogeneously to all storage servers. For the moment, we do not consider fine-grained monitoring of data integrity on each storage node; we treat data integrity at the system level.

The model that fits the collected data and correlates the recovery speed with the available bandwidth is a regression model whose main feature is of a logarithmic nature, as shown in Figure 4.12. The concise mathematical model is:

RecoverySpeed [MB/s] = α ∗ ln(Bandwidth) + β
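Both models can be fitted with ordinary least squares from a handful of offline measurements, for example as in the sketch below. The sample points are invented for illustration and are not the measurements behind Figures 4.11 and 4.12.

    # Fit the two BwMan models from offline samples (illustrative values).
    import numpy as np

    bw = np.array([10, 20, 40, 60, 80, 100], dtype=float)               # MB/s available
    throughput = np.array([55, 110, 220, 330, 430, 540], dtype=float)   # op/s observed
    recovery = np.array([5, 9, 13, 15, 17, 18], dtype=float)            # MB/s observed

    # Throughput = alpha_t * Bandwidth + beta_t
    alpha_t, beta_t = np.polyfit(bw, throughput, 1)

    # RecoverySpeed = alpha_r * ln(Bandwidth) + beta_r
    alpha_r, beta_r = np.polyfit(np.log(bw), recovery, 1)

    def bandwidth_for_throughput(target_ops):
        return (target_ops - beta_t) / alpha_t

    def bandwidth_for_recovery(target_mb_s):
        return float(np.exp((target_mb_s - beta_r) / alpha_r))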


Figure 4.13: BwMan Control Workflow

4.4.3 BwMan: BandWidth Manager

BwMan operates according to the MAPE-K loop; the flowchart of BwMan is shown in Figure 4.13. BwMan monitors three signals, namely the user-centric throughput (defined in the SLO), the actual workload on each storage server, and the data integrity in the system. At given time intervals, the gathered data are averaged and fed to the analysis modules; the results of the analysis based on our regression models are then passed to the planning phase to decide on actions based on the SLOs and potentially to make tradeoff decisions. The results from the planning phase are executed by the actuators in the execution phase. Figure 4.13 depicts the MAPE phases as designed for BwMan. For the Monitor phase, we have two separate monitor ports, one for the user-centric throughput (M1) and the other for data failure rates (M2). The outputs of these stages are passed to the Analysis phase, represented by two calculation units, namely A1 and A2, that aggregate and calculate the new bandwidth availability, allocation and metrics to be used during the Planning phase according to the models trained in the previous section. The best course of action to take during the Execution phase is chosen based on the bandwidth calculated as necessary for the user-centric workload (SLO) and the current data failure rate, estimated from the system data integrity in the Planning phase. The execution plan may also include a tradeoff decision in the case of bandwidth saturation. Finally, during the Execution phase, the actuators are employed to modify the current state of the system, i.e., to apply the new bandwidth allocations for the user-centric workload and for the system-centric (failure recovery) workload on each storage server.
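One round of this MAPE loop could be sketched as follows, using the two fitted models from the previous section to translate the SLO and the observed data integrity into bandwidth quotas, and applying a user-centric-first tradeoff policy when the link is saturated. All names, the recovery-speed target and the actuation step are illustrative placeholders.

    # One MAPE round of a BwMan-style bandwidth arbiter (illustrative sketch).
    def bwman_round(slo_throughput, data_integrity, total_bw,
                    bandwidth_for_throughput, bandwidth_for_recovery, set_quotas):
        # Analyse: bandwidth needed to meet the user-centric throughput SLO.
        user_bw = bandwidth_for_throughput(slo_throughput)

        # Analyse: the desired recovery speed grows with the amount of missing data;
        # the recovery model translates it into a bandwidth demand.
        missing = 1.0 - data_integrity
        recovery_bw = bandwidth_for_recovery(10.0 * missing) if missing > 0 else 0.0

        # Plan: on saturation, the policy favours the user-centric workload.
        if user_bw + recovery_bw > total_bw:
            recovery_bw = max(0.0, total_bw - user_bw)

        # Execute: actuators apply the quotas on each storage server.
        set_quotas(user_bw, recovery_bw)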

4.4.4 Evaluation Results

We run our experiments in the OpenStack-based cloud infrastructure at KTH, provided on a cluster of 11 servers, each with two Intel Xeon X5660 processors (24 hardware threads) and 44 GB of memory. We have deployed a Swift cluster with a ratio of 1 proxy server to 8 storage servers, as recommended in the OpenStack Swift documentation. We demonstrate that BwMan allows meeting the SLO according to the specified tradeoff policies when the total available bandwidth is saturated by the user-centric and system-centric workloads. In our experiments, we have chosen to give preference to the user-centric workload, namely system throughput, over the system-centric workload, namely data recovery. Thus, bandwidth allocated to data recovery may be sacrificed to ensure conformance to the system throughput SLO in case of tradeoffs. Figure 4.14 and Figure 4.15 depict the results of our experiments, conducted simultaneously in the same time frame; the x-axes share the same timeline. The failure scenario is introduced by our failure simulator. Figure 4.14 presents the achieved throughput when executing the user-centric workload without bandwidth management, i.e., without BwMan. In these experiments, the desired throughput starts at 80 op/s, then increases to 90 op/s at about 70 min, and then to 100 op/s at about 140 min. The results indicate a high presence of SLO violations (about 37.1%) with relatively high fluctuations of the achieved throughput.


Figure 4.14: Throughput of Swift without BwMan

Figure 4.15: Throughput of Swift with BwMan

Figure 4.15 shows the achieved throughput in Swift with BwMan. In contrast to Swift without bandwidth management, the use of BwMan allows the service to achieve the required throughput (meet the SLO) most of the time (about 8.5% violations) with relatively low fluctuations of the achieved throughput. The results demonstrate the benefits of BwMan in reducing the SLO violations by at least a factor of 2 given a 5% interval, and by a factor of 4 given a 15% interval.

4.4.5 Conclusion

We have presented the design and evaluation of BwMan, a network bandwidth manager providing model-predictive, policy-based bandwidth allocation for elastic services in the Cloud. For dynamic bandwidth allocation, BwMan uses predictive models, built with statistical machine learning, to decide bandwidth quotas for each service with respect to the specified SLOs and policies. Tradeoffs need to be handled among services sharing the same network resource, and specific tradeoff policies can easily be integrated in BwMan. We have implemented and evaluated BwMan for the OpenStack Swift store. Our evaluation has shown that by controlling the bandwidth in Swift, we can ensure that the network bandwidth is effectively arbitrated and allocated for the user-centric and system-centric workloads according to the specified SLOs and policies. Our experiments show the effectiveness of BwMan, which reduces SLO violations by at least a factor of two.


5 Research on Scalable Service Overlays and Clommunity Services

5.1 CaracalDB

Distributed storage services form the backbone of modern large-scale applications and data processing solutions. In this integral role they have to provide a scalable, reliable and performant service. One of the major challenges any distributed storage system has to address is skew in the data load, which can be skew either in the distribution of data items or in data access over the nodes in the system. One widespread approach to dealing with skewed load is data assignment based on uniform consistent hashing. However, there is an opposing desire to optimise for and exploit data locality, that is, to collocate items that are typically accessed together. Often this locality property can be achieved by storing keys in an ordered fashion and using application-level knowledge to construct keys in such a way that items accessed together end up very close together in the key space. It can easily be seen, however, that this behaviour exacerbates the load-skew issue. A different approach to load balancing is to partition the data into small subsets which can be relocated independently; these subsets may be known as partitions, tablets or virtual nodes, for example. Our system, CaracalDB, is a distributed key-value store which provides automatic load balancing and data locality, as well as fast re-replication after node failures, while remaining flexible enough to support a choice of different consistency levels.

5.1.1 Introduction

When systems are deployed over a large number of machines, the probability of failure increases linearly with their number. Hence, it is well accepted that in distributed systems failures are the norm and not the exception. It is imperative for almost all distributed storage systems that data be replicated across a number of machines, so that a single failure cannot lead to immediate service interruptions. However, it is also important to take measures that reduce the probability of data loss in the face of multiple concurrent or consecutive failures. The two most prominent parameters that influence this probability are the replication degree δ and the re-replication time, that is, the time after a failure at which δ copies of every data item are again available in the system. For most replication algorithms, however, increasing δ degrades the performance of the system when there are no failures, making it a tunable that needs to be considered very carefully. This relationship is not so clear-cut for the re-replication time: in the optimal case there is no interplay between steady-state performance and re-replication time. A number of parameters influence the actual time it takes to re-replicate data after a failure, but basically it comes down to how many disks are participating and where the network will bottleneck. There have been significant improvements in recent years to remove the network bottleneck in datacenters [23, 24] as well as to push the number of participating disks [25]. It has been known since the original Chord paper [26] that virtual nodes (vnodes) can be used to improve distributed hash tables in terms of load balancing and re-replication speed.


However, modern distributed key-value stores like Riak [27] partition their key-space during the bootstrapping of the system, and the partitioning cannot be changed at run-time. This effectively means that as the system grows in terms of hosts, the re-replication time for a single host failure actually grows, since the number of vnodes per host shrinks. We propose a system where the number of partitions is dynamic and increases with the number of hosts, such that the number of vnodes per host stays approximately constant and the re-replication time for single host failures decreases with the number of hosts in the system.

5.1.2 Scalable Fast Re-replication

Table 5.1: Re-replication performance analysis for r = 100 and 1TB of data.

Re-replication time (network speed | 2 SSDs (600MB/s) | 1 HDD (130MB/s) | 1 HDD (130MB/s) + CLOS)

1 partition per host:
  10Gbit/s   30min   2.5h    2.5h
  1Gbit/s    2.5h    2.5h    2.5h
  80Mbit/s   30h     30h     30h
  8Mbit/s    12d     12d     12d

3 partitions per host (SL-rep with δ = 3):
  10Gbit/s   14min   43min   43min
  1Gbit/s    2.5h    2.5h    43min
  80Mbit/s   30h     30h     9.5h
  8Mbit/s    12d     12d     4d

10 partitions per host:
  10Gbit/s   14min   14min   13min
  1Gbit/s    2.5h    2.5h    14min
  80Mbit/s   30h     30h     3h
  8Mbit/s    12d     12d     9.5h

50 partitions per host:
  10Gbit/s   14min   14min   10s
  1Gbit/s    2.5h    2.5h    2min
  80Mbit/s   30h     30h     20min
  8Mbit/s    12d     12d     3h

We argue that for hosts with total storage capacity Ch, using a number nv of vnodes of a fixed size Cv such that nv · Cv ≤ Ch can improve the performance of data transfer after node failures (re-replication) significantly. In order to see why a system with Cv = Ch might be slow at re-replication, consider the following scenario. In a system S = {a, b, c, d}, let hosts a, b, c form a replication group such that all data is 3-way replicated (replication degree δ = 3) among them, and let d be an empty standby host. Let us further assume that a, b, c are filled up to their capacity Ch and the network capacity between each pair of hosts in the system is at least vS, the capacity of the slowest link between two hosts. For a datacenter deployment vS may be 1 or 10 Gbit/s, across the internet maybe 80 Mbit/s. If each host stores its data on a single hard disk drive (HDD) it might be able to transfer data at vD = 130 MB/s, or 300 MB/s for a solid state disk (SSD).


Now if a were to fail, the data stored on b and c would have to be re-replicated to d in order to maintain the replication degree. This can happen only with transfer rate vT = min{vS, vD}, since all data needs to pass to a single node d. Hence the re-replication time, and with it the time during which another failure might lead to inconsistent data and a third one to total data loss, is tT = Ch / vT. For example, if Ch = 1TB, in the best-case scenario vT = 600MB/s (2 SSDs and 10Gbit/s Ethernet) we get tT ≈ 30min. In the worst case we are writing to a single HDD over an internet-speed link, so vT = 10MB/s, resulting in tT ≈ 30h. Table 5.1 gives an idea of tT for different vT.

Now instead, we can use clever assignments of vnodes to replication groups in order to improve the re-replication time in case of host failures. This is an idea that was exploited to an extreme in [25], but it also applies to cases with coarser partitioning. Firstly, we note that in the above case the other hosts in the system are sitting idle in terms of re-replication of the data from a; they might of course be running normal operations or re-replication from another failure. As opposed to this, consider a system S = {hi | i ∈ {1, . . . , r}} where the data is partitioned and the partitions are mapped to a number of different replication groups. If there are p partitions, all with replication degree δ (for example 3, as above), over the r hosts in S, that means there will be E[p] = p·δ/r vnodes per host on average, if the system is in a well-balanced state. Also assume that the system has some free capacity on every host, that is, Ch − E[p] · Cv > Cv. Now consider what happens if a single host hf fails. For large p/r, the percentage of hosts in the system that have some replication group in common with hf is close to 100%. If the system re-replicates by assigning the failed vnode of every partition on hf to a random other host that does not already have a replica of the data, for example, the result will be that re-replication is done using (almost) all available disks in the cluster. Thus, in a network that is bottlenecked at vS we get vT = min{vS, r · vD}. If the network provides full bisection bandwidth (fbb), like a CLOS [23, 24] network for example, then we even get vT = min{r · vS, r · vD}, since every host will most likely be transferring data to and from a number of different hosts. If the network topology is such that some hosts are connected better than others, the actual tT will be somewhere between the first and the second case, as shown in Table 5.1. Of course, this also means that normal operations for all partitions in the cluster are affected. Yet, we argue that it is preferable to have a latency increase on all operations in the cluster for a relatively short time than to have a latency increase in only a small subset but for a very long time, with the added possibility of data loss.
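The relationship between partitioning, disk speed and network speed summarised in Table 5.1 can be reproduced with a few lines of arithmetic. The helper below follows the idealised model above and only distinguishes the single-partition case from the case where enough partitions exist for (almost) all r hosts to participate, so it approximates rather than reproduces the intermediate rows of the table.

    # Idealised re-replication time for a failed host holding `data_tb` of data.
    def rereplication_time_hours(data_tb, disk_mb_s, net_mb_s,
                                 hosts=100, partitioned=False, full_bisection=False):
        data_mb = data_tb * 1e6
        if not partitioned:
            # All data converges on a single replacement host: vT = min(vS, vD).
            rate = min(net_mb_s, disk_mb_s)
        else:
            # Re-replication spreads over (almost) all hosts and disks:
            # vT = min(vS, r*vD), or min(r*vS, r*vD) with full bisection bandwidth.
            net_rate = hosts * net_mb_s if full_bisection else net_mb_s
            rate = min(net_rate, hosts * disk_mb_s)
        return data_mb / rate / 3600

    # 1TB, single HDD (130 MB/s), 1 Gbit/s (~125 MB/s) network:
    print(rereplication_time_hours(1, 130, 125))                 # roughly 2 hours
    print(rereplication_time_hours(1, 130, 125, partitioned=True,
                                   full_bisection=True))         # a couple of minutes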

5.2 Distributed Search Service

The video-on-demand service is not practical without an ancillary search service that enables users to discover the content available in the network. Our main requirements on the search service are that it should:

• be reliable and provide results with low latency;
• be general enough to be used by many different services, not just video-on-demand;
• provide consistent responses to queries;
• be as robust as possible to network partitions;
• not over-consume the limited resources in the CLOMMUNITY Cloud.

The index will be user-generated and user-maintained. Users will be able to add new index entries, for example, when they make a new video available in the system. We see our main research challenges as:

• building a reliable, low-latency search service using unreliable nodes and an unreliable network;


Figure 5.1: CLOMMUNITY-Supported Search Architecture.

• ensuring that the service works for nodes behind NATs;

• ensuring the integrity of the search index at all times, given that users are responsible for updating the index.

5.2.1 Index Entries

The following is an example of the structure of a search index entry that could be used by our system:

• IndexId (required, not searchable)

• URL (required, not searchable)

• Social network ID of the person who published the index entry (required, not searchable)

• Name of file (required, searchable)

• File size (not required, searchable)

• Date uploaded (not required, searchable)

• Language (not required, searchable)

• Category of the Content (video, books, music, games, etc.) (required, searchable)

• Free text description of the file, up to a limit of 5000 characters (required, searchable)

• InfoHash of the searchable contents (required, not searchable)

• Availability status for the content (required, searchable)

We should strive to make index entries immutable, as this simplifies protocols for replicating the index among nodes. However, in the above example for an index entry, the availability status for the content would be mutable. We could treat that data as separate from the index entry, and as data that could be discovered at runtime using the service responsible for the content – e.g., the video-on-demand service.
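As an illustration only (the deliverable does not fix a concrete schema or serialization format), the entry described above could be modelled roughly as follows, keeping the mutable availability status outside the immutable entry:

    from dataclasses import dataclass
    from typing import Dict, Optional

    @dataclass(frozen=True)                  # frozen: index entries are meant to be immutable
    class IndexEntry:
        index_id: str                        # required, not searchable
        url: str                             # required, not searchable
        publisher_social_id: str             # required, not searchable
        file_name: str                       # required, searchable
        category: str                        # required, searchable (video, books, music, games, ...)
        description: str                     # required, searchable, at most 5000 characters
        info_hash: str                       # required, not searchable
        file_size: Optional[int] = None      # optional, searchable (bytes)
        date_uploaded: Optional[str] = None  # optional, searchable (e.g. ISO-8601 date)
        language: Optional[str] = None       # optional, searchable

        def __post_init__(self):
            if len(self.description) > 5000:
                raise ValueError("description exceeds 5000 characters")

    # Mutable state is kept out of the entry, e.g. keyed by info_hash and resolved at
    # runtime by the service that owns the content (such as video-on-demand).
    availability_status: Dict[str, str] = {}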


Figure 5.2: Decentralized Search Architecture.

5.2.2 Search Service Architecture

Architecturally, there are two different approaches we could follow to build our search service. We could host the search service and the entire index on CLOMMUNITY Cloud nodes, or we could distribute the search index across potentially all nodes in the wireless community network.

• CLOMMUNITY-supported search service: A CLOMMUNITY-supported search service (Figure 5.1) would require us to use resources at CLOMMUNITY nodes to provide the search service, to store the search index and to replicate the search index among all nodes in the CLOMMUNITY cloud. Clients would locally build a model of the expected response time for some of the CLOMMUNITY nodes, and greedily route search requests to the node with the lowest expected latency. The Cloud-supported search service could be implemented as an asynchronously replicated search service, as high-latency links and unreliable connections prevent the use of either strongly consistent replication algorithms or quorum-based replication algorithms. Search queries would have eventual consistency semantics, meaning that, eventually, all index updates will reach all CLOMMUNITY nodes. However, for a small amount of time, users may not be able to find content until the index entry referring to the content has been replicated to the CLOMMUNITY node to which they sent their query.

• Fully decentralized search service: We will also investigate the idea of building a fully decentralized search service, based on a partitioned index (Figure 5.2). Nodes would store partitions of the index locally, probably in a local instance of Apache Lucene. Nodes within the same partition would gossip updates to the index with each other. This service would again provide eventual consistency semantics for the search service. Nodes would add an index entry by contacting a leader node within their partition, who would then be responsible for ensuring the consistency of the index entry and that it is safely replicated to enough nodes. When a node searches for an index entry, it needs to send the query to at least one node in all partitions. In practice, nodes will send requests to more than one node per partition and use the result from the first node to respond from each partition. The query will return results to the user either when a node from each partition has returned results or when a timer expires (say 5 seconds); a sketch of such a fan-out query is given below. We can constrain users to search within categories of content, significantly reducing the system overhead of a search query.
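The following sketch illustrates the per-partition fan-out just described. It is not project code: the partition map, the query_node placeholder and the node names are assumptions; a real implementation would query remote Lucene instances over the network.

    import concurrent.futures as cf
    import random

    # Hypothetical cluster view: partition id -> nodes holding a replica of that index shard.
    PARTITIONS = {
        0: ["node-a", "node-b", "node-c"],
        1: ["node-d", "node-e"],
        2: ["node-f", "node-g", "node-h"],
    }

    def query_node(node, query):
        """Placeholder for a remote call to the node's local index (e.g. Lucene)."""
        return ["%s hit %d for '%s'" % (node, i, query) for i in range(random.randint(0, 3))]

    def search(query, nodes_per_partition=2, timeout_s=5.0):
        """Send the query to a few nodes per partition, keep the first answer from each
        partition, and return when all partitions answered or the timer expires."""
        results, answered = [], set()
        with cf.ThreadPoolExecutor() as pool:
            futures = {}
            for pid, nodes in PARTITIONS.items():
                for node in random.sample(nodes, min(nodes_per_partition, len(nodes))):
                    futures[pool.submit(query_node, node, query)] = pid
            try:
                for fut in cf.as_completed(futures, timeout=timeout_s):
                    pid = futures[fut]
                    if pid not in answered:        # first response per partition wins
                        answered.add(pid)
                        results.extend(fut.result())
                    if len(answered) == len(PARTITIONS):
                        break
            except cf.TimeoutError:
                pass                               # return whatever arrived before the deadline
        return results

    if __name__ == "__main__":
        print(search("community cloud video"))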


Figure 5.3: Guifi.net zones partition graph

We will make our decision about whether to build a CLOMMUNITY-supported search service or a decentralized search service based on the success of our NAT traversal service. If the NAT traversal service is reliable and does not introduce excessive control traffic, we will adopt the decentralized architecture. Otherwise, we will build a CLOMMUNITY-supported search service.

5.3 Local service allocation

Service allocation protocols are a key design challenge in cloud systems, with considerable impact on resource usage optimization. They are especially important when cloud services require a Quality of Service (QoS) and network stability or performance (delay, jitter, minimum bandwidth) cannot be guaranteed beforehand. Our work explores a service allocation algorithm that minimizes the coordination cost between CLOMMUNITY super nodes and regular nodes, assuming that such cost can be attributed to the overlay topological configuration.

Our work was performed considering Guifi.net, one of the largest community networks and the one where CLOMMUNITY is deployed. The network consists of a number of different devices, mostly interconnected by wireless links. These nodes and links are organized under a set of mutually exclusive and abstract structures called administrative zones, which represent the geographic areas where nodes are deployed. Figure 5.3 depicts the resulting zones partition graph of the Guifi.net graph. The size of each vertex is proportional to the number of Guifi.net locations contained in that zone, while the thickness of the edges stands for the number of physical links between nodes of the two connected zones.

For our research work we assumed that the coordination of CLOMMUNITY services will be managed locally in each administrative zone by a single administrative node. The coordination between these super nodes will be done over the wireless community network, which means that its cost will be directly proportional to the number of zones traversed.


Figure 5.4: Obtained service diameter (max, min and average) and number of aggregated zones (average) for different optimal service overlay orders

5.3.1 Experimental evaluation

To evaluate the potential interest of our local service coordination, we employed a topological snapshot of the Guifi.net community network taken in April 2013. We assumed a central coordination entity that has knowledge about the WCN (Wireless Community Network) topology in real time, which is feasible due to the Guifi.net monitoring aggregation system. Then, we used an exploratory algorithm to find all the optimal and sub-optimal service overlay placements in the Guifi.net core network, namely those of minimum sub-network diameter that traverse the fewest zones in the Guifi.net network backbone representation.

Details on our research can be found in [28], along with a complete description of the network model, the developed algorithms and the topological analysis of the overlay solutions – which also evaluates the social centrality parameters. In the following we present the most important results concerning the optimal solutions found.

Figure 5.4 shows that it is always possible to find an optimal service allocation with an overlay diameter of 2 hops within Guifi.net. Additionally, the average solution diameter increases with the number of nodes that compose the service. It is also interesting to discover that the average number of aggregated zones considered to find the optimal allocation does not depend on the service size. Both behaviours indicate the existence of nodes with very high degree values, either within the same zone or between several zones in the network topology.

From the point of view of users, however, we were more interested in finding out how many nodes in the network were at one hop from any service replica. In the CLOMMUNITY scenario, this is an important factor because almost all services that users intend to deploy are meant to be used. Figure 5.5 shows the cumulative distribution function (CDF) of the number of nodes in the network base-graph that are at a single hop from any of the nodes in the optimal service overlay.

As we expected, the more zones are taken into account for resource selection, the more nodes are at one hop from any of the allocated replicas. This result is very reasonable because the selected nodes are spread over geographically larger network areas and so are closer to many other nodes in the whole network.


Figure 5.5: CDF of base-graph nodes at 1 hop from optimal solution components

For instance, around 50% of the service allocations obtained considering only one administrative zone had about 100 nodes or fewer at one hop.
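As a purely illustrative counterpart of the exploratory algorithm (the actual algorithms and pruning are described in [28]), the following brute-force sketch ranks candidate overlays of a given order by overlay diameter and by the number of administrative zones they span, assuming a networkx graph whose nodes are annotated with their zone:

    import itertools
    import networkx as nx

    def best_overlays(graph, order):
        """Enumerate connected overlays of `order` nodes and keep those that minimise
        (overlay diameter, number of zones spanned). Brute force, for illustration only."""
        best, best_key = [], None
        for nodes in itertools.combinations(graph.nodes, order):
            sub = graph.subgraph(nodes)
            if not nx.is_connected(sub):
                continue
            key = (nx.diameter(sub), len({graph.nodes[n]["zone"] for n in nodes}))
            if best_key is None or key < best_key:
                best, best_key = [nodes], key
            elif key == best_key:
                best.append(nodes)
        return best_key, best

    if __name__ == "__main__":
        # Toy backbone graph; each node carries the administrative zone it belongs to.
        g = nx.Graph()
        g.add_nodes_from([(1, {"zone": "A"}), (2, {"zone": "A"}), (3, {"zone": "B"}),
                          (4, {"zone": "B"}), (5, {"zone": "C"})])
        g.add_edges_from([(1, 2), (2, 3), (3, 4), (2, 5), (4, 5)])
        print(best_overlays(g, order=3))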

5.3.2 Lessons learnt

The experimental evaluation provided us with valuable information about how to use geographically-based overlays to coordinate local service allocation. The lessons learnt from the experiments can be summarized as:

• Coordination cost. In our experimental model, the coordination cost turned out to be almost fixed, at 2 to 3 zone traversals, for optimal solutions. However, the number of zones over which a given service is deployed has a huge impact on the number of clients that will have fast access to the service. Hence, new algorithms are needed that find the trade-off between the coordination cost and the service availability or reachability in the network.

• Solution components and properties. A topological pattern has also been observed in the features of the nodes that compose the optimal allocations (see [28]). They reveal that the nodes' minimum degree centrality can be used to select the first node that composes each service overlay, without resorting to other selection criteria that are more complex to compute. However, finding optimal overlays for multiple services requires selecting the nodes in a particular range of closeness and betweenness centrality.

5.4 Tahoe-LAFS

Community networks are successful large-scale, decentralized IP networks, built and operated by citizens for citizens. The number of applications deployed within community networks, however, is surprisingly low, making community networks miss an opportunity to provide important additional value to society. The cloud computing infrastructures present in today's Internet hardly exist in community networks.


But the demand for cloud storage, and in particular secure cloud storage, is also increasing from within the communities. Hence, distributed storage solutions are becoming important components for community networks.

Tahoe-LAFS is a decentralized storage system with provider-independent security that guarantees privacy to the users. Experimental work on Tahoe-LAFS reported by the scientific community has only tested Tahoe-LAFS in settings very different from clouds in community networks. In terms of providing cloud storage services with Tahoe-LAFS in WAN settings, Chen's paper [29] is the most relevant to our work regarding the study of this application. The authors deployed Tahoe-LAFS, QFS¹, and Swift² in a multi-site environment and measured the impact of WAN characteristics on these storage systems. However, the authors deployed their experiments on a multi-site data center with very different characteristics to our scenario. Our approach is to assess Tahoe-LAFS in the real context of community networks (with heterogeneous and less powerful machines).

We deployed Tahoe-LAFS in the community cloud and evaluated how the Tahoe-LAFS storage system performs when it is deployed over a WAN setting on nodes in a community network. With experiments in community networks, we characterized, for different file sizes, the upload and download times of Tahoe-LAFS in relation to the bandwidth and latency between the client and the storage nodes. While we observed higher upload times and faster download times, Tahoe-LAFS performed correctly in the challenging environment of the community network. Our results suggest Tahoe-LAFS as a promising application for privacy-preserving, secure and fault-tolerant storage in community networks.
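The measurement itself is straightforward. As an illustration only (this is not the harness used for the D4.3 experiments; it assumes a running Tahoe-LAFS client gateway and the tahoe command-line tool on the PATH), upload and download times for a given file size can be obtained as follows:

    import os, subprocess, tempfile, time

    def tahoe_roundtrip(size_mb):
        """Upload a random file with `tahoe put`, fetch it back with `tahoe get`,
        and return (upload_seconds, download_seconds)."""
        with tempfile.NamedTemporaryFile(delete=False) as f:
            f.write(os.urandom(size_mb * 1024 * 1024))
            src = f.name
        try:
            t0 = time.time()
            cap = subprocess.run(["tahoe", "put", src], check=True,
                                 capture_output=True, text=True).stdout.strip()
            t_up = time.time() - t0

            t0 = time.time()
            subprocess.run(["tahoe", "get", cap, src + ".out"], check=True)
            t_down = time.time() - t0
            return t_up, t_down
        finally:
            for path in (src, src + ".out"):
                if os.path.exists(path):
                    os.remove(path)

    if __name__ == "__main__":
        for size in (1, 10, 50):                      # example file sizes in MB
            up, down = tahoe_roundtrip(size)
            print("%d MB: upload %.1f s, download %.1f s" % (size, up, down))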

The experiments we performed with Tahoe-LAFS in community clouds are documented in deliverableD4.3.

5.5 XtreemFS

XtreemFS is an open-source object-based distributed file system for grid and cloud infrastructures. The file system replicates objects for fault tolerance and caches data and metadata to improve performance over high-latency links. As an object-based file system, XtreemFS stores the directory tree on the Metadata and Replica Catalog (MRC) and file content on Object Storage Devices (OSDs). The MRC uses an LSM-tree-based database which can handle volumes that are larger than the main memory. OSDs can be added to the system as needed without any data re-balancing; empty OSDs are automatically used for newly created files and replicas. In addition to regular file replication, XtreemFS provides read-only replication.

Related work on XtreemFS includes [30], which evaluates XtreemFS, Ceph, GlusterFS and SheepDog when they are used as virtual disk image stores in a large-scale virtual machine hosting environment. They use the StarBED³ testbed for their experiments, with very powerful machines. The paper of Talyansky [31] evaluates the performance of XtreemFS under the IO load produced by enterprise applications. They suggest that XtreemFS has a good potential to support transactional IO load in distributed environments, demonstrating good performance of read operations and scalability in general. The goal of our experiments is, with XtreemFS in a real deployment within a community network, to understand its performance and operational feasibility under real network conditions.

Experiments with XtreemFS in community clouds are reported in deliverable D4.3.

¹ http://quantcast.github.io/qfs/
² http://docs.openstack.org/developer/swift/
³ http://www.starbed.org


5.6 Evaluation of OpenStack Swift for Community Cloud Storage

In order to provide an Infrastructure as a Service (IaaS) Cloud in a community network and enable Cloud-based services and applications, a proper backend storage system is needed for various purposes, such as maintaining user information, storing Virtual Machine (VM) images, as well as handling intermediate experiment results. The first approach towards such a storage system is to focus on and evaluate existing open-source storage systems, such as OpenStack Swift, which provides highly available and scalable storage for OpenStack-based Clouds. Swift is typically used as a backend storage system operating in a data-center Cloud. In order to examine the feasibility of using Swift in a community Cloud, we have conducted an evaluation of OpenStack Swift in a simulated environment with properties of a community network – unreliable and heterogeneous network interconnection with relatively high latency and low bandwidth. In order to tolerate possible node and link failures in a community network, we have developed and tested a self-healing mechanism for OpenStack Swift. More details on the evaluation of Swift can be found in [32].

5.6.1 Evaluation Methodology

The three main differences between a data-center Cloud environment and a community Cloud are as follows. The first difference is in the type of computing resources: powerful dedicated servers in data-center Clouds vs. heterogeneous computing components in a community Cloud. The second difference is in the network features: exclusive high-speed networks vs. shared ISP broadband and wireless connections. The third difference is in the maintenance effort: regular vs. self-organized. We conduct our evaluation of Swift storage in a community network along these differences. Specifically, considering the difference in computing resources, we evaluate the minimum CPU and memory requirements of a Swift proxy server to avoid bottlenecks and achieve efficient resource usage. Then, given that the proxy servers and the storage servers are not the bottleneck, we quantify the influence of network latencies and bandwidth on the performance of a Swift cluster, in line with the second difference. Lastly, a self-healing mechanism is designed and evaluated in case of failures, to compensate for the maintenance effort available in a data-center environment.

5.6.2 Evaluation Setup

In order to simulate a community Cloud environment, we have constructed a private Cloud and tuned some of its parameters according to the aspects that we want to evaluate. Specifically, we have configured OpenStack Compute (Nova) and the Dashboard (Horizon) on a number of interconnected servers. On top of the OpenStack Cloud platform, we have configured a Swift cluster with different VM flavors and network connections for our experiments. The Swift cluster is deployed with a ratio of 1 proxy server to 8 storage servers. According to the OpenStack Swift documentation, this proxy-to-storage-server ratio achieves efficient usage of the proxy and storage CPU and memory under the speed bounds of the network and disk. Under the assumption of a uniform workload, the storage servers are equally loaded.

5.6.3 Swift Throughput versus Network Latencies

In the first series of experiments, we have evaluated Swift's performance under specific network latencies. The network latencies are introduced on the network interfaces of the proxy servers and the storage servers separately.


Figure 5.6: Read Performance of Swift under Network Latencies

Figure 5.7: Write Performance of Swift under Network Latencies

In particular, for read accesses, we have introduced the latencies on the outgoing links, whereas for write accesses, latencies are introduced on the incoming links. Latencies are introduced in a uniform random fashion within a short window, with average values from 10 ms to 400 ms. Above 400 ms latency, the Swift cluster might become unavailable because of request timeouts. Figure 5.6 and Figure 5.7 demonstrate the influence of network latencies on the read and write performance of a Swift cluster. In both figures, the x-axis presents the average latencies introduced on either the proxy or the storage servers' network interfaces. The y-axis shows the corresponding performance, quantified as system throughput in op/s. The experiment for each latency configuration lasts for 10 minutes. Data are collected every 10 seconds from the performance feedback reported by YCSB. The plot shows the mean (the bar) and standard deviation (the error line) of the results for each latency configuration.
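The deliverable does not prescribe a particular tool for injecting these latencies. One common way to do it on Linux, shown here purely as an illustration (the interface name eth0 and the use of tc/netem are assumptions), is to attach a netem queueing discipline with a uniformly distributed delay:

    import subprocess

    def set_latency(iface, mean_ms, jitter_ms):
        """Add a uniformly distributed delay (mean +/- jitter) to traffic leaving iface,
        using Linux tc/netem (requires root). netem shapes egress; delaying ingress
        traffic needs an intermediate device such as ifb."""
        subprocess.run(["tc", "qdisc", "replace", "dev", iface, "root", "netem",
                        "delay", "%dms" % mean_ms, "%dms" % jitter_ms], check=True)

    def clear_latency(iface):
        subprocess.run(["tc", "qdisc", "del", "dev", iface, "root"], check=True)

    if __name__ == "__main__":
        for mean in (10, 50, 100, 200, 400):   # sweep the average latency, in ms
            set_latency("eth0", mean, jitter_ms=5)
            # ... run the YCSB workload against the Swift cluster for 10 minutes here ...
            clear_latency("eth0")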

Figure 5.6 and Figure 5.7 share similar patterns, which indicates that the network latencies on either the storage servers or the proxy servers bound the read and write performance of the cluster. Furthermore, both figures show that network latencies on the proxy servers result in larger performance degradation. The throughput of Swift becomes more stable (shown by the standard deviation), although it decreases, as the network latency increases. The decreasing throughput causes less network congestion in the system and results in more stable performance.


Figure 5.8: Control Flow of the Self-healing Algorithm

5.6.4 The Self-Healing of Swift in a Community Network

Since a community Cloud is built upon a community network, which is less stable than a data-center environment, systems operating in a community Cloud environment have to cope with more frequent node departures and network failures. Thus, we have developed a self-healing mechanism in Swift to operate in a community Cloud environment.

We design a MAPE (Monitor, Analysis, Plan, and Execute) control loop for the self-healing algorithm in Swift. The control cycle illustrated in Figure 5.8 is implemented on a control server, which can access the local network of the Swift cluster. The algorithm goes through the MAPE phases as shown in Figure 5.8.

In Figure 5.9, we illustrate the result of the self-healing process. The horizontal axis shows the experiment timeline. The blue points along the vertical axis present the cluster's health by showing the data integrity information obtained by a random sampling process, where the sample size is 1% of the whole namespace. The total number of files stored in our Swift cluster is around 5000. In order to simulate storage server failures, we randomly shut down a number of the storage servers. The decrease of data integrity observed in Figure 5.9 is caused by shutting down 1 to 4 storage servers. The red points show the control latency introduced by the threshold value, which is two in our case, set for the failure detector of the control system to confirm a server failure. When we fail more than 3 (the replication degree) servers, which may contain all the replicas of some data, the final state of the self-healing cannot reach 100% data integrity because of data loss.

The choice of the threshold value for our failure detector provides the flexibility to make trade-offs. Specifically, a larger threshold value may delay the detection of server failures, but with higher confidence. On the other hand, a smaller threshold value makes the control system react to server failures faster. However, committing to a server failure comes with a potential cost: the rebalancing of data onto the substitute servers. Thus, with higher failure-detection confidence, the data rebalancing cost is minimized, but the system reaction time may be slower. Furthermore, the cost of data rebalancing is proportional to the data stored in the failed server. Thus, it is a good strategy to set a larger threshold value when there is a large amount of data stored in the storage servers.
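The control cycle of Figure 5.8 can be summarised in a short sketch. The function names, the probing mechanism and the spare-pool handling below are illustrative assumptions, not the actual controller; the sketch only shows how the Monitor, Analysis, Plan and Execute phases and the failure-confirmation threshold fit together:

    import time

    FAILURE_THRESHOLD = 2          # consecutive missed probes before a failure is confirmed

    def probe(server):
        """Placeholder: return True if the storage server answers a health check."""
        raise NotImplementedError

    def mape_loop(servers, spare_pool, rebuild_ring, period_s=10):
        """Illustrative MAPE cycle for Swift self-healing (not the project controller)."""
        suspect_counts = {s: 0 for s in servers}
        while True:
            # Monitor: probe every storage server in the cluster.
            down = {s for s in servers if not probe(s)}

            # Analysis: confirm a failure only after FAILURE_THRESHOLD consecutive misses,
            # trading detection latency against unnecessary data rebalancing.
            confirmed = []
            for s in servers:
                suspect_counts[s] = suspect_counts.get(s, 0) + 1 if s in down else 0
                if suspect_counts[s] >= FAILURE_THRESHOLD:
                    confirmed.append(s)

            # Plan: pick a substitute from the spare pool for each confirmed failure.
            plan = {failed: spare_pool.pop() for failed in confirmed if spare_pool}

            # Execute: update the membership and rebuild the Swift ring so that
            # replicas are recreated on the substitute servers.
            for failed, substitute in plan.items():
                servers.remove(failed)
                servers.append(substitute)
                suspect_counts.pop(failed, None)
            if plan:
                rebuild_ring(servers)

            time.sleep(period_s)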

5.6.5 Conclusion

In this work, we evaluated the performance of OpenStack Swift in a typical community Cloud setup. The evaluation of Swift was conducted in a simulated environment, using the most essential environment parameters that distinguish a community Cloud environment from a data-center Cloud environment. We have conducted an evaluation of Swift regarding its bottom-line hardware requirements and its sensitivity to network latencies and insufficient bandwidth. Furthermore, in order to tackle possible server and link failures in the community Cloud, a self-healing control system was implemented and validated. Our evaluation results have established the relationship between the performance of a Swift cluster and the major environmental factors in a community Cloud, including the proxy server hardware and the network characteristics.


Figure 5.9: Self-healing Validation under the Control System


6 Conclusions and Outlook

There is an evident need today to offer cloud services to existing community networks and thus make use of their unutilised computational and storage resources. An architecture design is needed that takes into consideration the nature of such networks. We offered a description of solution scenarios and proposed the high- and low-level requirements of a community cloud based on existing community network semantics. The proposed design is flexible and allows federations of heterogeneous clouds with self-management capabilities. By considering node role properties, there is a potential to offer sustainable bottom-up cloud operability and distributed virtualized management of resources that can also be adaptive to changes.

For the high-level requirements, we considered user-related concepts such as the need of incentives for contribution, security, and ease of use for adaptability. In terms of low-level requirements, we further proposed a layer stack, where each layer addresses specific issues such as virtualisation support, discovery and self-management, resource allocation, and member contribution tracking. Incentives can be offered to users to encourage contributions that improve the community network. A Support Ticketing System could be used that rewards contributions with virtual credits, which further translate into cloud service benefits. Finally, the overall approach includes off-the-shelf cloud management platforms, potentially federated with other cloud organizations, and further extends them to provide higher-level services to the members of community networks.

Research in this first reporting period was conducted in parallel with the development and testbed deployment tasks of the project. Initial research work applied experimental methodologies that involved simulations or using the existing Confine testbed, while CLOMMUNITY's own infrastructure was built. Along the first reporting period, CLOMMUNITY started deploying additional infrastructure in the Guifi community network and, extending it to the project partners, enabled more community cloud experiments in Confine. Research in the second half of the first reporting period could more and more combine our own experimental infrastructure, and showed the possibility to deploy experiments on a very heterogeneous distributed cloud infrastructure.

Research in the second reporting period will benefit from the efforts of WP4 towards attracting real users. Among the options to motivate real users, we have already experimented with attractive end-user application front-ends such as ownCloud, which can use Tahoe-LAFS as backend storage system. As end users become involved, we will be able to extend our experiments with data coming from real usage.


Bibliography

[1] Umit C. Buyuksahin, Amin M. Khan, and Felix Freitag, “Support Service for Reciprocal Computational Resource Sharing in Wireless Community Networks,” in 5th International Workshop on Hot Topics in Mesh Networking (HotMESH 2013), within IEEE WoWMoM, Madrid, Spain, June 2013.

[2] Amin M. Khan, Umit C. Buyuksahin, and Felix Freitag, “Towards Incentive-Based Resource Assignment and Regulation in Clouds for Community Networks,” in Economics of Grids, Clouds, Systems, and Services, Jorn Altmann, Kurt Vanmechelen, and Omer F. Rana, Eds., vol. 8193 of Lecture Notes in Computer Science, pp. 197–211. Springer International Publishing, Zaragoza, Spain, Sept. 2013.

[3] Amin M. Khan, Umit C. Buyuksahin, and Felix Freitag, “Prototyping Incentive-based Resource Assignment for Clouds in Community Networks,” in 28th International Conference on Advanced Information Networking and Applications (AINA 2014), Victoria, Canada, May 2014, IEEE.

[4] Amin M. Khan and Felix Freitag, “Exploring the Role of Macroeconomic Mechanisms in Voluntary Resource Provisioning in Community Network Clouds,” in 11th International Symposium on Distributed Computing and Artificial Intelligence (DCAI ’14), Salamanca, Spain, June 2014, Springer.

[5] Amin M. Khan, Leila Sharifi, Luís Veiga, and Leandro Navarro, “Clouds of Small Things: Provisioning Infrastructure-as-a-Service from within Community Networks,” in 2nd International Workshop on Community Networks and Bottom-up-Broadband (CNBuB 2013), within IEEE WiMob, Lyon, France, Oct. 2013.

[6] Navaneeth Rameshan, Leandro Navarro, and Ioanna Tsalouchidou, “A monitoring system for community-lab,” in Proceedings of the 11th ACM International Symposium on Mobility Management and Wireless Access (MobiWac ’13), New York, NY, USA, 2013, pp. 33–36, ACM.

[7] Rafael Moreno-Vozmediano, Ruben S. Montero, and Ignacio M. Llorente, “IaaS Cloud Architecture: From Virtualized Datacenters to Federated Cloud Infrastructures,” Computer, vol. 45, no. 12, pp. 65–72, Dec. 2012.

[8] Peter Mell and Timothy Grance, “The NIST Definition of Cloud Computing,” NIST Special Publication, vol. 800, no. 145, 2011.

[9] Alexandros Marinos and Gerard Briscoe, “Community Cloud Computing,” in Cloud Computing, Martin Gilje Jaatun, Gansen Zhao, and Chunming Rong, Eds., vol. 5931 of Lecture Notes in Computer Science, pp. 472–484. Springer Berlin Heidelberg, Beijing, China, Dec. 2009.

[10] R. H. Coase, “The Nature of the Firm,” Economica, vol. 4, no. 16, pp. 386–405, Nov. 1937.

[11] Javi Jimenez, Roger Baig, Felix Freitag, Leandro Navarro, and Pau Escrich, “Deploying PaaS for Accelerating Cloud Uptake in the Guifi.net Community Network,” in International Workshop on the Future of PaaS 2014, within IEEE IC2E, Boston, Massachusetts, USA, Mar. 2014, IEEE.


[12] Magdalena Punceva, Ivan Rodero, Manish Parashar, Omer F. Rana, and Ioan Petri, “Incentivising resource sharing in social clouds,” Concurrency and Computation: Practice and Experience, Mar. 2013.

[13] James S. Coleman, “Social capital in the creation of human capital,” American Journal of Sociology, vol. 94, pp. S95–S120, 1988.

[14] Kyle Chard, Kris Bubendorfer, Simon Caton, and Omer F. Rana, “Social Cloud Computing: A Vision for Socially Motivated Resource Sharing,” IEEE Transactions on Services Computing, vol. 5, no. 4, pp. 551–563, Jan. 2012.

[15] Davide Vega, Llorenc Cerda-Alabern, Leandro Navarro, and Roc Meseguer, “Topology patterns of a community network: Guifi.net,” in 1st International Workshop on Community Networks and Bottom-up-Broadband (CNBuB 2012), within IEEE WiMob, Barcelona, Spain, Oct. 2012, pp. 612–619.

[16] Akihiro Nakao and Yufeng Wang, “On Cooperative and Efficient Overlay Network Evolution Based on a Group Selection Pattern,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 40, no. 2, pp. 493–504, Apr. 2010.

[17] Rodrigo N. Calheiros, Rajiv Ranjan, Anton Beloglazov, Cesar A. F. De Rose, and Rajkumar Buyya, “CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms,” Software: Practice and Experience, vol. 41, no. 1, pp. 23–50, Jan. 2011.

[18] Marc Bux and U. Leser, “DynamicCloudSim: Simulating Heterogeneity in Computational Clouds,” in International Workshop on Scalable Workflow Enactment Engines and Technologies (SWEET ’13), within ACM SIGMOD, New York, USA, June 2013.

[19] R. Rahman, M. Meulpolder, D. Hales, J. Pouwelse, D. Epema, and H. Sips, “Improving Efficiency and Fairness in P2P Systems with Effort-Based Incentives,” in 2010 IEEE International Conference on Communications, pp. 1–5, May 2010.

[20] Davide Vega, Roc Messeguer, Sergio F. Ochoa, and Felix Freitag, “Sharing hardware resources in heterogeneous computer-supported collaboration scenarios,” Integrated Computer-Aided Engineering, vol. 20, no. 1, pp. 59–77, 2013.

[21] Ahmad Al-Shishtawy and Vladimir Vlassov, “ElastMan: Autonomic elasticity manager for cloud-based key-value stores,” in Proceedings of the 22nd International Symposium on High-performance Parallel and Distributed Computing (HPDC ’13), New York, NY, USA, 2013, pp. 115–116, ACM.

[22] Ahmad Al-Shishtawy and Vladimir Vlassov, “ElastMan: Elasticity manager for elastic key-value stores in the cloud,” in Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference (CAC ’13), New York, NY, USA, 2013, pp. 7:1–7:10, ACM.

[23] Albert Greenberg, Parantap Lahiri, David A. Maltz, Parveen Patel, and Sudipta Sengupta, “Towards a next generation data center architecture: scalability and commoditization,” in Proceedings of the ACM Workshop on Programmable Routers for Extensible Services of Tomorrow (PRESTO ’08), New York, NY, USA, 2008, pp. 57–62, ACM.

[24] Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, Parantap Lahiri, David A. Maltz, Parveen Patel, and Sudipta Sengupta, “VL2: a scalable and flexible data center network,” in Proceedings of the ACM SIGCOMM 2009 Conference on Data Communication (SIGCOMM ’09), New York, NY, USA, 2009, pp. 51–62, ACM.

[25] Edmund B. Nightingale, Jeremy Elson, Jinliang Fan, Owen Hofmann, Jon Howell, and Yutaka Suzue, “Flat Datacenter Storage,” in Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, pp. 1–15, 2012.

[26] I. Stoica, R. Morris, et al., “Chord: a scalable peer-to-peer lookup protocol for internet applications,” IEEE/ACM Transactions on Networking, vol. 11, no. 1, pp. 17–32, Feb. 2003.

[27] Basho Technologies, Inc., “Riak.”

[28] Davide Vega, Guillem Cabrera, Roc Meseguer, and Juan Marques, “Exploring local service allocation in community networks,” Tech. Rep. UPC-DAC-RR-XCSD-2014-1, Universitat Politecnica de Catalunya – BarcelonaTech, 2013.

[29] Yih-Farn Chen, Scott Daniels, Marios Hadjieleftheriou, Pingkai Liu, Chao Tian, and Vinay Vaishampayan, “Distributed storage evaluation on a three-wide inter-data center deployment,” in 2013 IEEE International Conference on Big Data, 2013, pp. 17–22.

[30] Keiichi Shima and Nam Dang, “Indexes for Distributed File/Storage Systems as a Large Scale Virtual Machine Disk Image Storage in a Wide Area Network.”

[31] Roman Talyansky, Adolf Hohl, Bernd Scheuermann, Bjorn Kolbeck, and Erich Focht, “Towards transactional load over XtreemFS,” CoRR, vol. abs/1001.2931, 2010.

[32] Y. Liu, V. Vlassov, and L. Navarro, “Towards a community cloud storage,” in 28th IEEE International Conference on Advanced Information Networking and Applications (AINA 2014), IEEE, 2014.


Licence

The CLOMMUNITY project, April 2014, CLOMMUNITY-201404-D3.2:

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 Unported License.
