[ieee 2014 twelfth annual conference on privacy, security and trust (pst) - toronto, on, canada...

Analyzing trustworthiness of Virtual Machines in Data-Intensive Cloud Computing

Dipen Contractor Department of Computer Engineering

NIT Surat, India 395007 [email protected]

Abstract- Data-intensive cloud computing offers an abstraction of high availability, usability, and efficiency while processing petabytes of data. As individual users do not have sole control over cloud resources, concerns arise regarding the trustworthiness of the resources used. In this paper, we propose a trustworthiness analysis framework that adapts software attestation mechanisms and heartbeat messages to evaluate the current status of virtual machines in cloud computing and thereby determine the trustworthiness of the resources.

Index Terms- Software Attestation, Data-Intensive Cloud Computing, Trustworthiness.

INTRODUCTION

Internet-based data-intensive services have recently become integral parts of our daily life: we stay in touch with friends via Facebook or Gmail, work on documents with Google Docs, and share our photos on Flickr or Picasa. All these services are built on top of massive and scalable cloud infrastructures [1]. Cloud computing is changing the ways we design, maintain and optimize large-scale data-intensive software systems. Recently, Amazon has started provisioning EC2-like instances offering a cluster of virtual machines to handle data-intensive applications [2].

In data-intensive cloud computing, service providers can lease a set of resources from cloud infrastructures to provide their software as services to multiple clients. Although virtualization ensures isolation among users to a certain extent, malicious attackers can leverage the shared hardware to launch attacks.

Customers relinquish control over their code, data, and computation when they move from a self-hosted environment to the cloud. According to Fujitsu [3], one of the top security concerns for cloud users is to verify the identity and integrity of applications running in their VMs. One solution resides in the remote attestation to identify the state of hardware and software. Remote attestation is the activity of making a claim about properties of an application by supplying evidence over a network [4]. Active research is being pursued on design of remote attestation mechanisms for VMs.

We argue that this lack of trustworthiness is a key challenge facing wider adoption of data-intensive cloud computing and is the chief reason that security remains the primary concern. We propose a trustworthiness analysis framework for cloud infrastructure in data-intensive computing. It relies on the Trusted Computing architecture [5] (for calculating cryptographic primitives) and heartbeat messages [6] (for communication purposes). Involvement of the TCG architecture increases the trustworthiness of the overall infrastructure, whereas the use of heartbeat messages ensures "freshness" of the information used for decision making.

The rest of the paper is organized as follows: In Section II, we discuss background and related work on software attestation and TCG. In Section III, we propose our framework for analyzing trustworthiness of virtual machines in data-intensive cloud infrastructure, with conclusion and references at the end.

978-1-4799-3503-1/14/$31.00 ©2014 IEEE

Dhiren Patel Department of Computer Engineering

NIT Surat, India 395007 [email protected]

BACKGROUND

Data Intensive computing refers to the computing of large scale data [7]. Such systems may include pure data-intensive systems or they may also contain data/compute-intensive systems. Trustworthiness is a measure of the integrity, ability, competence, and surety of an entity to provide a service. This can be established by evidence (assurance). E.g. having experience of good behavior in the past might make someone reasonably trustworthy for the future. The goal of attestation is to prove to a remote party that the operating system and application software are intact and trustworthy.

Software attestation and trusted computing environment

TCG has built an architecture for trusted computing on two elementary requirements:

1. The platform must keep a running snapshot of its execution environment to keep track of all ongoing activities.

2. The platform must be able to report this snapshot reliably over the network.

From low-level firmware to high-level applications, software is inherently malleable, and can thus betray the user. Therefore, mechanisms in hardware, a relatively less malleable medium, are necessary to protect against malicious software [8]. To this effect, TCG specifies a hardware module, the Trusted Platform Module (TPM) [9]. The TPM is a tamper-resistant piece of cryptographic hardware tied to the motherboard of a platform. It implements primitive cryptographic functions from which more complex features can be derived. The TPM is manufactured with a public/private key pair built into the hardware, called the endorsement key (EK). The EK (RSA 2048-bit) is unique to a particular TPM. Due to security and privacy concerns, we use another key called the AIK (Attestation Identity Key); it is generated by the TPM and certified by a trusted third party, typically called a Privacy-CA (Fig. 1).

[Figure: the TPM proves its originality to the Privacy CA by sending {K_EK, K_AIK} encrypted under PK_CA; the Privacy CA returns a certificate, signed with PR_CA, to the TPM.]

Fig. 1. Certificate from Privacy CA ((PK_CA, PR_CA) is the public, private key pair)

TCG specifies a Trusted Software Stack (TSS) [10] to operate on the TPM at various levels. The TSS consists of multiple layers, including the TPM device driver, the TSS Device Driver Library (TDDL), TSS Core Services (TCS), the TSS Service Provider (TSP) and cryptography services. The TSS is responsible for maintaining the Integrity Measurement Log and swapping encrypted keys in and out of the TPM's limited memory. Several implementations of various components of the TSS exist, such as openPTS [11], Trousers [12], IBM's Integrity Measurement Architecture [13], etc.

Remote attestation in a Non-Virtualized Environment

TCG works on the concept of "measure before loading": every executable is verified using its hash code before it is loaded. The platform is bootstrapped by the CRTM [14] (the Core Root of Trust for Measurement), which is trusted by default. The platform measures the BIOS (let its hash value be B) and the boot loader (let its hash value be A) to construct a chain of trust. This chain of trust is formed either statically at the platform's boot time or dynamically at any time during execution [15]. After the kernel is loaded, the chain is extended through every operating system component up to the applications and their configuration files. All the executables are measured into shielded locations called PCRs (Platform Configuration Registers) inside the TPM. Each TPM has multiple 20-byte PCRs used to record measurements related to the platform state, such as those of various software components on the platform. Measurement is implemented by the TPM_Extend instruction of the TPM, which replaces the PCR value with a hash of its original value concatenated with the hash value of the most recently loaded executable. For example, the 160-bit SHA-1 hash code after loading the application $X_1$ can be calculated as

$$PCR_{X_1} = \mathrm{SHA1}(\ldots \mathrm{SHA1}(\mathrm{SHA1}(B) \,\|\, A) \ldots \,\|\, X_1)$$
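The extend operation described above can be illustrated with a minimal Python sketch. It is not the TPM's actual implementation: the function names (`measure`, `extend`, `measure_chain`) are hypothetical, the register is assumed to start as all zeros, and the byte strings stand in for real component images.

```python
import hashlib

PCR_SIZE = 20  # bytes in a SHA-1 digest, matching the 20-byte PCRs described above


def measure(component: bytes) -> bytes:
    """Return the SHA-1 measurement (hash code) of a component image."""
    return hashlib.sha1(component).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new register value SHA1(old value || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()


def measure_chain(components) -> bytes:
    """Fold the extend operation over the components in load order."""
    pcr = bytes(PCR_SIZE)  # assumption: the register starts as all zeros
    for component in components:
        pcr = extend(pcr, measure(component))
    return pcr


if __name__ == "__main__":
    # Placeholder images; a real platform would hash the actual binaries.
    boot_order = [b"bios image", b"boot loader image", b"application X1"]
    print(measure_chain(boot_order).hex())
```

Because each step hashes the previous register value, changing any component or the load order yields a different final value, which is what makes the chain of trust tamper-evident.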

Because a PCR can only be reset by a system restart, the values measured into it cannot be reversed. Only a limited number of PCRs are available on chip. To overcome that limitation, an integrity measurement list (IML) is maintained at the application layer. The IML records the detailed list of the measurement values and the necessary meta-data for each executable, thus representing the integrity status of the entire platform. The software running on the platform can be identified by matching the hash values in the IML with reference data. This requires a list of reference integrity measurements (RIM) contained within a Reference Manifest Database [16]. These references are collected from the original source: the software and hardware manufacturers. Thus, using the hash codes, another node can obtain the platform's current status and take a decision about its trustworthiness. A more detailed description of trusted computing technologies for today's commodity platforms is provided by Parno et al. [17].
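As an illustration of how a remote node might use the IML, the following Python sketch replays the logged digests to check that the list is consistent with a reported register value, and then looks up each digest in a set of reference measurements. The data layout (a list of `(name, digest)` pairs and a plain set of known-good digests) and the function names are assumptions made for this example, not the TCG schema.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    """Return the SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def replay(iml) -> bytes:
    """Recompute the register value implied by the logged digests."""
    pcr = bytes(20)  # assumption: the register starts as all zeros
    for _name, digest in iml:
        pcr = sha1(pcr + digest)
    return pcr


def verify_iml(iml, reported_pcr: bytes, reference_digests: set):
    """Return (log_is_consistent, names of entries absent from the reference set)."""
    consistent = replay(iml) == reported_pcr
    unknown = [name for name, digest in iml if digest not in reference_digests]
    return consistent, unknown
```

A log that fails the replay check has been altered after measurement, while a consistent log with unknown entries indicates software that the reference database does not recognize.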

TCG for Virtualized environment

A hypervisor gives a single physical system (host server) the ability to distribute resources to one or more Virtual Machines at a given time. A Type 1 hypervisor, known as 'bare metal', runs directly on the hardware. A Type 2 hypervisor, known as 'hosted', runs as an application on an existing OS.

The vTPM is a virtual extension of the TPM that enables the hypervisor to emulate the features of the TPM by exporting to each VM a virtual implementation with the same interface as the TPM. As mentioned earlier, a TPM can generate and store multiple AIKs. These can be used to create vTPMs: each vTPM receives a certified AIK as its own EK and behaves like a TPM inside its VM.

Heartbeats in Cloud computing

A heartbeat is a message that provides a simple, standardized way for a cloud system to monitor performance and make that information available to external observers. Applications can use heartbeat information to automatically add or subtract resources from their pool. Fault tolerance can be handled by exchanging heartbeat messages between different components in a timely manner.
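The fault-tolerance use of heartbeats can be sketched as a monitor that records when each node was last heard from and reports nodes that have been silent for longer than a timeout. The class name, the timeout parameter, and the convention of passing the current time explicitly are assumptions chosen to keep the example deterministic; they do not describe Hadoop's heartbeat module.

```python
class HeartbeatMonitor:
    """Track the last heartbeat time per node and flag silent nodes."""

    def __init__(self, timeout: float):
        self.timeout = timeout  # seconds of silence tolerated before a node is stale
        self.last_seen = {}     # node_id -> time of the most recent heartbeat

    def beat(self, node_id: str, now: float) -> None:
        """Record a heartbeat received from node_id at time now."""
        self.last_seen[node_id] = now

    def stale_nodes(self, now: float):
        """Return the sorted ids of nodes whose last heartbeat is older than timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

In a deployment the caller would pass a clock reading such as `time.monotonic()` for `now`; the explicit argument simply makes the behaviour easy to test.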

Related work

The Verification-based Integrity Assurance Framework proposed in [18] is based on replication and quiz-related methods. It can distinguish malicious from normal task trackers in a Hadoop system with the help of a predefined set of questionnaires. These questionnaires are based on system performance. Matei Zaharia et al. [19] proposed an algorithm named Longest Approximate Time to End (LATE). LATE finds the slow tasks in a heterogeneous environment. It first estimates the remaining time for each task, then speculatively executes the tasks with the longest remaining time to end, to maintain the integrity of the system. The Trusted Cloud Computing Platform (TCCP) [20] was proposed for confidentiality and integrity of computations that are outsourced to IaaS services.

The TCCP provides the abstraction of a closed-box execution environment for a customer's VM, guaranteeing that no privileged administrator of the cloud provider can inspect or tamper with its content. IntegrityVM [21] tries to attest a complete VM and store the measurement in a PCR. This approach does not reveal the behavior of the applications running inside the VM, which makes it difficult to identify the trusted state of the VM. Cloud Verifier [22] demonstrates a service that generates integrity proofs. These proofs are used to verify the integrity and access control enforcement of the cloud platform.

TRUSTWORTHINESS ANALYSIS FRAMEWORK: OUR PROPOSAL

In this section, we discuss an abstract scenario, followed by a description of our proposed framework. We assume that the TPM and vTPM are attacker resistant.

[Figure: Client 1 to Client 3, running client applications C1, C2, and C3, connected to an HDFS-based cloud.]

Fig. 2. Data-Intensive Cloud abstract scenario

Abstract scenario

Cloud service providers offer functionality for data-intensive computing. In this situation, a cluster of virtual machines is allotted to a client. A cluster may already be configured as per the client's requirements and equipped with data-intensive services (e.g. HDFS [23] with MapReduce). In a typically deployed data-intensive infrastructure, one node acts as the master node. It receives requests for resources from clients. The master node assigns to each client VMs hosted on one or more physical infrastructures. Clients can access those VMs via various remote access protocols.

The goal of the attacker is to compromise a client's VM in order to gain access to sensitive information. Various methods to compromise a VM have been extensively studied in host-based intrusion detection [24]. In one of those methods, an attacker inserts malicious code into a genuine update of an application running inside a client VM. Once the update is performed, the application, and possibly the entire VM, is compromised. Our proposed framework takes the necessary steps to detect such malicious actions.

Terminologies and Framework

A Host is a physical machine or server and a Guest is a virtual machine (VM). The Master node is a VM that has higher hardware capabilities and acts as an auditor of the Slave VMs. Slave machines are those VMs that can be assigned to clients after completion of the registration process.

The trustworthy state is a state of a VM in which all the loaded modules are uniquely identified by their signatures (hash codes). In the trustworthy state, all running modules and binaries are unaltered and come from genuine providers. TCG provides a mechanism for measurement that no application can bypass. Depending on how much deviation from the trustworthy state a client is willing to tolerate, a threshold T can be selected. T is used to categorize a VM based on verification of its IML values. Let N be the length of the IML list. Let V(List_IML) be the number of values verified to be correct in the IML list List_IML. Then we define:

If V(List_IML) = N, then the VM is inserted in the whitelist

If V(List_IML) >= T and V(List_IML) < N, then the VM is inserted in the greylist

If V(List_IML) >= 0 and V(List_IML) < T, then the VM is inserted in the blacklist
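The three rules above translate directly into a small classification function. The following Python sketch is illustrative only; the function name `classify` and the argument validation are assumptions added for the example.

```python
def classify(verified: int, n: int, threshold: int) -> str:
    """Categorize a VM from V (verified), N (IML length) and the threshold T."""
    if not 0 <= verified <= n:
        raise ValueError("V must lie between 0 and N")
    if not 0 <= threshold <= n:
        raise ValueError("T must lie between 0 and N")
    if verified == n:
        return "whitelist"   # V = N: every IML value was verified
    if verified >= threshold:
        return "greylist"    # T <= V < N
    return "blacklist"       # 0 <= V < T
```

For example, with N = 10 and T = 7, a VM with 10 verified values is whitelisted, one with 8 is greylisted, and one with 3 is blacklisted.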

[Figure: layered architecture. Master node: portal and cloud management applications, Reference Manifest, and Heartbeat Module on a Guest OS with vTPM and virtual hardware. Slave node: application and Heartbeat Module on a Guest OS with vTPM and virtual hardware. Both run above the vTPM Manager and Hypervisor, which sit on the Host OS, TPM, and hardware.]

Fig. 3. Basic structure of our framework

Whitelisted VMs are trustworthy, whereas blacklisted VMs are untrustworthy. The fate of greylisted VMs resides in the hands of the data-intensive cloud infrastructure provider. The notation {Information}K_AIK denotes that the information is encrypted with the key AIK.

Our framework is designed for hosted-hypervisor-based cloud architectures and comprises two modules: the Collector Module and the Verifier Module. These modules communicate through Heartbeat messages.

Collector Module: This module continuously collects the cryptographic hash codes of applications before they are loaded. It also sends these values to the Verifier Module using Heartbeat messages.

Verifier Module: This module is responsible for calculating the aforementioned V(List_IML) and categorizing the VMs into the whitelist, greylist or blacklist accordingly. The incoming lists from the various slave VMs are verified against a standard reference manifest database [16]. The module is also involved in the slave node registration process (explained later).

Heartbeat messages are responsible for communication between the Collector and Verifier modules. The payload of a single message is extended by appending the IML list calculated at the Collector module. The standard working of heartbeat messages ensures that the "freshness" of the IML lists is maintained.

The workflow of our framework can be divided into a one-time, per-slave-node registration process (Fig. 4) and a continuous slave node integrity measurement process (Fig. 5).

[Figure: message sequence for registration between the slave node's vTPM, the slave node, and the master node, showing the steps 2. Registration, 3. Verify, and 5. Reply (Certificate(K_M), NodeID, Nonce).]

Fig. 4. Slave node registration process (+ = Heartbeat Messages)

Slave node registration process

This is a process that takes place once over the lifetime of a VM. The process is divided into 5 steps.

1. The slave node gathers credentials from its vTPM. The credentials take the form of the public key component of the EK encrypted with the private key component of the AIK.

2. The gathered credentials are then appended to the payload of a heartbeat message.

3. The master node certifies the public key components of both the EK and the AIK if they originated from a genuine vTPM.

4. A new NodeID is created for the slave.

5. The master sends its public key K_M (the public key component of the master node's AIK) along with a nonce. The nonce is used to prevent replay attacks. The message is encrypted with the public key component of the AIK received from the slave node's vTPM.
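The master's side of the registration steps can be sketched as follows. This is a simplified illustration under stated assumptions, not our implementation: certificate verification is replaced by a lookup of the AIK's fingerprint in a set of accepted keys, encryption of the reply is omitted, and the names `Master`, `register`, and `fingerprint` are hypothetical.

```python
import hashlib
import secrets
from dataclasses import dataclass, field


def fingerprint(public_key: bytes) -> str:
    """Return a SHA-256 fingerprint used here to identify a public key."""
    return hashlib.sha256(public_key).hexdigest()


@dataclass
class Master:
    master_public_key: bytes
    # Assumption: stand-in for verifying that the keys come from a genuine vTPM.
    genuine_vtpm_keys: set
    nodes: dict = field(default_factory=dict)

    def register(self, ek_pub: bytes, aik_pub: bytes):
        """Steps 3-5: verify the credentials, create a NodeID, and build the reply."""
        if fingerprint(aik_pub) not in self.genuine_vtpm_keys:
            return None  # credentials rejected, no NodeID is created
        node_id = len(self.nodes) + 1
        nonce = secrets.token_hex(16)  # fresh random value to prevent replay
        self.nodes[node_id] = {"ek": ek_pub, "aik": aik_pub, "nonce": nonce}
        # In the framework this reply would be encrypted with the slave's AIK public key.
        return {"master_key": self.master_public_key, "node_id": node_id, "nonce": nonce}
```

The nonce is stored with the node record so that a later message from the slave can be checked against it.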

Slave node Integrity Measurement process

This process occurs continuously over the runtime of the entire framework, as follows:

1. The Collector module measures the hash of an application that is about to run.

2. It then updates the VM's local encrypted IML using the AIK.

3. The slave node notifies the master node: it forms a heartbeat message carrying the updated IML values and its own ID, using the previously obtained credentials, and encrypts the message with the public key of the master node.

4. The master node extracts the IML value, and the Verifier searches for it in the reference database.

5. If the Verifier finds the value, the master node updates the IML values stored for the slave node's ID; if the Verifier cannot match the value, the slave node's IML is not updated. The master node also updates the slave node's trustworthiness status, and if it falls below the threshold T, the slave node is moved to another list.
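The master-side handling of steps 4 and 5 can be sketched in Python as below. Several choices are assumptions made for the example: decryption is omitted, a reused nonce is simply rejected, and an unmatched value is recorded as an unverified entry so that both N and V can be counted, which is one possible reading of step 5. The class and method names are hypothetical.

```python
class Verifier:
    """Check reported hashes against reference data and track each node's list."""

    def __init__(self, reference_digests, threshold: int):
        self.reference = set(reference_digests)  # known-good measurement values
        self.threshold = threshold               # the threshold T
        self.iml = {}                            # node_id -> [(name, digest, verified)]
        self.used_nonces = set()                 # nonces already seen

    def status(self, node_id) -> str:
        """Apply the whitelist/greylist/blacklist rules to a node's entries."""
        entries = self.iml.get(node_id, [])
        n = len(entries)
        v = sum(1 for _name, _digest, ok in entries if ok)
        if v == n:
            return "whitelist"
        if v >= self.threshold:
            return "greylist"
        return "blacklist"

    def on_heartbeat(self, node_id, name, digest, nonce) -> str:
        """Process one reported measurement and return the node's updated list."""
        if nonce in self.used_nonces:
            return "rejected"  # replayed message, state is left unchanged
        self.used_nonces.add(nonce)
        verified = digest in self.reference
        self.iml.setdefault(node_id, []).append((name, digest, verified))
        return self.status(node_id)
```

With T = 2, a node that reports one good value is whitelisted, drops to the blacklist after one unknown value (V = 1 < T), and moves to the greylist once a second good value arrives (V = 2, N = 3).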


[Figure: message sequence between Application i, Collector, Slave Node i, Master, Verifier, and Ref. DB: 1. Measure Hash; 2. Update IML ({Name, Hash}K_AIK); 3. Notify Master (IML Value, Node Id, Nonce); 4. Verify Hash and Check Hash, then Update IML of ID_i; 5. Update trustworthiness.]

Fig. 5. Slave node integrity measurement process (+ = Heartbeat Messages)

Continuous monitoring of the applications running inside a VM thus reveals the VM's current state.

Experiments

We have created a data-intensive Hadoop cluster using KVM (Kernel-based Virtual Machine). Apache Hadoop [23] enables scaling up processes from a single server to thousands of machines and comes with high-availability services that provide a Heartbeat module [25]. A Hadoop cluster includes a single master and multiple worker nodes. The master node consists of a JobTracker and a NameNode. A slave or worker node acts as both a DataNode and a TaskTracker.

[Figure: Host: IBM S·BladeSystem with Intel E5 2620 processor.]

Fig. 6. Experiment Setup

We have built 3 VMs to run the Hadoop environment. One of the VMs is the master (NameNode) and the other VMs are slaves (each containing a DataNode). Currently we are working with OpenPTS to build the Verifier and Collector modules and to accommodate a Heartbeat messages module for autonomous communication.

We assess the trustworthiness of the Hadoop services and record it in the IML. On that cluster, we run a data-intensive task to measure the trustworthiness of the actual virtual machines. We also measure the time taken by the Master node to perform different tasks, in order to compare relevant TCG-based attestation services.

CONCLUSION

With remote attestation and trusted heartbeats, a Master node can determine the exact status (working or malfunctioning) of its nodes. This framework demonstrates the capability of identifying genuine nodes using the TPM. It also shows how to utilize common messages to establish trustworthiness among all the components in an open distributed environment.

REFERENCES

[1] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and others, "A view of cloud computing," Communications of the ACM, vol. 53, no. 4, pp. 50-58, 2010.

[2] "Amazon Elastic MapReduce." [Online]. Available: http://aws.amazon.com/elasticmapreduce/.

[3] "Personal data in the cloud: the importance of trust." [Online]. Available: http://www.fujitsu.com/downloads/WWW2/news/publications/FSL-0016_A4_TrustReport_online-final.pdf

[4] G. Coker, J. Guttman, P. Loscocco, A. Herzog, J. Millen, B. O'Hanlon, J. Ramsdell, A. Segall, J. Sheehy, and B. Sniffen, "Principles of remote attestation," International Journal of Information Security, vol. 10, no. 2, pp. 63-81, Apr. 2011.

[5] "Trusted Computing Group." [Online]. Available: http://www.trustedcomputinggroup.com. [Accessed: 01-May-2013].

[6] J. Miller, A. Agarwal, M. Santambrogio, J. Eastep, and H. Hoffmann, "Application Heartbeats for Software Performance and Health," 2009.

[7] R. E. Bryant, "Data intensive scalable computing," Carnegie Mellon University Retrieved August, vol. 10, 2008.

[8] R. Shirey, "Internet Security Glossary, Version 2. RFC 4949 (Informational)." [Online]. Available: http://www.ietf.org/rfc/rfc4949.txt.

[9] V. Scarlata, C. Rozas, M. Wiseman, D. Grawrock, and C. Vishik, "TPM virtualization: Building a general framework," Trusted Computing, pp. 43-56, 2008.

[10] "TPM Software Stack (TSS) Specification, Version 1.2." [Online]. Available: http://www.trustedcomputinggroup.org/resources/tcg_software_stack_specification_tss_12_faq. [Accessed: 01-Jul-2013].

[11] "OpenPTS." [Online]. Available: http://sourceforge.jp/projects/openpts

[12] "Trusted grub." [Online]. Available: http://trousers.sourceforge.net/grub.html.

[13] "Linux IMA." [Online]. Available: http://sourceforge.net/apps/mediawiki/linux-ima/index.php.

[14] C. Shen, H. Zhang, D. Feng, Z. Cao, and J. Huang, "Survey of information security," Science in China Series F: Information Sciences, vol. 50, no. 3, pp. 273-298, Jun. 2007.

[15] Intel, "Intel ® Trusted Execution Technology Architectural Overview," 2003.

[16] "TCG: Reference Manifest Database Schema." [Online]. Available: http://www.trustedcomputinggroup.org/files/resource_files/73638609-1A4B-B294-D0902CC2A13C8697/Reference_Manifest_Schema_Specification_v2.0.r5.pdf

[17] B. Parno, "Bootstrapping trust in a 'trusted' platform," in Proceedings of the USENIX Workshop on Hot Topics in Security (HOTSEC), 2008, pp. 9:1-9:6.

[18] Y. Wang and J. Wei, "VIAF: Verification-based integrity assurance framework for MapReduce," Cloud Computing (CLOUD), 2011 IEEE, pp. 300-307, Jul. 2011.

[19] M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica, "Improving MapReduce Performance in Heterogeneous Environments," 2010.

[20] N. Santos, K. P. Gummadi, and R. Rodrigues, "Towards trusted cloud computing," in Proceedings of the 2009 conference on Hot topics in cloud computing, 2009, p. 3.

[21] A. Yu, Y. Qin, and D. Wang, "Obtaining the Integrity of Your Virtual Machine in the Cloud," 2011 IEEE Third International Conference on Cloud Computing Technology and Science, pp. 213-222, Nov. 2011.

[22] J. Schiffman, H. Vijayakumar, and T. Jaeger, "Verifying system integrity by proxy," in Trust and Trustworthy Computing, Springer, 2012, pp. 179-200.

[23] T. White, Hadoop: the definitive guide. O'Reilly, 2012.

[24] C. Modi, D. Patel, B. Borisaniya, H. Patel, A. Patel, and M. Rajarajan, "A survey of intrusion detection techniques in cloud," Journal of Network and Computer Applications, vol. 36, no. 1, pp. 42-57, 2013.

[25] F. Wang, J. Qiu, J. Yang, B. Dong, X. Li, and Y. Li, "Hadoop high availability through metadata replication," in Proceedings of the first international workshop on Cloud data management, 2009, pp. 37-44.
