CLOUD DATA PATTERNS FOR CONFIDENTIALITY

Steve Strauch 1, Uwe Breitenbuecher 1, Oliver Kopp 1, Frank Leymann 1, Tobias Unger 1,2

1 Institute of Architecture of Application Systems, University of Stuttgart, Germany, [email protected]
2 GRIDSOLUT GmbH + Co. KG, Stuttgart, Germany, [email protected]

This publication and its contributions were presented at CLOSER 2012.

CLOSER 2012 Web site: http://closer.scitevents.org

© 2012 SciTePress. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the SciTePress.

@inproceedings{Strauch2012,
  author = {Steve Strauch and Uwe Breitenbuecher and Oliver Kopp and Frank Leymann and Tobias Unger},
  title = {Cloud Data Patterns for Confidentiality},
  booktitle = {Proceedings of the 2nd International Conference on Cloud Computing and Service Science, CLOSER 2012, 18-21 April 2012, Porto, Portugal},
  year = {2012},
  pages = {387-394},
  publisher = {SciTePress}
}

Institute of Architecture of Application Systems



CLOUD DATA PATTERNS FOR CONFIDENTIALITY

Steve Strauch 1, Uwe Breitenbuecher 1, Oliver Kopp 1, Frank Leymann 1 and Tobias Unger 1,2

1 Institute of Architecture of Application Systems, University of Stuttgart, Stuttgart, Germany
2 GRIDSOLUT GmbH + Co. KG, Stuttgart, Germany
[email protected], [email protected]

Keywords: Patterns, Confidentiality, Database Layer, Migration, Distributed Application Architecture, Cloud Data Store.

Abstract: Cloud computing enables cost-effective, self-service, and elastic hosting of applications in the Cloud. Applications may be partially or completely moved to the Cloud. When hosting or moving the database layer to the Cloud, challenges such as avoidance of disclosure of critical data have to be faced. The main challenges are handling different levels of confidentiality and satisfying security and privacy requirements. We provide reusable solutions in the form of patterns.

1 INTRODUCTION

Cloud computing has started to be applied in industry due to the advantage of reducing capital expenditure and transforming it into operational costs (Armbrust et al., 2009). Therefore, applications are directly built using Cloud service technology or existing applications are migrated to the Cloud.

Figure 1: Overview of Cloud Deployment Models and Application Layers. [Figure not reproduced: it arranges the four application layers (presentation layer, business layer, data access layer, database layer) across traditional hosting and the deployment models private Cloud, community Cloud, hybrid Cloud, and public Cloud.]

Applications are typically built using a three-layer architecture model consisting of a presentation layer, a business logic layer, and a data layer (Dunkel et al., 2008). The data layer is responsible for data storage and is in turn subdivided into the data access layer (DAL) and the database layer (DBL). The DAL is an abstraction layer encapsulating the data access functionality. The DBL is responsible for data persistence and data manipulation. The subdivision of the data layer finally leads to a four-layer application architecture (Figure 1). To take advantage of Cloud computing, an application may be moved to the Cloud or designed from the beginning to use Cloud technologies. Figure 1 shows all possibilities for the distribution of an application consisting of four layers. The traditional application not using any Cloud technology is shown on the left. Each layer can be hosted using different Cloud deployment models. Available Cloud deployment models are: Private Cloud, Public Cloud, Community Cloud, and Hybrid Cloud (Mell and Grance, 2009). In this context, “on-premise” denotes that the Cloud infrastructure is hosted inside the company and “off-premise” denotes that it is hosted outside the company. Figure 1 will be reused for providing a sketch for each Cloud Data Pattern, illustrating the solution.

This paper deals with hosting or moving the database layer to the Cloud and the related challenges regarding data confidentiality. The contribution of the paper is the identification of these challenges and the description of Cloud Data Patterns to face them. A Cloud Data Pattern describes a reusable and implementation technology-independent solution for a challenge related to the data layer of an application in the Cloud for a specific context. Cloud Data Patterns address both the migration of a traditionally hosted data layer to the Cloud and enabling access to the data layer in the Cloud. “Traditionally” denotes not using any Cloud technology. Confidentiality Patterns provide solutions for avoiding disclosure of confidential data. Confidentiality includes security and privacy. In the context of confidentiality, we refer to the data that has to be kept private and secure as “critical data”. Examples of critical data are business secrets of companies, personal data, and health care data. Critical data can be categorized into different confidentiality levels such as “NATO Secret” or “NATO Confidential” as described in the security classification in the security regulations of the North Atlantic Treaty Organization (NATO) (Department of Defense Security Institute, 1993).

To provide a set of Cloud Data Patterns as reference work, the description of each pattern has to be self-contained. Thus, the concrete characteristics of the patterns presented in Section 2 overlap. Section 3 presents related work in the field of patterns. Finally, Section 4 concludes and presents an outlook on future work.

2 CONFIDENTIALITY PATTERNS

This section introduces five new Cloud Data Patterns for confidentiality. As the data is not always categorized or might be categorized using different confidentiality categorizations, we present two patterns for handling these challenges: The Confidentiality Level Data Aggregator (Section 2.1) provides one confidentiality level for data from different sources with potentially different confidentiality categorizations on different scales. The Confidentiality Level Data Splitter (Section 2.2) provides different confidentiality levels for data originally categorized into one confidentiality level. Both the aggregation and the splitting have to be configured, e. g., using parameterization or configuration files.

To protect the confidential data on the one hand and to benefit from Cloud computing on the other hand, we introduce three different patterns: filtering, pseudonymization, and anonymization. The Filter of Critical Data (Section 2.3) ensures that no confidential data is disclosed to the public. The Pseudonymizer of Critical Data (Section 2.4) implements pseudonymization. Pseudonymization is a technique to provide a masked version of the data to the public while keeping the relation to the non-masked data (Federal Ministry of Justice, 1990). That enables processing of non-masked data in the private environment. The Anonymizer of Critical Data (Section 2.5) implements anonymization. Anonymization is a technique to provide a reduced version of the critical data to the public such that it is impossible to relate the reduced version to the critical data. A discussion on the composition of Confidentiality Patterns is provided in Section 2.6.

2.1 Confidentiality Level Data Aggregator

Context. The data formerly stored in one traditionally hosted data store is separated according to the different confidentiality levels and stored in different locations. The business layer is separated into the traditionally hosted part processing the critical data and the business layer hosted in the public Cloud processing the non-critical data. As the application accesses data from several data sources, the different confidentiality levels of the data items have to be aggregated to one common confidentiality level. This builds the basis for avoiding disclosure of critical data by passing it to the public Cloud.

Challenge. How can data of different confidentiality levels from different data sources be aggregated to one common confidentiality level?

Forces. The data might be categorized into confidentiality levels from different scales. Thus, aggregation by normalizing scales is not always possible.

Solution. An aggregator is placed within the Cloud infrastructure of the Cloud data storage with the highest confidentiality level. The aggregator retrieves data from all Cloud data stores. In particular, it processes data with the highest confidentiality level, which may not be placed where data with a lower level resides. As a consequence, it has to be placed in a location where the demands of the highest confidentiality level are fulfilled.

Figure 2: Sketch of Confidentiality Level Data Aggregator. [Sketch not reproduced; legend: dataflow, migration, partial migration, modified component (*).]

Sidebars. The data stored in the different data stores might be configured with different security levels from different security domains. The aggregator outputs data from one security domain only. The aggregator has to be configured accordingly, and a mapping between different confidentiality levels from potentially different security or privacy domains has to be provided. Finally, a procedure for determining the final output level of the data combined by the aggregation has to be given. Usually, it is the highest confidentiality level. Both parts of the data access layer, the one hosted traditionally and the one hosted in the public Cloud, have to be reconfigured to access the database layer only through the Confidentiality Level Data Aggregator. It is assumed that the data stores themselves have appropriate access right configurations.

Results. For each request, several data stores are queried. The data is routed through the aggregator, which annotates the data with the appropriate confidentiality level valid for the aggregated data set.

Example. The database layer of an application built on Oracle Corporation MySQL (Oracle Corporation, 2011), version 5.1, is split into an Amazon Virtual Private Cloud (Amazon VPC) (Amazon.com, Inc., 2011b) data store hosting an Amazon Machine Image (AMI) and Amazon Relational Database Service (Amazon RDS) (Amazon.com, Inc., 2011a). The AMI runs a MySQL 5.1 relational database on OpenSolaris. Regarding Amazon RDS, a MySQL DB instance is chosen. Thus, the database functionality remains the same. The data stored in the Amazon VPC is annotated with “NATO Confidential” and the data stored in Amazon RDS is annotated with “NATO Restricted” (Department of Defense Security Institute, 1993). The logic implemented in the business layer is split into a part processing critical data and a part processing non-critical data. The data access layer is also split and configured to operate on the Confidentiality Level Data Aggregator. The aggregator fetches the data from the two different data sources. In case data is returned from the Amazon VPC, the result set is annotated with “NATO Confidential”; otherwise “NATO Restricted” is used. On its own, this annotation does not prevent the disclosure of data. Disclosure may be prevented, for instance, by combining the aggregator with the Filter of Critical Data pattern.
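The annotation step of the aggregator can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the numeric ranking in `COMMON_SCALE`, the second label domain ("internal", "strictly internal") in `LEVEL_MAPPING`, and the `Record` type are hypothetical. The output level is chosen as the highest mapped level, which the Sidebars name as the usual procedure.

```python
from dataclasses import dataclass

# Hypothetical common scale: a higher number means a more restrictive level.
COMMON_SCALE = {"NATO Restricted": 1, "NATO Confidential": 2, "NATO Secret": 3}

# Hypothetical mapping from source-specific labels to the common scale.
# The labels "internal" and "strictly internal" are invented for illustration.
LEVEL_MAPPING = {
    "NATO Restricted": "NATO Restricted",
    "NATO Confidential": "NATO Confidential",
    "NATO Secret": "NATO Secret",
    "internal": "NATO Restricted",
    "strictly internal": "NATO Confidential",
}


@dataclass
class Record:
    payload: dict
    level: str  # label as annotated in the source data store


def aggregate(result_sets):
    """Merge records from several stores and annotate the merged set
    with the highest mapped confidentiality level."""
    merged = []
    highest = None
    for records in result_sets:
        for record in records:
            # Raises KeyError if no mapping is configured for a label.
            mapped = LEVEL_MAPPING[record.level]
            if highest is None or COMMON_SCALE[mapped] > COMMON_SCALE[highest]:
                highest = mapped
            merged.append(record.payload)
    return {"level": highest, "data": merged}


# Mirrors the example: one row from the Amazon VPC store, one from Amazon RDS.
vpc_rows = [Record({"id": 1}, "NATO Confidential")]
rds_rows = [Record({"id": 2}, "NATO Restricted")]
result = aggregate([vpc_rows, rds_rows])
```

Failing on an unmapped label, rather than guessing a level, is a deliberate choice in this sketch: a missing mapping is a configuration error that should not silently downgrade data.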

Next. In case the stored data should be updated, the Confidentiality Level Data Splitter has to be considered.

2.2 Confidentiality Level Data Splitter

Context. The data formerly stored in one traditionally hosted data store is separated according to data stores with different confidentiality levels. As the application writes data to several data stores, the data has to be categorized and split according to the different confidentiality levels. This builds the basis for avoiding disclosure of critical data when storing it in the public Cloud.

Challenge. How can data of one common confidentiality level be categorized and split into separate data parts belonging to different confidentiality levels?

Forces. The data has to be annotated and split manually, in case the Confidentiality Level Data Splitter is not used.

Solution. A splitter is placed within the Cloud infrastructure of the data access layer. Thus, additional data movement, network traffic, and load can be minimized. The splitter writes data to all Cloud data stores. As the splitter processes data with the highest confidentiality level, it has to be placed in a location where the demands of the highest confidentiality level are fulfilled.

Figure 3: Sketch of Confidentiality Level Data Splitter. [Sketch not reproduced; legend: dataflow, migration, partial migration, modified component (*).]

Sidebars. The data to be stored in the different data stores might have to be separated and categorized into different confidentiality levels from one security or privacy domain. The splitter gets input data of a common security or privacy domain and outputs a separation of this data according to the particular security or privacy levels. The splitter has to be configured for the mapping from one common confidentiality level to different confidentiality levels from the same security or privacy domain. The configuration also includes a determination of the data store for each level. It is assumed that the data stores themselves have appropriate access right configurations. The data access layer has to be configured to write data to the database layer exclusively through the Confidentiality Level Data Splitter. In order to avoid disclosure of critical data, the data access layer has to be placed in a location where the demands of the highest confidentiality level are fulfilled. Usually, this is the private Cloud.

Results. For each data write, several data stores are used. The data is routed through the splitter, which categorizes and separates the data of one common confidentiality level into disjoint data sets of different confidentiality levels. The data sets are stored in the appropriate data store based on the corresponding security level of each data set.

Example. The database layer of an application used by the NATO for management of data of different security levels consists of three data stores hosted using different deployment models. The data store hosted traditionally is built on Oracle Corporation MySQL, version 5.1. The second and third parts use an Amazon VPC Cloud data store hosting an AMI and Amazon RDS, respectively. The AMI runs a MySQL 5.1 relational database on OpenSolaris. Regarding Amazon RDS, a MySQL DB instance is chosen. Thus, the database functionality remains the same. Now several terabytes of new data sets have to be imported sequentially into the data management system. This data is neither categorized nor annotated regarding the classification levels defined by the NATO. Therefore, the splitter is used and configured as follows: All personal information has to be categorized as “NATO Secret” and stored in the Amazon VPC Cloud data store. All financial information has to be annotated with “NATO Confidential” and stored in the traditionally hosted part of the database layer. All other data is categorized as “NATO Restricted” and stored in Amazon RDS.
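The splitter configuration from this example can be sketched as a small rule table. The sketch below is an illustration under assumptions: the field names in `PERSONAL_FIELDS` and `FINANCIAL_FIELDS`, the store identifiers, and the idea of classifying by field name are hypothetical; the paper only states which kind of information maps to which level and store.

```python
# Hypothetical field names used to recognize personal and financial information.
PERSONAL_FIELDS = {"name", "address", "birth_date"}
FINANCIAL_FIELDS = {"salary", "account_number"}


def classify(field_name):
    """Return (confidentiality level, target store) for one field,
    following the configuration described in the example."""
    if field_name in PERSONAL_FIELDS:
        return ("NATO Secret", "amazon_vpc")
    if field_name in FINANCIAL_FIELDS:
        return ("NATO Confidential", "traditional")
    return ("NATO Restricted", "amazon_rds")


def split(record_id, record):
    """Split one uncategorized record into disjoint parts, one per target store.
    Every part carries the record id so that the parts can be related later."""
    parts = {}
    for field_name, value in record.items():
        level, store = classify(field_name)
        part = parts.setdefault(store, {"id": record_id, "level": level, "fields": {}})
        part["fields"][field_name] = value
    return parts


record = {"name": "A. Example", "salary": 50000, "unit": "logistics"}
parts = split(42, record)
```

Each field lands in exactly one part, so the resulting data sets are disjoint, as the Results paragraph requires. Writing each part to its store is left out of the sketch.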

Next. In case the stored data should be retrieved, the Confidentiality Level Data Aggregator has to be considered.

2.3 Filter of Critical Data

Context. The private Cloud data store contains critical and non-critical data. To prevent disclosure of the critical data, it has to be enforced that the critical data does not leave the private Cloud. The logic implemented in the business layer is split into a part processing critical data and a part processing non-critical data. The party implementing or hosting the processing of the non-critical data may not be trusted.

Challenge. How can data-access rights be kept when moving the database layer into the private Cloud and a part of the business layer as well as a part of the data access layer into the public Cloud?

Forces. As the business layer in the public Cloud needs to have access to the non-critical data, completely forbidding access to the database from the public Cloud is not an option. As the business layer hosted traditionally might need to have access to the complete data, the distribution of critical and non-critical data into different databases would require adjustments of the corresponding business logic and the data access layer hosted traditionally.

Solution. A Filter of Critical Data is placed within the Cloud infrastructure of the private Cloud data store. All requests to the private Cloud data store have to be directed to the filter. The private Cloud data store is only reachable by the Filter of Critical Data. Requests for critical data originating off-premise are denied by the filter.

Figure 4: Sketch of Filter of Critical Data. [Sketch not reproduced; legend: dataflow, migration, partial migration, modified component (*).]

Sidebars. The data contained in the private Cloud data store has to be categorized into critical and non-critical data, e. g., by applying the security classification of the NATO (Department of Defense Security Institute, 1993) and annotating the data accordingly.

Both parts of the data access layer, the one hosted traditionally and the one hosted in the public Cloud, have to be reconfigured to access the database layer only through the Filter of Critical Data. We denote the required reconfigurations as “Data Access Layer*” in Figure 4. Moreover, the business layer hosted in the public Cloud has to be enabled to deal with errors occurring when access to critical data is not granted, denoted by “Business Layer*” in Figure 4.

The method for filtering, the respective granularity, and the information to be filtered have to be configured, e. g., by parameterization.

The application in the public Cloud has to be aware that the data returned might be incomplete. Thus, the application design has to consider different availability of data. For instance, a function “write letters to all customers” cannot be put in the public Cloud if the customer data must not be disclosed.
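How a parameterized filter leads to incomplete results for off-premise callers can be sketched as follows. The `FilterConfig` type, its two parameters, and the field names are assumptions made for illustration; the paper does not prescribe a configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class FilterConfig:
    # Hypothetical parameters: which fields are critical, and whether a record
    # containing critical fields is dropped entirely (record-level granularity)
    # or only stripped of those fields (field-level granularity).
    critical_fields: set = field(default_factory=set)
    drop_whole_record: bool = False


def filter_records(records, config, off_premise):
    """Return the records a caller may see. On-premise callers get everything;
    off-premise callers get a filtered, possibly incomplete view."""
    if not off_premise:
        return [dict(record) for record in records]
    visible = []
    for record in records:
        has_critical = any(name in config.critical_fields for name in record)
        if has_critical and config.drop_whole_record:
            continue
        visible.append(
            {name: value for name, value in record.items()
             if name not in config.critical_fields}
        )
    return visible


customers = [
    {"customer_id": 1, "segment": "retail", "postal_address": "Example Street 1"},
    {"customer_id": 2, "segment": "business"},
]
config = FilterConfig(critical_fields={"postal_address"})
public_view = filter_records(customers, config, off_premise=True)
```

With this view, a function such as “write letters to all customers” has no address to work with in the public Cloud, which is exactly the kind of incompleteness the application design has to anticipate.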

As data access happens exclusively through the Filter of Critical Data, this pattern decreases performance. Therefore, it is not applicable for distributed applications with high data traffic. Besides, the implementation of the filter must not become a single point of failure. To overcome that limitation, it could, for instance, be horizontally scaled on demand.

Results. Assuming the data transfer from and to the private database layer happens exclusively through the Filter of Critical Data, only non-critical data is passed to the public Cloud and critical data remains private.

Example. The database layer of a task management application of a car manufacturing company built on Oracle Corporation MySQL (Oracle Corporation, 2011), version 5.1, is moved to Amazon Virtual Private Cloud (Amazon VPC) (Amazon.com, Inc., 2011b) choosing an Amazon Machine Image (AMI) running a MySQL 5.1 relational database on OpenSolaris. Thus, the database functionality remains the same. The database contains (among others) tasks directly related to company internal research. These tasks are only allowed to be executed if the employee requesting the task details is currently located within the research technology center (cf. “Location Dependent Task Visibility and Access” (Unger and Bauer, 2008)). In case the request for such a task originates from outside the research technology center, access is denied and an error message is returned. The location-based data filtering is IP-based, and the internal IPs assigned to VPN connections are considered external.
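The IP-based location check in this example can be sketched with Python's standard `ipaddress` module. The address ranges, the `critical` flag on a task, and the `AccessDenied` error type are hypothetical; only the rule itself (research-center addresses are allowed, VPN-assigned internal addresses count as external) comes from the example.

```python
import ipaddress

# Hypothetical address ranges, invented for illustration.
RESEARCH_CENTER = ipaddress.ip_network("10.20.0.0/16")
# Internal addresses assigned to VPN connections; treated as external.
VPN_POOL = ipaddress.ip_network("10.20.99.0/24")


class AccessDenied(Exception):
    """Raised when a request for a critical task originates off-site."""


def is_on_site(client_ip):
    """True only for research-center addresses that are not VPN-assigned."""
    address = ipaddress.ip_address(client_ip)
    return address in RESEARCH_CENTER and address not in VPN_POOL


def fetch_task(task, client_ip):
    """Return the task details, or deny access to critical tasks
    requested from outside the research technology center."""
    if task["critical"] and not is_on_site(client_ip):
        raise AccessDenied(f"task {task['id']} is only available on site")
    return task


research_task = {"id": 7, "critical": True, "title": "prototype evaluation"}
routine_task = {"id": 8, "critical": False, "title": "order supplies"}
```

A caller in the public Cloud would wrap `fetch_task` in error handling, which corresponds to the “Business Layer*” modification described in the Sidebars.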

Next. In case denying access to critical data is not acceptable, the Anonymizer of Critical Data or the Pseudonymizer of Critical Data has to be considered.

2.4 Pseudonymizer of Critical Data

Context. The private Cloud data store contains critical and non-critical data. The business layer is partially moved to the public Cloud and needs access to data. The logic implemented in the business layer is split into a part requiring critical data and a part for which critical data in pseudonymized form is sufficient for processing. This modification of the original business layer is denoted by “Business Layer*” in Figure 5. The party hosting the processing for which critical data in pseudonymized form is sufficient may not be trusted. Furthermore, passing critical data may be restricted by compliance regulations. It is required to relate the pseudonymized processing results from the public business layer to the critical data.

Challenge. How can a private Cloud data store ensure that critical data is passed to the public Cloud in pseudonymized form only?

Forces. It has to be ensured that all data classified as critical is pseudonymized in the data set passed to the public Cloud so that the pseudonymization cannot be broken, e. g., by using tools for data analysis and data mining. The relation between the pseudonymized data and the critical data has to be stored in the private Cloud and must not be passed to the public Cloud. This relation makes it possible to relate the pseudonymized processing results from the public business layer to the critical data.

Solution. A Pseudonymizer of Critical Data is placed within the Cloud infrastructure of the private Cloud data storage. All requests to the private Cloud data storage have to be directed to the pseudonymizer. The private Cloud data storage is only reachable by the Pseudonymizer of Critical Data. Results of requests for critical data originating off-premise are pseudonymized.

Figure 5: Sketch of Pseudonymizer of Critical Data. [Sketch not reproduced; legend: dataflow, migration, partial migration, modified component (*).]

Sidebars. The data contained in the private Cloud data store has to be categorized into critical and non-critical data, e. g., by applying the security classification described in the security regulations of the NATO and annotating the data accordingly. Both parts of the data access layer, the one hosted traditionally and the one hosted in the public Cloud, have to be reconfigured to access the database layer only through the Pseudonymizer of Critical Data. We denote the required reconfigurations as “Data Access Layer*” in Figure 5.

The method for pseudonymization has to be configured and the relation has to be stored in the private Cloud data store.
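One possible pseudonymization method is a random token per critical value, with the relation kept in a private lookup table. The sketch below assumes this method; the paper does not fix one. The `Pseudonymizer` class, the `p-` token prefix, and the in-memory dictionaries standing in for the private Cloud data store are hypothetical.

```python
import secrets


class Pseudonymizer:
    """Replaces critical field values with random tokens and keeps the
    relation in private state (a stand-in for the private Cloud data store)."""

    def __init__(self, critical_fields):
        self.critical_fields = set(critical_fields)
        self._to_pseudonym = {}  # original value -> token
        self._to_original = {}   # token -> original value

    def _pseudonym_for(self, value):
        # Reuse the token for a known value so that results stay relatable.
        if value not in self._to_pseudonym:
            token = "p-" + secrets.token_hex(8)
            self._to_pseudonym[value] = token
            self._to_original[token] = value
        return self._to_pseudonym[value]

    def pseudonymize(self, record):
        """Return a copy of the record that is safe to pass off-premise."""
        return {
            name: self._pseudonym_for(value) if name in self.critical_fields else value
            for name, value in record.items()
        }

    def resolve(self, token):
        """Private-side lookup from a pseudonym back to the critical value."""
        return self._to_original[token]


pseudonymizer = Pseudonymizer(critical_fields={"site"})
january = pseudonymizer.pseudonymize({"site": "Plant A", "kwh": 1200})
february = pseudonymizer.pseudonymize({"site": "Plant A", "kwh": 1100})
```

Because the same value always maps to the same token, the off-premise party can compare months for one site without learning which site it is, and the private side can map a reported token back with `resolve`. Random tokens reveal nothing by themselves, but stable pseudonyms can still be linked across data sets, which is the risk named in the Forces.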

Results. Assuming the data transfer from and to the private database layer happens exclusively through the Pseudonymizer of Critical Data, only non-critical data and pseudonymized data are passed to the public Cloud.

Example. A company started an initiative to improve its image with respect to Green computing. An external partner has been contracted to continuously monitor the relevant data and to report monthly on energy savings and energy saving potentials. As a result, the external company gets access to the data. As the company does not want to reveal its internal secrets, the data is pseudonymized.

Next. In case passing (and thus disclosing) pseudonymized critical data is not applicable, the Filter of Critical Data or the Anonymizer of Critical Data has to be considered.


Figure 6: Sketch of Anonymizer of Critical Data. [Sketch not reproduced; legend: dataflow, migration, partial migration, modified component (*).]

2.5 Anonymizer of Critical Data

Context. The private Cloud data store contains critical and non-critical data. The business layer is partially moved to the public Cloud and needs access to data. To prevent disclosure and misuse, the critical data is anonymized before being passed to the public Cloud. The logic implemented in the business layer is split into a part requiring critical data and a part for which critical data in anonymized form is sufficient for processing. This modification of the original business layer is denoted by “Business Layer*” in Figure 6. The party implementing or hosting the processing for which critical data in anonymized form is sufficient may not be trusted. It is not required to relate the anonymized processing results from the public business layer to the critical data.

Challenge. How can a private Cloud data store ensure that critical data is passed to the public Cloud in anonymized form only?

Forces. It has to be ensured that all data classified as critical is sufficiently anonymized in the data set passed to the public Cloud so that the anonymization cannot be broken, e. g., by using tools for data analysis and data mining.

Solution. An Anonymizer of Critical Data is placed within the Cloud infrastructure of the private Cloud data store. All requests to the private Cloud data store have to be directed to the anonymizer. The private Cloud data store is only reachable by the Anonymizer of Critical Data. Results of requests for critical data originating off-premise are anonymized.

Sidebars. The data contained in the private Cloud data store has to be categorized into critical and non-critical data, e. g., by applying the security classification described in the security regulations of the NATO and annotating the data accordingly. Both parts of the data access layer, the one hosted traditionally and the one hosted in the public Cloud, have to be reconfigured to access the database layer only through the Anonymizer of Critical Data. We denote the required reconfigurations as “Data Access Layer*” in Figure 6.

The method for anonymization and the respective granularity have to be configured. For instance, an off-premise application may retrieve anonymized data for each person and another may only use data aggregated for groups of 1 000 persons.
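Group-level granularity can be sketched as an aggregation that drops all identifying fields and reports only counts. The minimum group size used to suppress small groups is an assumption added for illustration, as are the function name and the field names; the paper only states that the granularity has to be configurable.

```python
from collections import Counter


def anonymize_counts(records, group_field, min_group_size):
    """Reduce person-level records to counts per group.

    Only the value of group_field survives; all other fields, including
    identifiers, are discarded. Groups smaller than min_group_size are
    suppressed, since very small groups may still point to individuals.
    """
    counts = Counter(record[group_field] for record in records)
    return {group: size for group, size in counts.items() if size >= min_group_size}


# Hypothetical person-level records held in the private Cloud data store.
patients = [
    {"name": "Person 1", "illness": "influenza"},
    {"name": "Person 2", "illness": "influenza"},
    {"name": "Person 3", "illness": "influenza"},
    {"name": "Person 4", "illness": "rare condition"},
]
statistics = anonymize_counts(patients, group_field="illness", min_group_size=2)
```

The result contains no names and no per-person rows, so it cannot be related back to individual records in the way a pseudonymized data set can. Whether a given threshold is sufficient depends on the data and is not guaranteed by this sketch.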

Results. Assuming the data transfer from and to the private database layer happens exclusively through the Anonymizer of Critical Data, only non-critical data and anonymized data are passed to the public Cloud.

Example. The database layer of an application for managing the customers of a health insurance provider built on Oracle Corporation MySQL, version 5.1, is moved to Amazon VPC choosing an AMI running a MySQL 5.1 relational database on OpenSolaris. Thus, the database functionality remains the same. The database contains the personal information as well as the medical history of the customers. The business logic calculating general statistics is outsourced to the public Cloud. As no personal information of the customers is required to calculate general statistics, such as how many people were suffering from a particular illness in the last year, only anonymized data is passed by the Anonymizer of Critical Data to the business layer in the public Cloud.

Next. In case passing (and thus disclosing) anonymized critical data is not applicable, the Filter of Critical Data has to be considered.

2.6 Composition of Cloud Data Patterns

Patterns of a pattern language are related to each other; they have to be considered as a whole and have to be composable (Hohpe and Woolf, 2003). Thus, we have chosen the form of a piece of a puzzle for the pattern icons. We do not claim that all Cloud Data Patterns are composable with each other. Whether two or more Cloud Data Patterns are composable depends on the semantics and functionality of each of the patterns. Moreover, the specific requirements and context of the needed solution affect whether a composition of patterns is required. A composition of the Confidentiality Level Data Aggregator and the Anonymizer of Critical Data allows the off-premise application part to access the private data stored in different data stores in anonymized form. The detailed investigation of compositions of Cloud Data Patterns is out of scope of this paper.


3 RELATED WORK

Pattern languages that define reusable solutions for recurring challenges were first proposed in architecture by Christopher Alexander et al. (1977, 1979). Various publications on patterns exist in computer science that describe recommendations on how to face recurring challenges in different domains. In the following selection we focus on enterprise integration, application architecture, Cloud computing, data, security, and privacy in the order they are mentioned.

Hohpe and Woolf (2003) investigate problems developers have to solve when building and deploying messaging solutions for enterprise integration. We reuse the pattern form introduced by Hohpe and Woolf for describing the Cloud Data Patterns.

Fowler et al. (2002) provide forty reusable solutions for developing enterprise applications. In contrast to our work, database hosting in the Cloud is not considered. Similar to Fowler et al., our patterns are independent of the concrete Cloud technology used.

ARISTA Networks, Inc. (2009) provides seven patterns for Cloud computing. The only pattern dealing with data in the Cloud is the Cloud Storage Pattern. Fehling et al. (2011) provide architectural patterns to design, build, and manage applications using Cloud services. They focus on the general architecture. Solutions for moving or building the database layer in the Cloud are not provided. Nock (2008) provides 25 patterns for data access in enterprise applications. Nock does not treat Cloud data stores as we do.

Schumacher et al. (2006) present reusable solutions for securing applications and for integrating security design into the broader engineering process, but do not deal with data pseudonymization, data anonymization, and data filtering. Hafiz (2006) presents a privacy design pattern catalog. Hafiz achieves anonymity by mixing data with data from other sources instead of providing a general pseudonymization, anonymization, or filtering pattern.

The handling of personal data in the Cloud in Germany is regulated by §11 of the German Federal Data Protection Law (Federal Ministry of Justice, 1990). The law distinguishes between pseudonymization and anonymization, which we reuse in our confidentiality patterns. To the best of our knowledge, the patterns presented in this paper have not been outlined by other authors.

4 CONCLUSIONS

This work presented five reusable solutions to face challenges of avoiding disclosure of confidential data when moving the database layer to the Cloud or designing an application using a database layer in the Cloud. The list of the patterns has been harvested during our work with industry partners and research projects. We do not claim that the list of patterns is complete.

The presentation of the patterns did not go into technical details. For instance, scalability and single points of failure have not been treated. A possible scalability mechanism, which also serves as a counter-measure against a single point of failure, is to implement each pattern using a hotpool in the Cloud. A hotpool consists of multiple instances of the component and a watchdog.
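A hotpool of this kind can be sketched in a few lines. The sketch assumes that the watchdog replaces failed instances and that requests are distributed round-robin; neither detail is specified here, and the class names `Instance` and `HotPool` as well as the `healthy` flag are hypothetical.

```python
import itertools
from typing import Callable


class Instance:
    """One running copy of a pattern component (hypothetical stand-in)."""

    def __init__(self, ident: int, handler: Callable[[str], str]):
        self.ident = ident
        self.handler = handler
        self.healthy = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise RuntimeError(f"instance {self.ident} is down")
        return self.handler(request)


class HotPool:
    """Several instances of one component plus a watchdog check."""

    def __init__(self, size: int, handler: Callable[[str], str]):
        self._handler = handler
        self._ids = itertools.count()
        self._next = 0
        self.instances = [self._spawn() for _ in range(size)]

    def _spawn(self) -> Instance:
        return Instance(next(self._ids), self._handler)

    def watchdog_check(self) -> int:
        """Replace every unhealthy instance; return how many were replaced."""
        replaced = 0
        for index, instance in enumerate(self.instances):
            if not instance.healthy:
                self.instances[index] = self._spawn()
                replaced += 1
        return replaced

    def dispatch(self, request: str) -> str:
        """Round-robin over the pool, skipping instances marked unhealthy."""
        for _ in range(len(self.instances)):
            instance = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if instance.healthy:
                return instance.handle(request)
        raise RuntimeError("no healthy instance available")
```

In a real deployment the watchdog would run periodically and probe the instances itself; here `watchdog_check` is called explicitly to keep the sketch deterministic.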

Our next step is a description of the general composition method and an evaluation using existing applications to be migrated to the Cloud.

ACKNOWLEDGEMENTS

This work has been funded by the EU (4CaaSt, FP7/258862) and the BMWi (CloudCycle, 01MD11023). We thank Vasilios Andrikopoulos (IAAS), Gerd Breiter (IBM), and Olaf Zimmermann (IBM) for their valuable input.

REFERENCES

Alexander, C. (1979). The Timeless Way of Building. Oxford University Press.

Alexander, C. et al. (1977). A Pattern Language. Towns, Buildings, Construction. Oxford University Press.

Amazon.com, Inc. (2011a). Amazon Relational Database Service.

Amazon.com, Inc. (2011b). Amazon Virtual Private Cloud.

ARISTA Networks, Inc. (2009). Cloud Networking: Design Patterns for Cloud-Centric Application Environments.

Armbrust, M. et al. (2009). Above the Clouds: A Berkeley View of Cloud Computing. Technical Report UCB/EECS-2009-28, EECS Department, University of California, Berkeley.

Department of Defense Security Institute (1993). International Programs Security Handbook, Chapter 10: NATO Security Procedures.

Dunkel, J. et al. (2008). Systemarchitekturen fuer Verteilte Anwendungen. Hanser Verlag.

Federal Ministry of Justice (1990). German Federal Data Protection Law.

Fehling, C. et al. (2011). An Architectural Pattern Language of Cloud-based Applications. PLoP'11. ACM.


Fowler, M. et al. (2002). Patterns of Enterprise Application Architecture. Addison-Wesley Professional.

Hafiz, M. (2006). A Collection of Privacy Design Patterns. PLoP'06.

Hohpe, G. and Woolf, B. (2003). Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Addison-Wesley.

Mell, P. and Grance, T. (2009). Cloud Computing Definition. National Institute of Standards and Technology.

Nock, C. (2008). Data Access Patterns: Database Interactions in Object Oriented Applications. Prentice Hall PTR.

Oracle Corporation (2011). MySQL.

Schumacher, M. et al. (2006). Security Patterns: Integrating Security and Systems Engineering. Wiley.

Unger, T. and Bauer, T. (2008). Towards a Standardized Task Management. MKWI'08. GITO-Verlag, Berlin.
