
Cloud Data Management: Literature Survey and Paper Critique

Team Members
• Frank Paladino
• Aravind Yeluripiti

Papers Selected

• P1: Data Management in the Cloud: Limitations and Opportunities
• P2: Towards a Self-Adaptive Data Management System for Cloud Environments
• P3: CloudDB: One Size Fits All Revived
• P4: Cloud Data Management for Online Games: Potentials and Open Issues
• P5: Data in the Cloud: The Changing Nature of Managing Data Accessibility

P1: Data Management in the Cloud: Limitations and Opportunities

• Data Management
  – Characteristics of cloud environment
    • Elasticity – Ability to size according to need
    • Data Privacy and Security – Subject to local rules and regulations
    • Replication across long distances – Availability and durability of data
  – Applications to consider for cloud deployment
    • Transaction processing – ACID guarantees difficult to maintain
    • Analytical – Ideal candidate for cloud deployment
• Data Analysis
  – MapReduce-like software
  – Shared-nothing parallel databases
  – Desired features
    • Efficiency – Query performance
    • Fault Tolerance – Tasks may be reassigned as needed
    • Heterogeneous Environments – Ability to run in parallel across nodes of uneven performance
    • Encrypted Data – Ability to operate on sensitive data
    • Ability to interface with business intelligence software for visualization and query generation
  – Neither solution implements all desired features.
  – A hybrid solution may be needed.
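As a rough illustration of the MapReduce-like model named above, the following single-process Python sketch runs a map, shuffle and reduce phase over a small in-memory input. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions made for illustration and are not taken from P1. A real framework would spread the map and reduce tasks across nodes and reassign failed tasks.

```python
from collections import defaultdict


def map_phase(record):
    """Emit (word, 1) pairs for one input record (hypothetical word-count task)."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group intermediate values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce sequentially; a cluster would run the map tasks in parallel."""
    intermediate = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(intermediate).items())


if __name__ == "__main__":
    print(run_job(["cloud data", "data management in the cloud"]))
```

Because each map call only sees its own record and each reduce call only sees one key, either kind of task can be rerun on another node without affecting the rest, which is where the fault tolerance noted above comes from.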

P2: Towards a Self-Adaptive Data Management System for Cloud Environments

• Large-scale implementations require guarantees of data availability and security.
• Optimize BlobSeer, a data sharing system capable of handling massive amounts of unstructured data, with self-management capability.
  – Elasticity – The ability to self-configure.
  – Data Availability – The ability to self-optimize.
  – Security – The ability to self-protect.
• Three-layer approach
  – Introspection layer – state and behavior
  – Monitoring layer – gather data from instrumented nodes
  – Instrumentation layer – generate and send information

P3: CloudDB: One Size Fits All Revived

• Develop a data management platform called CloudDB to provide a portfolio of products and offer them as services.
• Enable clients to scale accordingly as their needs and requirements evolve.
• System Architecture
  – Client data replicated in 3 underlying data stores optimized for varying workload needs.
    • Relational – Traditional RDBMS to handle transactional workloads.
    • Key/Value – Scalability for read/write intensive workloads.
    • Columnar – Read optimized, throughput oriented for analytical (OLAP) workloads.
  – Workload Manager – Query dispatching and scheduling
  – Dispatcher – Submits the query to one of the valid replicas.
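The dispatching step can be sketched in a few lines of Python. This is only an assumed model: the slide says the dispatcher submits a query to one of the valid replicas, but the workload labels, the `VALID_REPLICAS` table and the least-loaded selection rule below are hypothetical and are not CloudDB's actual policy.

```python
# Hypothetical mapping from workload type to the stores that can validly serve it.
# Store names follow the slide; the routing rules are an assumption for illustration.
VALID_REPLICAS = {
    "transactional": ["relational"],
    "read_write_intensive": ["key_value", "relational"],
    "analytical": ["columnar", "relational"],
}


def dispatch(workload_type, load):
    """Pick the least-loaded store among those valid for the workload type.

    `load` maps store name -> number of queued queries (an assumed metric).
    """
    candidates = VALID_REPLICAS.get(workload_type)
    if not candidates:
        raise ValueError(f"unknown workload type: {workload_type!r}")
    # min() keeps the first candidate on ties, so the preferred store wins when loads are equal.
    return min(candidates, key=lambda store: load.get(store, 0))


if __name__ == "__main__":
    print(dispatch("analytical", {"columnar": 3, "relational": 1}))
```

Keeping the valid-replica table separate from the selection rule means a different scheduling policy can be swapped in without touching the routing data.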

P4: Cloud Data Management for Online Games: Potentials and Open Issues

• Approach: Classify data into four data sets
  – Account Data, Game Data, State Data, Log Data
• Potentials:
  – Account data
    • Each operation is a transaction
    • Scale is not large: RDBMS as a service
  – Log data
    • Scale is large, write once
    • Analyzed after a long time, so strong consistency is not important
    • Requirements easily met by Hadoop/Cassandra
  – Game data
    • Not a challenge, unless stored on the client side
    • Manage with a traditional distributed file system or in HDFS
  – State data
    • Managing in real time is the biggest challenge for disk redundant RDBMS
    • Cloud storage is not intended to provide real-time support, so continue with the existing system
    • Back up using cloud-based storage (Cassandra)
• Open issues:
  – Data consistency, customized functionality, data partitioning, network traffic.
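The classification above amounts to a lookup from data set to storage choice. A minimal Python sketch follows. The store assignments are copied from the Potentials bullets, while the `StoragePlan` type, its field names and the `plan_for` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StoragePlan:
    """Where one data set lives and, optionally, where it is backed up."""
    store: str
    backup: Optional[str] = None


# Assignments follow the Potentials bullets; the structure itself is illustrative.
PLANS = {
    "account": StoragePlan(store="RDBMS as a service"),
    "log": StoragePlan(store="Hadoop or Cassandra"),
    "game": StoragePlan(store="distributed file system or HDFS"),
    "state": StoragePlan(store="existing real-time system", backup="Cassandra"),
}


def plan_for(data_set):
    """Return the storage plan for a data set name, ignoring case and surrounding spaces."""
    key = data_set.strip().lower()
    if key not in PLANS:
        raise KeyError(f"unknown data set: {data_set!r}")
    return PLANS[key]


if __name__ == "__main__":
    for name in ("Account", "Log", "Game", "State"):
        print(name, "->", plan_for(name))
```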

P5: Data in the Cloud: The Changing Nature of Managing Data Accessibility

• Key Findings
  – Fear of losing control over enterprise data is increasingly outweighed by the benefits offered by cloud-based application services
  – Effective management and use of data represents a significant challenge for most organizations
  – Cloud computing increases data management complexity through security and privacy issues
• Recommendations
  – IT organizations considering cloud-based services:
    • Examine the information delivery expectations across various corporate roles to determine data management needs
    • Evaluate the provider's ability to support access to source data and the transformation, movement and consolidation of data into internal or external applications
  – Vendors offering data management and integration in the cloud:
    • Include performance and monitoring services and information delivery capabilities optimized for managing data and services in the cloud
    • Ensure the virtual environment meets ongoing business requirements

Paper Critique

• Paper selected:
  – Ziqiang Diao and Eike Schallehn. Cloud Data Management for Online Games: Potentials and Open Issues. In Data Management in the Cloud (DMC). Köllen-Verlag, 2013. Accepted for publication.
• Authors: Ziqiang Diao and Eike Schallehn

Introduction

• MMOG
  – Massively Multiplayer Online Game
  – Supports hundreds or thousands of players from all over the world in parallel.
  – Players can
    • choose a new identity
    • establish a new social network
    • compete or cooperate with other players
    • even realize dreams that cannot be fulfilled in real life
  – In 2011, Americans spent a total of 26 million hours per day and 2.6 billion dollars playing MMOGs

Introduction (contd.)

• Types of MMOGs
  – Role-playing
  – First-person shooter
  – Real-time strategy
  – Turn-based strategy
  – Simulation
    • Sports
    • Racing
  – Casual
    • Music/Rhythm
    • Social

Problem

• MMORPGs
  – Massively Multiplayer Online Role-Playing Games
  – MMORPGs keep the virtual game environment running even when no players are online.
  – Account information and the state data of objects and characters must be recorded on the server side in real time.
  – All player behavior in the game should be monitored and backed up in order to maintain the order of the virtual world.
  – More concurrent players: e.g. World of Warcraft has millions of concurrent players

Problem (contd.)

• Millions of concurrent players
  – Exacerbates the burden of managing data
• A qualified database system for data persistence
  – Must guarantee data consistency
  – Must also be efficient and scalable
• Existing RDBMSs cannot fully satisfy all these requirements simultaneously.
• With the increasing data volume,
  – the storage system becomes a bottleneck, and
  – solving scalability and availability issues becomes a major cost factor and development risk.

Solution?

• Cloud storage systems
  – Ability to support highly concurrent data accesses and huge storage
• In contrast to conventional DBMS
  – Cloud systems are generally designed for Web applications that
    • have different access characteristics
    • require lower or different consistency levels
• Need to analyze MMORPGs in more detail
  – To assess the usability of Cloud storage systems
  – To identify open issues and possible solutions

Analysis: MMORPGs' Current Data Management

Existing system
• Distributed RDBMS for data persistence
  – Can commit complex transactions and has proven to be stable.
• E.g. MySQL Cluster
  – Adopts a shared-nothing architecture to ensure system scalability
  – Automatically partitions data within a table across all nodes based on primary keys
  – Each node
    • helps clients to access the correct shards to satisfy a query or commit a transaction
    • holds data that is replicated to multiple nodes to guarantee availability
  – Applies a two-phase commit (2PC) mechanism
    • to propagate data changes to the primary replica and one secondary replica synchronously, and
    • modifies the other replicas asynchronously
  – Can support real-time responses when tables are maintained in memory, so it can also be used as an in-memory database in MMORPGs.
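The write path described above (synchronous two-phase commit on the primary and one secondary, then asynchronous updates elsewhere) can be sketched as follows. This is a simplified in-memory model under stated assumptions: the `Replica` class and the `two_phase_commit` and `write` functions are hypothetical and do not represent MySQL Cluster's API, and the asynchronous step is applied inline rather than in the background.

```python
class Replica:
    """In-memory stand-in for one replica (hypothetical, not a real database node)."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}
        self.staged = None

    def prepare(self, key, value):
        """Phase 1: stage the change and vote yes only if this replica can apply it."""
        if not self.healthy:
            return False
        self.staged = (key, value)
        return True

    def commit(self):
        """Phase 2 on success: make the staged change visible."""
        if self.staged is not None:
            key, value = self.staged
            self.data[key] = value
            self.staged = None

    def abort(self):
        """Phase 2 on failure: discard the staged change."""
        self.staged = None


def two_phase_commit(sync_replicas, key, value):
    """Commit on all synchronous replicas or on none of them."""
    votes = [replica.prepare(key, value) for replica in sync_replicas]
    if all(votes):
        for replica in sync_replicas:
            replica.commit()
        return True
    for replica in sync_replicas:
        replica.abort()
    return False


def write(primary, secondary, async_replicas, key, value):
    """Run 2PC on primary and one secondary, then propagate to the remaining replicas."""
    if not two_phase_commit([primary, secondary], key, value):
        return False
    for replica in async_replicas:
        # A real system would ship this change in the background; here it is applied inline.
        replica.data[key] = value
    return True
```

If either synchronous replica votes no in the prepare phase, nothing becomes visible anywhere, which is the atomicity property that makes the scheme attractive for account-style data.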

Data Management requirements of MMORPGs

• From the system's point of view, the essence of a game is
  – Data processing,
  – Storage, and
  – Transmission among databases, servers and players
• Based on these data management requirements,
  – Data is classified into four data sets
  – Different classes should be managed according to their requirements

The case for Cloud-based Data Management for MMORPGs

| Requirement | RDBMS for data persistence | Cloud storage systems (scalability and availability) |
|---|---|---|
| High performance | Not designed for managing data with a large number of attributes, e.g. a table with hundreds of columns | Can manage all attributes by applying a simplified data model as well as data redundancy |
| Scalability | Limited by its complex schema; dataset volume growth has a significant impact on system performance | Proven to have great potential for scalability |
| Flexible data model | Good at normalizing table schemas and removing data redundancy, not at adapting to a dynamic schema and processing big data | Typically adopt a flexible data model, such as a key-value data model. There is no fixed schema for items. Each item consists of a key and a dynamic set of attributes |
| Simplified data processing | Follow a strict transaction mechanism, such as table-level or row-level atomicity, multi-version concurrency control, transaction isolation and rollback | Designed for web applications, where strong consistency is not as necessary as in business applications. Generally do not support transaction processing |
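To make the "key and a dynamic set of attributes" idea concrete, here is a toy schemaless store in Python. It is an illustrative assumption only: the `KeyValueStore` class, the key format and the sample attributes are invented for this sketch and do not correspond to any particular cloud storage system.

```python
class KeyValueStore:
    """Toy schemaless store: each item is a key plus a dynamic set of attributes."""

    def __init__(self):
        self._items = {}

    def put(self, key, **attributes):
        """Merge attributes into the item; new attributes need no schema change."""
        self._items.setdefault(key, {}).update(attributes)

    def get(self, key):
        """Return a copy of the item's attributes, or an empty dict if it is absent."""
        return dict(self._items.get(key, {}))


store = KeyValueStore()
store.put("player:1", name="Aria", level=12)
store.put("player:2", name="Borin", guild="Explorers", mount="ram")
store.put("player:1", pet="owl")  # adds an attribute that player:2 does not have
```

The two items end up with different attribute sets, which a fixed relational schema would only allow through nullable columns or a schema migration.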

Proposal of an Architecture for Cloud-based MMORPGs

Using Cassandra for MMORPGs

• Features
  – Decentralized (peer-to-peer structure) – no network bottleneck
  – Provides a column-family-based data model; simplified data model – increased read performance
  – Adopts a shared-nothing architecture – scales out easily
  – Provides a quorum-based data replication mechanism – ensures availability and fault tolerance
• Open problems
  – Read repair to guarantee data consistency
  – Need to develop new functions based on features of MMORPGs
  – Data partitioning – increases processing costs
  – Network traffic – potential bandwidth bottlenecks
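The quorum and read-repair ideas can be sketched briefly. With N replicas, a read quorum of R and a write quorum of W are guaranteed to share at least one replica when R + W > N, so a read sees the latest acknowledged write. The Python below is a simplified model under stated assumptions and is not Cassandra's implementation: the function names are hypothetical, replicas are plain dicts, and versions are ordered by an integer timestamp.

```python
def quorums_overlap(n, r, w):
    """True when every read quorum intersects every write quorum (R + W > N)."""
    return r + w > n


def read_with_repair(replicas, key, r):
    """Read from the first `r` replicas, return the newest value, and repair stale copies.

    Each replica is a dict mapping key -> (timestamp, value).
    """
    contacted = replicas[:r]
    versions = [replica[key] for replica in contacted if key in replica]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[0])
    for replica in contacted:
        if replica.get(key) != newest:
            replica[key] = newest  # read repair: push the newest version to a stale replica
    return newest[1]


if __name__ == "__main__":
    nodes = [{"hp": (1, 80)}, {"hp": (2, 75)}, {}]
    print(quorums_overlap(3, 2, 2), read_with_repair(nodes, "hp", 2))
```

Only the replicas contacted by the read are repaired, so a replica outside the read quorum can stay stale until a later read or a background process reaches it. That residual window is one reason the slide lists consistency as an open problem.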

Novelty, Challenge and Interest

• Novelty
  – Cloud data management: new
  – Application to MMORPGs: even newer (paper yet to be published)
• Challenges
  – Having to deal with the inherent complexities of MMORPGs
  – Novelty means there is no current cloud-based solution to compare against.
• Interest
  – General interest in computer games, special interest in MMOGs and particular interest in MMORPGs
  – Opportunity to look at the domain from a data management perspective.

Application to other data sets

• Data set classification
  – Account Data
  – Game Data
  – State Data
  – Log Data
• The techniques described in this paper would be appropriate for any application whose data can be broken down into separate data sets, as is the case for MMORPGs, with each set handled according to its own requirements

Positives and Negatives

• Positives
  – Novelty and application-oriented discussion
  – Comprehensive analysis of MMORPGs' data management requirements
  – Detailed comparison of RDBMS and Cloud-based systems
• Negatives
  – Extreme novelty
  – No implementation of the proposed solution
  – No comparison to other data sets

Future work

• Implementation and evaluation of results
• Exploring alternate cloud-based approaches to MMORPGs
• Exploring possible adaptations of the proposed techniques to other applications
• Embracing security in public cloud-based architectures instead of resorting to private cloud-based implementations
• Focus on cooperation of multiple DMSs in one MMORPG and customization of a new Cloud storage system for MMORPGs

References

• Daniel J. Abadi. Data Management in the Cloud: Limitations and Opportunities. In IEEE Data Engineering Bulletin, 2009.
• Alexandra Carpen-Amarie. Towards a Self-Adaptive Data Management System for Cloud Environments. In IPDPSW '11: Proceedings of the 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pages 2077-2080. IEEE Computer Society, Washington, DC, USA, 2011. ISBN 978-0-7695-4577-6. doi:10.1109/IPDPS.2011.381.
• Hakan Hacigümüs, Jun'ichi Tatemura, Wang-Pin Hsiung, Hyun J. Moon, Oliver Po, Arsany Sawires, Yun Chi, and Hojjat Jafarpour. CloudDB: One Size Fits All Revived. In SERVICES, 2010.
• Ziqiang Diao and Eike Schallehn. Cloud Data Management for Online Games: Potentials and Open Issues. In Data Management in the Cloud (DMC). Köllen-Verlag, 2013. Accepted for publication.
• Eric Thoo. Data in the Cloud: The Changing Nature of Managing Data Accessibility. Gartner RAS Core Research Note G00165291, 27 February 2009, RA2 12302009.