Database Scalability, Elasticity, and Autonomy in the Cloud

Agrawal et al.

Oct 24, 2011

Framing

• Survey paper
• Identifies necessary qualities of cloud storage
– Scalability
– Sensible consistency / programming model
– Scale-down and migration
– Autonomic management
• Pointers to different work in the space

Scalability

• Add more resources, get more performance
– Handle more requests per second
– Store more data
• Achievable with scale-up or scale-out
– Scale-out is the only paradigm for the cloud
• App’s parallelism is limited by Amdahl’s Law
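
To make the Amdahl's Law bullet concrete, here is a minimal sketch of the bound. The function name and the 95% parallel fraction are illustrative assumptions, not figures from the paper.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Assumed workload: 95% of the work parallelizes. Adding nodes helps less
# and less, and the speedup can never exceed 1 / 0.05 = 20.
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.95, n), 2))
```

Under this assumption, scale-out has diminishing returns: the serial 5% caps the speedup at 20x no matter how many nodes are added.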

Finding the right design point

• What’s the right consistency / programming model?
• Pure key-value stores are too weak
– Only have transactions on single records
• Traditional RDBMSs are too strong
– Can’t just run MySQL at scale
• Instead, provide strong consistency within a portion of the data
– Megastore
– Vertica, Aster, Teradata, Greenplum, …

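Data Fusion vs. Data Fission

[Figure: a consistency axis running from Weak to Strong. Dynamo, BigTable, and PNUTS are on the weak side; MySQL is on the strong side. Between them, Fusion is labeled with Megastore and G-Store, and Fission is labeled with Azure, ElasTraS, and Rel Cloud.]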

Data Fusion

• Start with a key-value store
• Partition records into groups
• Provide multi-record updates within a group
• Cross-group operations handled separately
• Assumes that cross-group ops are rare
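
The fusion idea can be sketched as a toy, single-threaded key-value store whose records are partitioned into groups. The class and method names (`GroupedKVStore`, `update_group`) and the `expect` precondition are hypothetical; this is not the API of Megastore or G-Store.

```python
from collections import defaultdict


class GroupedKVStore:
    """Toy key-value store with all-or-nothing updates inside one group."""

    def __init__(self):
        # group id -> {key: value}; each group would live on a single node
        self._groups = defaultdict(dict)

    def get(self, group, key):
        return self._groups[group].get(key)

    def update_group(self, group, writes, expect=None):
        """Apply every write in `writes` to one group, or none of them.

        `expect` optionally maps keys to the values they must currently hold.
        Returns True if the writes were applied, False otherwise.
        """
        records = self._groups[group]
        for key, value in (expect or {}).items():
            if records.get(key) != value:
                return False  # precondition failed, nothing is written
        records.update(writes)
        return True


store = GroupedKVStore()
store.update_group("user:42", {"inbox_count": 0, "unread": 0})
# Multi-record update within a single group: both counters change together.
applied = store.update_group(
    "user:42", {"inbox_count": 1, "unread": 1}, expect={"inbox_count": 0}
)
```

An update that spans two groups has no single-call equivalent here, which mirrors the slide's point that cross-group operations are handled separately and assumed to be rare.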

Data Fission

• Start with a relational database
• Partition tables into shards
• Provide ACID within each shard
• Cross-shard ops are expensive
• Assumes that cross-shard ops are rare
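
As a rough illustration of fission, the sketch below shards one relational table across several independent SQLite databases and runs a transaction inside a single shard. The shard count, the `accounts` table, and the modulo routing are assumptions made for the example, not details of any system named in the slides.

```python
import sqlite3

NUM_SHARDS = 4  # assumed shard count for the sketch

# One independent in-memory SQLite database stands in for each shard.
shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
for db in shards:
    db.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )


def shard_for(account_id):
    """Route a row to a shard by its primary key."""
    return shards[account_id % NUM_SHARDS]


def open_account(account_id, balance):
    db = shard_for(account_id)
    with db:  # commit on success, roll back on error
        db.execute("INSERT INTO accounts VALUES (?, ?)", (account_id, balance))


def transfer_within_shard(src, dst, amount):
    db = shard_for(src)
    if shard_for(dst) is not db:
        raise ValueError("cross-shard transfer needs a separate, costlier protocol")
    with db:  # both updates commit together or not at all
        db.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        db.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))


def balance(account_id):
    row = shard_for(account_id).execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


open_account(1, 100)
open_account(5, 0)  # 1 and 5 map to the same shard under modulo 4
transfer_within_shard(1, 5, 30)
```

The cross-shard case is rejected outright rather than implemented, since the slides only say that such operations are expensive and assumed to be rare.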

What’s the difference?

• Is Fusion vs. Fission a worthwhile distinction?
• Seems like they both arrive at the same place
• Megastore “Fusion” vs. ElasTraS “Fission”
– Shard tables based on a table’s primary key
– Shard is co-located on the same machine
– ACID transactions within a shard
– Primary and secondary indexes
– All Megastore is missing is an SQL interface!

The difference

• Different targeted users
– Fusion is for people who own datacenters
– Fission is for people who want SQL in the cloud
• Different exposed API
– Fusion is more explicit about performance
– Fission tries to hide partitioning from user
• Anything else?

Elasticity

• Dynamically scaling up and down on-demand
• Important with pay-as-you-go cloud pricing
• Consolidate to reduce costs
• Expand to increase performance
• Need to move state and processing duties around within the system

Live migration of databases

• Shared-disk
– “Global disk” shared by all DB nodes
– Just need to copy in-memory state
– Iterative copy: sync up cached pages + transaction state to minimize the availability hit
• Shared-nothing
– Each DB node is its own separate DB instance
– Need to copy both local disk state and memory
– Push/pull: gradually shift new requests to the new node, sync state in the background
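
The iterative-copy idea for the shared-disk case can be sketched as a small simulation: copy all cached pages once, then keep re-copying only the pages dirtied in the meantime, and pause the source only for the small remainder. The function, its parameters, and the stopping rule are assumptions for illustration; the sketch models cached pages only, not transaction state, and is not the protocol of any specific system.

```python
def iterative_copy(source_pages, write_batches, stop_threshold=2, max_rounds=10):
    """Copy cached pages to a destination node while writes keep arriving.

    source_pages:  dict of page_id -> contents held by the source node
    write_batches: iterable of dicts, each modeling the writes that arrive
                   while one copy round is in progress
    Returns (destination_pages, rounds, pages_copied_during_final_pause).
    """
    dest = {}
    dirty = set(source_pages)  # before the first round, every page is uncopied
    writes = iter(write_batches)
    rounds = 0
    while len(dirty) > stop_threshold and rounds < max_rounds:
        rounds += 1
        to_copy, dirty = dirty, set()
        for page_id in to_copy:
            dest[page_id] = source_pages[page_id]
        # Writes that landed during this round dirty their pages again.
        for page_id, contents in next(writes, {}).items():
            source_pages[page_id] = contents
            dirty.add(page_id)
    # Brief pause: the source stops taking writes and the remainder is copied.
    for page_id in dirty:
        dest[page_id] = source_pages[page_id]
    return dest, rounds, len(dirty)


pages = {i: f"v0-{i}" for i in range(8)}
batches = [{0: "a", 1: "b", 2: "c", 3: "d"}, {0: "e", 1: "f"}]
dest, rounds, final_pause_pages = iterative_copy(pages, batches)
```

The availability hit corresponds to the final pause, so the fewer pages left for that step, the shorter the window in which the database is unavailable.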

Database Autonomy

• Need management to be more automatic
• Elasticity and load balancing based on usage and ML predictions
• Performance modeling
– Migration costs (availability, performance, $$$)
– Resource isolation (consolidated services)
– SLAs
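
A minimal sketch of usage-driven elasticity follows. A moving average stands in for the ML predictor mentioned above, and the function names, the 70% target utilization, and the capacity figures are all assumptions chosen for the example.

```python
import math


def predict_load(history, window=3):
    """Stand-in predictor: mean of the most recent observations."""
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    return sum(recent) / len(recent)


def plan_nodes(history, capacity_per_node, target_utilization=0.7, min_nodes=1):
    """Number of nodes needed to serve the predicted load at the target utilization."""
    predicted = predict_load(history)
    needed = math.ceil(predicted / (capacity_per_node * target_utilization))
    return max(min_nodes, needed)


# Requests per second observed over recent intervals (made-up numbers).
busy = plan_nodes([900, 1000, 1100], capacity_per_node=100)
quiet = plan_nodes([10, 10, 10], capacity_per_node=100)
```

A real controller would also weigh the migration cost and SLA terms listed above before acting on the plan; this sketch only computes the target size.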

Questions?

Tree schema

• Primary table’s primary key used for sharding
• Secondary tables are sharded into row groups
– Row groups are co-located and transactional
• Global tables are write-rarely, and replicated on all nodes
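
The placement rules above can be sketched with plain dictionaries standing in for nodes. The table names (`customers`, `orders`, `countries`), the node count, and the modulo routing are hypothetical choices for the example.

```python
NUM_NODES = 3  # assumed cluster size

# Each node holds its slice of the primary and secondary tables,
# plus a full copy of the global table.
nodes = [{"customers": {}, "orders": {}, "countries": {}} for _ in range(NUM_NODES)]


def node_for(customer_id):
    """Shard by the primary table's primary key."""
    return customer_id % NUM_NODES


def insert_customer(customer_id, name):
    nodes[node_for(customer_id)]["customers"][customer_id] = name


def insert_order(order_id, customer_id, total):
    # A secondary-table row is placed by its parent's key, not its own,
    # so it lands on the same node as the customer it belongs to.
    nodes[node_for(customer_id)]["orders"][order_id] = (customer_id, total)


def insert_country(code, name):
    # Global table: rarely written, so every write goes to all nodes.
    for node in nodes:
        node["countries"][code] = name


def row_group(customer_id):
    """All rows for one customer, read from a single node."""
    node = nodes[node_for(customer_id)]
    return {
        "customer": node["customers"].get(customer_id),
        "orders": {
            oid: row for oid, row in node["orders"].items() if row[0] == customer_id
        },
    }


insert_customer(7, "Ada")
insert_order(1001, 7, 25)
insert_order(1002, 7, 40)
insert_country("US", "United States")
```

Because a customer and its orders share a node, a transaction over that row group never has to cross shards.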
