A Survey of Advanced Non-relational Database Systems: Approaches and Applications
Speaker: LIN Qian
http://www.comp.nus.edu.sg/~linqian
Outline
• Introduction
• Non-relational database systems
  – Requirements
  – Concepts
  – Approaches
  – Optimization
  – Examples
• Comparison between RDBMSs and non-relational database systems
Problem
• The Web introduces a new scale for applications, in terms of:
  – Concurrent users (millions of requests/second)
  – Data (petabytes generated daily)
  – Processing (all this data needs processing)
  – Exponential growth (surging, unpredictable demand)
• Shortcomings of existing RDBMSs
  – Oracle, MS SQL Server, Sybase, MySQL, PostgreSQL, …
  – Trouble when dealing with very heavy traffic
  – Even with their high-end clustering solutions
• Why?
  – Applications using a normalized database schema require joins, which do not perform well over large amounts of data and/or many nodes
  – Existing RDBMS clustering solutions require scale-up, which is limited and not truly scalable under exponential growth (e.g., 1000+ nodes)
  – Machines have upper limits on capacity
• Why not just use sharding?
  – Very complex and application-specific
    • Increased complexity of SQL
    • Single point of failure
    • More complex failover servers
    • More complex backups
    • Added operational complexity
  – Very problematic when adding/removing nodes
  – Basically, you end up denormalizing everything and losing all the benefits of relational databases

Sharding: splitting one or more tables by row across potentially multiple instances of the schema and database servers.
Who faced this problem?
• Web applications dealing with high traffic and massive data
  – Web service providers
    • Google, Yahoo!, Amazon, Facebook, Twitter, LinkedIn, …
  – Scientific data analysis
    • Weather, oceans, tides, geothermal energy, …
  – Complex information processing
    • Finance, stock markets, telecommunications, …
Solution
• A new kind of DBMS, capable of handling web scale
  – Possibly sacrificing some features
• CAP theorem*: you can guarantee at most 2 of these 3
  – Consistency: the system is in a consistent state after an operation
    • All nodes see the same data at the same time
    • Strong consistency (ACID) vs. eventual consistency (BASE)
  – Availability: the system is "always on", with no downtime
    • Node-failure tolerance: all clients can find some available replica
    • Software/hardware upgrade tolerance
  – Partition tolerance
    • The system continues to operate (read/write) despite arbitrary message loss or failure of part of the system

* Eric A. Brewer, "Towards Robust Distributed Systems", Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC), 2000.
Non-relational database systems
• Various solutions & products
  – Bigtable, LevelDB (developed at Google)
  – HBase (developed at Yahoo!)
  – Dynamo (developed at Amazon)
  – Cassandra (developed at Facebook)
  – Voldemort (developed at LinkedIn)
  – Riak, Redis, CouchDB, MongoDB, Berkeley DB, …
• Research projects
  – NoDB, Walnut, LogBase, Albatross, Citrusleaf, HadoopDB
  – PIQL, RAMCloud
Benefits
• Massively scalable
• Extremely fast
• Highly available, decentralized, and fault-tolerant
  – No single point of failure
• Transparent sharding (consistent hashing)
• Elasticity
• Parallel processing
• Dynamic schema
• Automatic conflict resolution
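The "transparent sharding (consistent hashing)" bullet above can be made concrete with a small sketch: both node identifiers and keys are hashed onto the same ring, and a key belongs to the first node clockwise from its position. The names below (`ConsistentHashRing`, the vnode count) are illustrative, not taken from any particular product.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Map an arbitrary string to a point on the hash ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (ring position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node: str):
        # Each physical node gets `vnodes` points, smoothing the key distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str):
        self._ring = [(p, n) for (p, n) in self._ring if n != node]

    def lookup(self, key: str) -> str:
        # First virtual node clockwise from the key's position.
        point = _hash(key)
        idx = bisect.bisect(self._ring, (point, ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that makes this "transparent": when a node joins or leaves, only the keys whose clockwise successor changed are remapped, roughly 1/N of them, rather than nearly all keys as with `hash(key) % N`.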
Cost
• Sacrifices consistency (ACID) in certain circumstances, though many applications can tolerate that
• Non-standard new API model
• Non-standard new schema model
• New knowledge required to tune/optimize
• Less mature
Data/API/Schema model
• Data model: key-value store
  – (row:string, column:string, time:int64) → string
  – The value is an opaque serialized object
• API model
  – Get(key)
  – Put(key, value)
  – Delete(key)
  – Execute(operation, key_list)
• Schema model
  – None
  – Effectively a sparse table
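The Get/Put/Delete/Execute surface above is small enough to sketch in full. This is a toy in-memory stand-in, not any particular product's API (class and method names are hypothetical); it only illustrates that values stay opaque to the store and that `Execute` ships an operation to the data rather than the data to the client.

```python
class KeyValueStore:
    """In-memory sketch of the minimal non-relational API surface."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Values are opaque bytes/objects; callers handle (de)serialization.
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def execute(self, operation, key_list):
        # Apply a caller-supplied operation to each key's current value.
        return [operation(key, self._data.get(key)) for key in key_list]
```

Note what is missing compared with SQL: no joins, no secondary predicates, no schema; anything beyond single-key access has to be expressed through `execute` or in the application.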
Data processing
• MapReduce*
  – An API exposed by non-relational databases to process data
  – A functional programming pattern for parallelizing work
  – Brings the workers to the data
    • An excellent fit for non-relational databases
  – Reduces the programming effort to 2 simple functions: map & reduce

* Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), 2004.
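The "2 simple functions" can be illustrated with the canonical word-count example. This is a sequential toy that assumes nothing beyond the map/shuffle/reduce structure described in the paper; a real system runs the map calls in parallel on the nodes that hold the data.

```python
from collections import defaultdict
from itertools import chain

def map_reduce(inputs, mapper, reducer):
    """Sequential sketch of the MapReduce pattern."""
    # Map phase: each input record yields (key, value) pairs.
    pairs = chain.from_iterable(mapper(record) for record in inputs)
    # Shuffle phase: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce phase: combine each key's values into a final result.
    return {key: reducer(key, values) for key, values in groups.items()}

# Word count expressed as the two user-supplied functions.
def mapper(line):
    return [(word, 1) for word in line.split()]

def reducer(word, counts):
    return sum(counts)
```

The user writes only `mapper` and `reducer`; partitioning, scheduling, and fault tolerance belong to the framework, which is why the pattern composes so well with a sharded key-value store.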
Optimization: Distributed indexing
• Exploit the characteristics of Cayley graphs to provide the scalability needed to support multiple distributed indexes of different types.
• Define a methodology for mapping various types of data and P2P overlays onto a generalized Cayley graph structure.
• Propose self-tuning strategies to optimize the performance of the indexes defined over the generic Cayley overlay.
Optimization: Data migration
• Albatross is a technique for live migration in a multi-tenant database that can migrate a live tenant database with no aborted transactions.
  – Phase 1: Begin migration.
  – Phase 2: Iterative copying.
  – Phase 3: Atomic handover.
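The three phases might be sketched as follows. This is a loose illustration of the copy-while-serving idea, not the actual Albatross protocol (which also migrates transaction state and caches); the `source`/`destination` interface (`snapshot`, `dirty_pages`, `block_writes`, etc.) is entirely hypothetical.

```python
def migrate(source, destination, dirty_threshold=10):
    """Illustrative live migration: snapshot, iterative copy, atomic handover.

    `source` and `destination` are assumed to expose snapshot(),
    dirty_pages(), read_pages(), apply(), block_writes(),
    unblock_writes(), and redirect() (hypothetical interface).
    """
    # Phase 1: begin migration -- copy a snapshot while the source
    # keeps serving transactions normally.
    destination.apply(source.snapshot())

    # Phase 2: iterative copying -- repeatedly ship the pages dirtied
    # since the previous round, until the remaining delta is small.
    while True:
        dirty = source.dirty_pages()
        destination.apply(source.read_pages(dirty))
        if len(dirty) <= dirty_threshold:
            break

    # Phase 3: atomic handover -- briefly block writes, copy the final
    # delta, then atomically redirect clients to the destination.
    source.block_writes()
    destination.apply(source.read_pages(source.dirty_pages()))
    source.redirect(destination)
    destination.unblock_writes()
```

The point of the iterative phase is that the write-blocked window in phase 3 shrinks to the last, small delta, which is how aborted transactions are avoided.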
Example: Oracle Berkeley DB (storage engine)
• High-performance embeddable database providing SQL, Java object, and key-value storage
  – Relational storage: supports SQL.
  – Synchronization: extends existing applications to mobile devices with a fast, robust data store on the device.
  – Replication: provides a single-master, multi-replica, highly available database configuration.
Example: Amazon DynamoDB
• Fully managed NoSQL database service providing fast, predictable performance with seamless scalability
  – Provisioned throughput
    • Allocates dedicated resources to a table to meet its performance requirements, and automatically partitions data over enough servers to meet the requested capacity.
  – Consistency model
    • The eventual-consistency option maximizes read throughput.
  – Data model
    • Attributes, items, and tables
Example: HBase
• Non-relational, distributed database running on top of HDFS, providing Bigtable-like capabilities for Hadoop
  – Strongly consistent reads/writes
  – Automatic sharding
  – Hadoop/HDFS integration
  – Block cache and Bloom filters
  – Operational management tools
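The Bloom filters mentioned above let a store skip files that cannot contain a requested row key: the filter never reports a stored key as absent, and only occasionally reports an absent key as present. A minimal sketch of the data structure (bit-array size and hash scheme chosen arbitrarily for illustration):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        # Derive `num_hashes` independent bit positions for this key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True means "possibly present".
        return all(self.bits >> pos & 1 for pos in self._positions(key))
```

A read path consults the filter first and touches disk only on "possibly present", which is why the slide lists Bloom filters next to the block cache as a read optimization.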
Example: CouchDB
• Scalable, fault-tolerant, schema-free document-oriented database
  – Document storage
  – Distributed architecture with replication
  – Map/reduce views and indexes
  – ACID semantics
  – Eventual consistency
  – Built for offline use
Example: Riak
• A distributed database architected for availability, fault tolerance, operational simplicity, and scalability
  – Operates in highly distributed environments
  – Scales simply and intelligently
  – Masterless
  – Highly fault-tolerant
  – Very stable
Example: MongoDB
• Document-oriented NoSQL database system
  – Scales horizontally without compromising functionality
  – Document-oriented storage
  – Full index support
  – Master-slave replication
  – Rich, document-based queries
Comparison with RDBMS
• Transactions
  – Web apps can (usually) do without transactions / strong consistency / integrity
  – Bigtable does not support transactions across multiple rows
    • It supports single-row transactions
    • It provides an interface for batching writes across row keys at the client
• Scalability
  – Parallel DBMSs vs. MapReduce-based systems
THANK YOU!
Backup
Example of the CAP theorem
• When you have a lot of data that needs to be highly available, you usually need to partition it across machines and also replicate it for fault tolerance
• This means that when writing a record, all replicas must be updated too
• Now you need to choose between:
  – Locking all relevant replicas during the update => less available
  – Not locking the replicas => less consistent
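The lock-vs-no-lock choice above is often generalized into quorum replication: with N replicas, a write waits for W acknowledgements and a read consults R replicas. Choosing R + W > N forces every read set to overlap the latest write set (more consistent, but operations fail when too few replicas respond); smaller quorums trade the other way. A toy simulation (all names hypothetical):

```python
import random

class Replica:
    """One copy of a record, versioned so reads can pick the newest value."""
    def __init__(self):
        self.value, self.version = None, 0

def quorum_write(replicas, w, value, version):
    # A write succeeds once `w` replicas acknowledge; here we pick any
    # `w` reachable replicas, since a partition may leave others behind.
    for r in random.sample(replicas, w):
        r.value, r.version = value, version

def quorum_read(replicas, r):
    # Read any `r` replicas and trust the highest version seen.
    # With r + w > len(replicas), this set must overlap the last write.
    sampled = random.sample(replicas, r)
    return max(sampled, key=lambda x: x.version).value
```

With N=5, W=3, R=3 every read returns the latest write; with R=1 reads are faster and more available but may return stale values, which is exactly the trade-off the slide describes.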