A Survey of Advanced Non-relational Database Systems: Approaches and Applications
Speaker: LIN Qian
http://www.comp.nus.edu.sg/~linqian
Outline
• Introduction
• Non-relational database systems
  – Requirements
  – Concepts
  – Approaches
  – Optimization
  – Examples
• Comparison between RDBMSs and non-relational database systems
Problem
• The Web introduces a new scale for applications, in terms of:
  – Concurrent users (millions of requests/second)
  – Data (petabytes generated daily)
  – Processing (all this data needs processing)
  – Exponential growth (surging, unpredictable demand)
• Shortcomings of existing RDBMSs
  – Oracle, MS SQL Server, Sybase, MySQL, PostgreSQL, …
  – Trouble when dealing with very heavy traffic
  – Even with their high-end clustering solutions
• Why?
  – Applications using a normalized database schema require joins, which do not perform well over large amounts of data and/or many nodes
  – Existing RDBMS clustering solutions require scale-up, which is limited and not truly scalable under exponential growth (e.g., 1000+ nodes)
  – Machines have upper limits on capacity
• Why not just use sharding?
  – Very complex and application-specific
    • Increased complexity of SQL
    • Single point of failure
    • More complex failover servers
    • More complex backups
    • Added operational complexity
  – Very problematic when adding/removing nodes
  – Basically, you end up denormalizing everything and losing all the benefits of relational databases

Sharding: splitting one or more tables by row across potentially multiple instances of the schema and database servers.
Who faced this problem?
• Web applications dealing with high traffic and massive data
  – Web service providers
    • Google, Yahoo!, Amazon, Facebook, Twitter, LinkedIn, …
  – Scientific data analysis
    • Weather, oceans, tides, geothermal energy, …
  – Complex information processing
    • Finance, stock markets, telecommunications, …
Solution
• A new kind of DBMS, capable of handling web scale
  – Possibly sacrificing some features
• CAP theorem*: you can guarantee at most 2 of these 3
  – Consistency: the system is in a consistent state after an operation
    • All nodes see the same data at the same time
    • Strong consistency (ACID) vs. eventual consistency (BASE)
  – Availability: the system is "always on", with no downtime
    • Node-failure tolerance: all clients can find some available replica
    • Software/hardware upgrade tolerance
  – Partition tolerance
    • The system continues to operate (read/write) despite arbitrary message loss or failure of part of the system

* Eric A. Brewer, "Towards Robust Distributed Systems", Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC), 2000.
Non-relational database systems
• Various solutions & products
  – Bigtable, LevelDB (developed at Google)
  – HBase (developed at Yahoo!)
  – Dynamo (developed at Amazon)
  – Cassandra (developed at Facebook)
  – Voldemort (developed at LinkedIn)
  – Riak, Redis, CouchDB, MongoDB, Berkeley DB, …
• Research projects
  – NoDB, Walnut, LogBase, Albatross, Citrusleaf, HadoopDB
  – PIQL, RAMCloud
Benefits
• Massively scalable
• Extremely fast
• Highly available, decentralized, and fault-tolerant
  – No single point of failure
• Transparent sharding (consistent hashing)
• Elasticity
• Parallel processing
• Dynamic schema
• Automatic conflict resolution
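The "transparent sharding (consistent hashing)" bullet above can be made concrete with a small sketch: both node identifiers and keys are hashed onto the same ring, and a key belongs to the first node clockwise from its position. The names below (`ConsistentHashRing`, the vnode count) are illustrative, not taken from any particular product.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Map an arbitrary string to a point on the hash ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (ring position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node: str):
        # Each physical node gets `vnodes` points, smoothing the key distribution.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str):
        self._ring = [(p, n) for (p, n) in self._ring if n != node]

    def lookup(self, key: str) -> str:
        # First virtual node clockwise from the key's position.
        point = _hash(key)
        idx = bisect.bisect(self._ring, (point, ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that makes this "transparent": when a node joins or leaves, only the keys whose clockwise successor changed are remapped, roughly 1/N of them, rather than nearly all keys as with `hash(key) % N`.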
Cost
• Sacrifices consistency (ACID) in certain circumstances, though many applications can tolerate that
• Non-standard new API model
• Non-standard new schema model
• New knowledge required to tune/optimize
• Less mature
Data/API/Schema model
• Data model: key-value store
  – (row:string, column:string, time:int64) → string
  – The value is an opaque serialized object
• API model
  – Get(key)
  – Put(key, value)
  – Delete(key)
  – Execute(operation, key_list)
• Schema model
  – None
  – Effectively a sparse table
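The Get/Put/Delete/Execute surface above is small enough to sketch in full. This is a toy in-memory stand-in, not any particular product's API (class and method names are hypothetical); it only illustrates that values stay opaque to the store and that `Execute` ships an operation to the data rather than the data to the client.

```python
class KeyValueStore:
    """In-memory sketch of the minimal non-relational API surface."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Values are opaque bytes/objects; callers handle (de)serialization.
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)

    def execute(self, operation, key_list):
        # Apply a caller-supplied operation to each key's current value.
        return [operation(key, self._data.get(key)) for key in key_list]
```

Note what is missing compared with SQL: no joins, no secondary predicates, no schema; anything beyond single-key access has to be expressed through `execute` or in the application.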
Data processing
• MapReduce*
  – An API exposed by non-relational databases to process data
  – A functional programming pattern for parallelizing work
  – Brings the workers to the data
    • An excellent fit for non-relational databases
  – Reduces the programming effort to 2 simple functions: map & reduce

* Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), 2004.
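The "2 simple functions" can be illustrated with the canonical word-count example. This is a sequential toy that assumes nothing beyond the map/shuffle/reduce structure described in the paper; a real system runs the map calls in parallel on the nodes that hold the data.

```python
from collections import defaultdict
from itertools import chain

def map_reduce(inputs, mapper, reducer):
    """Sequential sketch of the MapReduce pattern."""
    # Map phase: each input record yields (key, value) pairs.
    pairs = chain.from_iterable(mapper(record) for record in inputs)
    # Shuffle phase: group intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce phase: combine each key's values into a final result.
    return {key: reducer(key, values) for key, values in groups.items()}

# Word count expressed as the two user-supplied functions.
def mapper(line):
    return [(word, 1) for word in line.split()]

def reducer(word, counts):
    return sum(counts)
```

The user writes only `mapper` and `reducer`; partitioning, scheduling, and fault tolerance belong to the framework, which is why the pattern composes so well with a sharded key-value store.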
Optimization: Distributed indexing
• Exploit the characteristics of Cayley graphs to provide the scalability needed to support multiple distributed indexes of different types.
• Define a methodology for mapping various types of data and P2P overlays onto a generalized Cayley graph structure.
• Propose self-tuning strategies to optimize the performance of the indexes defined over the generic Cayley overlay.
Optimization: Data migration
• Albatross is a technique for live migration in a multi-tenant database that can migrate a live tenant database with no aborted transactions.
  – Phase 1: Begin migration.
  – Phase 2: Iterative copying.
  – Phase 3: Atomic handover.
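The three phases might be sketched as follows. This is a loose illustration of the copy-while-serving idea, not the actual Albatross protocol (which also migrates transaction state and caches); the `source`/`destination` interface (`snapshot`, `dirty_pages`, `block_writes`, etc.) is entirely hypothetical.

```python
def migrate(source, destination, dirty_threshold=10):
    """Illustrative live migration: snapshot, iterative copy, atomic handover.

    `source` and `destination` are assumed to expose snapshot(),
    dirty_pages(), read_pages(), apply(), block_writes(),
    unblock_writes(), and redirect() (hypothetical interface).
    """
    # Phase 1: begin migration -- copy a snapshot while the source
    # keeps serving transactions normally.
    destination.apply(source.snapshot())

    # Phase 2: iterative copying -- repeatedly ship the pages dirtied
    # since the previous round, until the remaining delta is small.
    while True:
        dirty = source.dirty_pages()
        destination.apply(source.read_pages(dirty))
        if len(dirty) <= dirty_threshold:
            break

    # Phase 3: atomic handover -- briefly block writes, copy the final
    # delta, then atomically redirect clients to the destination.
    source.block_writes()
    destination.apply(source.read_pages(source.dirty_pages()))
    source.redirect(destination)
    destination.unblock_writes()
```

The point of the iterative phase is that the write-blocked window in phase 3 shrinks to the last, small delta, which is how aborted transactions are avoided.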
Example: Oracle Berkeley DB (storage engine)
• High-performance embeddable database providing SQL, Java object, and key-value storage
  – Relational storage: supports SQL.
  – Synchronization: extends existing applications to mobile devices with a fast, robust data store on the device.
  – Replication: provides a single-master, multi-replica, highly available database configuration.
Example: Amazon DynamoDB
• Fully managed NoSQL database service providing fast, predictable performance with seamless scalability
  – Provisioned throughput
    • Allocates dedicated resources to a table to meet its performance requirements, and automatically partitions data over enough servers to meet the requested capacity.
  – Consistency model
    • The eventual-consistency option maximizes read throughput.
  – Data model
    • Attributes, items, and tables
Example: HBase
• Non-relational, distributed database running on top of HDFS, providing Bigtable-like capabilities for Hadoop
  – Strongly consistent reads/writes
  – Automatic sharding
  – Hadoop/HDFS integration
  – Block cache and Bloom filters
  – Operational management tools
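The Bloom filters mentioned above let a store skip files that cannot contain a requested row key: the filter never reports a stored key as absent, and only occasionally reports an absent key as present. A minimal sketch of the data structure (bit-array size and hash scheme chosen arbitrarily for illustration):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        # Derive `num_hashes` independent bit positions for this key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key: str):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means definitely absent; True means "possibly present".
        return all(self.bits >> pos & 1 for pos in self._positions(key))
```

A read path consults the filter first and touches disk only on "possibly present", which is why the slide lists Bloom filters next to the block cache as a read optimization.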
Example: CouchDB
• Scalable, fault-tolerant, schema-free document-oriented database
  – Document storage
  – Distributed architecture with replication
  – Map/reduce views and indexes
  – ACID semantics
  – Eventual consistency
  – Built for offline use
Example: Riak
• A distributed database architected for availability, fault tolerance, operational simplicity, and scalability
  – Operates in highly distributed environments
  – Scales simply and intelligently
  – Masterless
  – Highly fault-tolerant
  – Very stable
Example: MongoDB
• Document-oriented NoSQL database system
  – Scales horizontally without compromising functionality
  – Document-oriented storage
  – Full index support
  – Master-slave replication
  – Rich, document-based queries
Comparison with RDBMS
• Transactions
  – Web apps can (usually) do without transactions / strong consistency / integrity
  – Bigtable does not support transactions across multiple rows
    • It supports single-row transactions
    • It provides an interface for batching writes across row keys at the client
• Scalability
  – Parallel DBMSs vs. MapReduce-based systems
THANK YOU!
Backup
Example of the CAP theorem
• When you have a lot of data that needs to be highly available, you usually need to partition it across machines and also replicate it for fault tolerance
• This means that when writing a record, all replicas must be updated too
• Now you need to choose between:
  – Locking all relevant replicas during the update => less available
  – Not locking the replicas => less consistent
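The lock-vs-no-lock choice above is often generalized into quorum replication: with N replicas, a write waits for W acknowledgements and a read consults R replicas. Choosing R + W > N forces every read set to overlap the latest write set (more consistent, but operations fail when too few replicas respond); smaller quorums trade the other way. A toy simulation (all names hypothetical):

```python
import random

class Replica:
    """One copy of a record, versioned so reads can pick the newest value."""
    def __init__(self):
        self.value, self.version = None, 0

def quorum_write(replicas, w, value, version):
    # A write succeeds once `w` replicas acknowledge; here we pick any
    # `w` reachable replicas, since a partition may leave others behind.
    for r in random.sample(replicas, w):
        r.value, r.version = value, version

def quorum_read(replicas, r):
    # Read any `r` replicas and trust the highest version seen.
    # With r + w > len(replicas), this set must overlap the last write.
    sampled = random.sample(replicas, r)
    return max(sampled, key=lambda x: x.version).value
```

With N=5, W=3, R=3 every read returns the latest write; with R=1 reads are faster and more available but may return stale values, which is exactly the trade-off the slide describes.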