Beyond Aurora. Scale-out SQL databases for AWS


Upload: clustrix

Post on 20-Jan-2017


TRANSCRIPT

Page 1: Beyond Aurora. Scale-out SQL databases for AWS

© 2015 CLUSTRIX

The First Scale-out SQL Database Engineered for Today’s Cloud

Beyond Aurora. Scale-out SQL databases for AWS

Page 2: Beyond Aurora. Scale-out SQL databases for AWS

05/01/2023

Magnus Data

Scale-Out RDBMS

Page 3: Beyond Aurora. Scale-out SQL databases for AWS


Agenda

- Market landscape of the DB market
- Options to scale a DB
- Scale-out architecture
- Comparison of solutions for high-transaction relational databases

Page 4: Beyond Aurora. Scale-out SQL databases for AWS


Generalized and Specialized

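[Diagram: database categories placed along a workload spectrum.]

Workloads: High Concurrency / Write Heavy / Real-Time Analytics, Historical Analytics, Exploratory

Workload groups: Transactional, Analytics

Database categories:
- Traditional Databases
- NoSQL
- DW/Analytical DBMS
- Operational System/OLTP (NewSQL)
- Hadoop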

Page 5: Beyond Aurora. Scale-out SQL databases for AWS


Scale-Up vs. Scale-Out

[Chart: Latency (y-axis, labeled up to "High") versus Transactions Per Second (x-axis, labeled up to "High"), with one curve for Scale-Up Databases (like Aurora and MySQL) and one for Scale-Out databases.]

Page 6: Beyond Aurora. Scale-out SQL databases for AWS


RDBMS Scaling Techniques

- Scale-Up
- Master/Slave
- Master/Master
- MySQL Clustering Technologies
- Sharding
- Scale-Out

Page 7: Beyond Aurora. Scale-out SQL databases for AWS


Options to Scale DBMS

DBMS

Scale Out
- e.g., MongoDB
  - No transactions
  - May have weak consistency (CAP)
  - Application involves DB coding
- e.g., ClustrixDB
  - ACID
  - Proven scalability (reads and writes)
  - Shared nothing

Scale Up
- e.g., Aurora
  - Reads scale
  - Limited scalability on writes
  - Not shared-nothing scale-out

Page 8: Beyond Aurora. Scale-out SQL databases for AWS

Scaling-Up

Keep increasing the size of the (single) database server.

Pros
- Simple, no application changes needed

Cons
- Expensive. At some point, you’re paying 5x for 2x the performance
- ‘Exotic’ hardware (128 cores and above) becomes price-prohibitive
- Eventually you ‘hit the wall’, and you literally cannot scale up anymore

Page 9: Beyond Aurora. Scale-out SQL databases for AWS

Scaling Reads: Master/Slave

Add one or more ‘Slave’ read servers to your ‘Master’ database server.

Pros
- Reasonably simple to implement
- Read/write fan-out can be done at the proxy level

Cons
- Only adds read performance
- Data consistency issues can occur, especially if the application isn’t coded to ensure reads from the slave are consistent with reads from the master
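The read/write fan-out and the stale-read risk described above can be sketched in a few lines. This is a toy model, not a real proxy: `Server`, `ReadWriteProxy`, and the hand-triggered `replicate()` step are hypothetical names introduced for illustration, and real replication runs continuously in the background rather than on demand.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Server:
    """Stand-in for one database server; `rows` plays the role of its data."""
    name: str
    rows: dict = field(default_factory=dict)


class ReadWriteProxy:
    """Sends every write to the master and spreads reads across the slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves
        self._next_slave = cycle(slaves)  # simple round-robin over read servers

    def write(self, key, value):
        self.master.rows[key] = value

    def read(self, key):
        return next(self._next_slave).rows.get(key)

    def replicate(self):
        """Hypothetical replication step, triggered by hand in this sketch."""
        for slave in self.slaves:
            slave.rows = dict(self.master.rows)


master = Server("master")
slaves = [Server("slave-1"), Server("slave-2")]
proxy = ReadWriteProxy(master, slaves)

proxy.write("order:1", "paid")
stale = proxy.read("order:1")  # replication has not run yet, so the slave has no row
proxy.replicate()
fresh = proxy.read("order:1")  # the slaves have caught up
print(stale, fresh)
```

The first read returns nothing even though the write succeeded, which is the consistency issue the slide warns about: unless the application routes such reads to the master or waits for replication, it can observe data older than its own writes.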

Page 10: Beyond Aurora. Scale-out SQL databases for AWS

Scaling Writes: Master/Master

Add one or more additional ‘Masters’ to your ‘Master’ database server.

Pros
- Adds write scaling without needing to shard

Cons
- Adds write scaling at the cost of read-slaves
- Adding read-slaves would add even more latency
- Application changes are required to ensure data consistency / conflict resolution
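To make the conflict-resolution burden concrete, here is a minimal sketch of one common policy, last-write-wins, applied when two masters accept a write to the same key. The `Write` record, its application-supplied `timestamp`, and the tie-break on `origin` are all assumptions made for this example; the slides do not prescribe a particular policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: int  # hypothetical logical clock supplied by the application
    origin: str     # name of the master that accepted the write


def resolve(a: Write, b: Write) -> Write:
    """Last-write-wins; ties are broken by origin name so both masters agree."""
    return max(a, b, key=lambda w: (w.timestamp, w.origin))


w1 = Write("cart:7", "2 items", timestamp=100, origin="master-a")
w2 = Write("cart:7", "3 items", timestamp=101, origin="master-b")

winner = resolve(w1, w2)
print(winner.value)
```

Note that the losing write is discarded silently. Whether that is acceptable depends on the application, which is why this scheme pushes design work onto application developers.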

Page 11: Beyond Aurora. Scale-out SQL databases for AWS

Scaling Reads & Writes: Sharding

Partitioning tables across separate database servers.

[Diagram: four shards, each holding one key range: SHARD01 (A - K), SHARD02 (L - O), SHARD03 (P - S), SHARD04 (T - Z).]

Pros
- Adds both write and read scaling

Cons
- Loses the ability of an RDBMS to manage transactionality, referential integrity and ACID
- ACID compliance & transactionality must be managed at the application level
- Consistent backups across all the shards are very hard to manage
- Reads and writes can be skewed / unbalanced
- Application changes can be significant
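The alphabetical ranges in the diagram suggest range-based sharding on the first letter of a key. The sketch below shows what that routing could look like inside the application and how easily the load becomes skewed. The shard names, the `shard_for` helper, and the sample customer names are illustrative assumptions, not part of the original deck.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bound of each shard's letter range, in order: A-K, L-O, P-S, T-Z.
SHARD_UPPER_BOUNDS = ["K", "O", "S", "Z"]
SHARD_NAMES = ["SHARD01", "SHARD02", "SHARD03", "SHARD04"]


def shard_for(key: str) -> str:
    """Route a key to a shard by the first letter of the key."""
    first = key[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"cannot route key {key!r}: it must start with a letter A-Z")
    return SHARD_NAMES[bisect_left(SHARD_UPPER_BOUNDS, first)]


# Made-up customer names, chosen to show an unbalanced distribution.
customers = ["Adams", "Baker", "Clark", "Davis", "Evans", "Lopez", "Perez", "Young"]
load = Counter(shard_for(name) for name in customers)
print(load)
```

With this sample, five of the eight customers land on the first shard, which illustrates the "skewed / unbalanced" drawback: the split points are fixed in application code, so rebalancing means changing the code and moving data.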

Page 12: Beyond Aurora. Scale-out SQL databases for AWS

Scaling Reads & Writes: MySQL Cluster

Provides shared-nothing clustering and auto-sharding for MySQL. (Designed for Telco deployments: minimal cross-node transactions, HA emphasis.)

Pros
- Distributed, multi-master model
- Provides high availability and high throughput

Cons
- Only supports read-committed isolation
- Long-running transactions can block a node restart
- Statement-based replication (SBR) is not supported
- Range scans are expensive and perform worse than in MySQL
- Unclear how it scales with many nodes

Page 13: Beyond Aurora. Scale-out SQL databases for AWS

Application Workload Partitioning

Partition the entire application + RDBMS stack across several “pods”.

[Diagram: six separate APP stacks, one per pod.]

Pros
- Adds both write and read scaling
- Flexible: can keep scaling with the addition of pods

Cons
- No data consistency across pods (only suited for cases where it is not needed)
- High overhead in DBMS maintenance and upgrade
- Queries / reports across all pods can be very complex
- Complex environment to set up and support

Page 14: Beyond Aurora. Scale-out SQL databases for AWS

DBMS Capacity, Elasticity and Resiliency

DBMS Scaling    | Capacity                    | Resiliency                 | Elasticity | Application Impact
Scale-up        | Many cores – very expensive | Single Point Failure       | No         | None
Master – Slave  | Reads Only                  | Fail-over                  | No         | Yes – for read scale
Master – Master | Read / Write                | Yes                        | No         | High – update conflict
MySQL Cluster   | Read / Write                | Yes                        | No         | None (or minor)
Sharding        | Unbalanced Read/Writes      | Multiple points of failure | No         | Very High
Scale-Out       | Read / Write                | Yes                        | Yes        | None

Page 15: Beyond Aurora. Scale-out SQL databases for AWS


DBMS Architecture-Scale out

Shared Nothing Architecture

[Diagram: three identical nodes, each containing Compiler, Map, Engine and Data.]

Each Node Contains:
- Query Parser/Planner: distributes partial query fragments to the nodes
- Data Map: all nodes hold metadata about data across the cluster
- Database Engine: all nodes can perform all database operations (no leader, aggregator, leaf, data-only, etc. nodes)
- Data: Table Distributed: all tables are auto-redistributed

Page 16: Beyond Aurora. Scale-out SQL databases for AWS

Intelligent Data Distribution

[Diagram: database tables with billions of rows are split into pieces labeled S1 to S5, and each piece appears twice across the ClustrixDB nodes.]

- Tables are auto-distributed across nodes
- Tunable amount of redundancy of data across nodes
- Tables are auto-distributed, auto-protected
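The idea of spreading pieces of a table across nodes with a tunable number of copies can be sketched as follows. This is a generic illustration under stated assumptions (CRC32 hashing of the row key and round-robin placement of copies), not ClustrixDB's actual distribution algorithm; `slice_for`, `place_slices`, and `surviving_slices` are hypothetical helper names.

```python
import zlib


def slice_for(key: str, num_slices: int) -> int:
    """Map a row key to a slice number with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_slices


def place_slices(num_slices: int, nodes: list, replicas: int = 2) -> dict:
    """Put each slice on `replicas` distinct nodes, rotating through the node list."""
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    return {
        s: [nodes[(s + r) % len(nodes)] for r in range(replicas)]
        for s in range(num_slices)
    }


def surviving_slices(placement: dict, failed_node: str) -> set:
    """Slices that still have at least one copy after `failed_node` is lost."""
    return {
        s for s, owners in placement.items()
        if any(node != failed_node for node in owners)
    }


nodes = ["node-1", "node-2", "node-3"]
protected = place_slices(5, nodes, replicas=2)
unprotected = place_slices(5, nodes, replicas=1)

print(surviving_slices(protected, "node-1"))
print(surviving_slices(unprotected, "node-1"))
```

With two copies per slice, losing any single node leaves every slice readable; with one copy, the slices that lived on the failed node are gone. The `replicas` parameter plays the role of the "tunable amount of redundancy" on the slide.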

Page 17: Beyond Aurora. Scale-out SQL databases for AWS

Distributed Query Processing

[Diagram: a query passes through a load balancer to ClustrixDB, where transactions (TRX) run on three nodes.]

- Queries are fielded by any peer node
  - Routed to the node holding the data
- Complex queries are split into steps and processed in parallel
  - Automatically distributed for optimized performance
- All nodes handle writes and reads
  - Result is aggregated and returned to the user
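The pattern of "any node accepts the query, fragments run in parallel, results are aggregated" is often called scatter-gather. The sketch below models it with threads inside one process. The `Node` class, the `orders`-style rows, and the `total_for` query are assumptions made for illustration; they stand in for a real planner and network layer and do not reflect ClustrixDB internals.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """One peer in a toy cluster; `rows` is its local part of the table."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows  # list of (customer, amount) pairs stored on this node
        self.peers = []   # every other node in the cluster

    def partial_sum(self, customer):
        """Query fragment: sum the matching rows held locally."""
        return sum(amount for c, amount in self.rows if c == customer)

    def total_for(self, customer):
        """Coordinate the query: fan out fragments, then aggregate the partials."""
        members = [self] + self.peers
        with ThreadPoolExecutor(max_workers=len(members)) as pool:
            partials = list(pool.map(lambda n: n.partial_sum(customer), members))
        return sum(partials)


cluster = [
    Node("node-1", [("alice", 10), ("bob", 5)]),
    Node("node-2", [("alice", 7)]),
    Node("node-3", [("bob", 1), ("alice", 3)]),
]
for node in cluster:
    node.peers = [other for other in cluster if other is not node]

# The same query gives the same answer no matter which peer fields it.
totals = [node.total_for("alice") for node in cluster]
print(totals)
```

Because every node can act as coordinator, a load balancer in front of the cluster can hand a query to any peer without the application knowing where the data lives.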

Page 18: Beyond Aurora. Scale-out SQL databases for AWS

DBMS Capacity, Elasticity and Resiliency

Features                          | ClustrixDB                        | Aurora
Write Scalability                 | Writes scale by adding nodes      | Cannot add write nodes
High Concurrency                  | Latency low with high concurrency | Latency climbs quickly with high concurrency
ACID                              | Yes                               | Yes
On-Demand Write Scale             | Yes                               | No
Automatically Distributed Queries | Yes: no application changes       | No: read/write fan-out needed. Write contention on master
Cloud/On Premises                 | Yes                               | No, only AWS Cloud
Shared Nothing Storage            | Yes: parallel data access         | No: contention at high write concurrency

Page 19: Beyond Aurora. Scale-out SQL databases for AWS


Benchmark Results

[Chart: Sysbench OLTP 90:10 Mix. Average Latency (ms), axis 0 to 30, versus Throughput (tps), axis 0 to 50,000. Series: Clustrix 4 Node, Aurora, MySQL RDS.]

Page 20: Beyond Aurora. Scale-out SQL databases for AWS


Scalability Test

[Chart: Sysbench OLTP 90:10 Mix. Average Latency (ms), axis 0 to 30, versus Throughput (tps), axis 0 to 50,000.]

Page 21: Beyond Aurora. Scale-out SQL databases for AWS


Thank you.

Q&A