In-memory database performance on AWS m4 instances

The Database for Real-Time & Historical Big Data Analytics MemSQL Workshop Carlos Bueno (15 Jun 2015)

Upload: memsql

Post on 08-Aug-2015


The Database for Real-Time & Historical Big Data Analytics

MemSQL Workshop

Carlos Bueno (15 Jun 2015)

2

Workshop Agenda

▪ The Company
▪ The Landscape
▪ The Software
▪ Hands-on
▪ Scaling up
▪ Q & A

3

▪ Experienced leadership from Facebook, SQL Server, Oracle, Fusion-io

▪ In-Memory, distributed, relational database

▪ Solving the Enterprise Architecture Gap

▪ Horizontal scale-out with modern database innovation

▪ $50 million in funding

MemSQL the company

Going Real-Time is the Next Phase for Big Data

More Sensors

More Interconnectivity

More User Demand

…and companies are at risk of being left behind

4

Current Data Management Challenges

ETL

Batch Processing

Big Iron Appliances

5

MemSQL the software

▪ Distributed and Parallel
▪ Shared-Nothing, Lock-Free
▪ Data in memory or SSD
▪ SQL all the way down

6

MemSQL Engine: “memsqld”

▪ Basic scaling unit of a cluster
▪ A full, independent RDBMS
▪ 50,000 inserts / sec on wide table
▪ ~1M inserts / sec on skinny table
▪ Millions of primary-key lookups / sec

MemSQL

7

MemSQL Engine: Aggregators Aggregate

Agg 1 Agg 2

Leaf 1 Leaf 2 Leaf 3 Leaf 4

8

MemSQL Engine: Leaves Hold Data

Agg 1 Agg 2

Leaf 1 Leaf 2 Leaf 3 Leaf 4

9

MemSQL Engine: Sharding and Joins

Agg 1 Agg 2

Leaf 1 Leaf 2 Leaf 3 Leaf 4

select * from lineitem L, orders O
where L.orderkey = O.orderkey...

leaf1> using memsql_demo_0
select * from lineitem L, orders O
where L.orderkey = O.orderkey...

leaf2> using memsql_demo_1
select * from lineitem L, orders O
where L.orderkey = O.orderkey...
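The per-leaf queries above only work without data movement if matching rows live on the same partition. Here is a minimal Python sketch of that idea, assuming both tables are sharded on orderkey. The function names, the partition count, the modulo placement rule, and the sample rows are illustrative assumptions, not MemSQL internals.

```python
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: one partition per leaf in the diagram


def partition_for(orderkey):
    """Map a shard key to a partition (plain modulo stands in for a real hash)."""
    return orderkey % NUM_PARTITIONS


def shard(rows, key):
    """Group rows by the partition their shard key maps to."""
    parts = defaultdict(list)
    for row in rows:
        parts[partition_for(row[key])].append(row)
    return parts


def local_join(lineitems, orders):
    """Leaf role: join only the rows stored on this partition."""
    by_key = {o["orderkey"]: o for o in orders}
    return [(l, by_key[l["orderkey"]]) for l in lineitems if l["orderkey"] in by_key]


def distributed_join(lineitems, orders):
    """Aggregator role: send the same join to every partition, then merge results."""
    l_parts = shard(lineitems, "orderkey")
    o_parts = shard(orders, "orderkey")
    result = []
    for p in range(NUM_PARTITIONS):
        result.extend(local_join(l_parts[p], o_parts[p]))
    return result


orders = [{"orderkey": k, "status": "O"} for k in range(1, 6)]
lineitems = [{"orderkey": k, "qty": q} for k, q in [(1, 5), (1, 2), (3, 9), (5, 1)]]
joined = distributed_join(lineitems, orders)
```

Because both tables use the same shard key, no leaf ever needs a row held by another leaf, and the merged result equals a join over the whole data set.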

10

MemSQL Engine: Compiled Queries

[Flowchart: Parse → In Cache? → Execute, with Codegen on a cache miss]

select * from foo where id=1234
and name like '%jingleheimer%';

SELECT * FROM foo WHERE id = @
AND name LIKE ^
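The rewrite from the literal query to its parameterized shape can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the regexes, the `parameterize` and `execute` helpers, and the string standing in for generated code are all hypothetical, and the sketch does not normalize keyword case the way the slide does. Only the `@` and `^` placeholders come from the slide.

```python
import re

# Assumption: '@' replaces numeric literals and '^' replaces string literals.
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")

plan_cache = {}  # query shape -> stand-in for a compiled plan


def parameterize(sql):
    """Return (shape, params): the query with literals replaced by placeholders."""
    params = []

    def take(placeholder):
        def repl(match):
            params.append(match.group(0))
            return placeholder
        return repl

    # Strings go first so digits inside string literals are left alone.
    shape = _STRING.sub(take("^"), sql)
    shape = _NUMBER.sub(take("@"), shape)
    return " ".join(shape.split()), params


def execute(sql):
    """Look up the query shape; 'compile' only on a cache miss."""
    shape, params = parameterize(sql)
    hit = shape in plan_cache
    if not hit:
        plan_cache[shape] = "plan #%d" % (len(plan_cache) + 1)  # codegen stand-in
    return plan_cache[shape], params, hit


first = execute("select * from foo where id=1234 and name like '%jingleheimer%';")
second = execute("select * from foo where id=99 and name like '%schmidt%';")
```

Both calls map to the same shape, so the second one reuses the cached plan and only the extracted parameters differ.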

11

Durability: Transactions (MVCC)

Every write creates a new version of row
Old versions get garbage-collected
Reads are never blocked
Row-level locking for writes
Allows online ALTER TABLE!
Multi-statement transactions

[Diagram: row versions v0 through v4, with readers, a writer, and a waiting writer]
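The version-chain idea on this slide can be illustrated with a small Python sketch. It assumes integer commit timestamps and a single row; the `VersionedRow` class and its methods are hypothetical names for illustration, and the row-level write lock is left out.

```python
class VersionedRow:
    """Toy MVCC row: writes append versions, reads pick one by snapshot."""

    def __init__(self, value):
        # (commit timestamp, value) pairs, oldest first
        self.versions = [(0, value)]

    def read(self, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        for ts, value in reversed(self.versions):
            if ts <= snapshot_ts:
                return value
        raise LookupError("no version visible at this snapshot")

    def write(self, commit_ts, value):
        """A write never overwrites in place; it adds a new version."""
        self.versions.append((commit_ts, value))

    def gc(self, oldest_active_snapshot):
        """Drop versions that no active reader can still see."""
        keep_from = 0
        for i, (ts, _) in enumerate(self.versions):
            if ts <= oldest_active_snapshot:
                keep_from = i
        self.versions = self.versions[keep_from:]


row = VersionedRow("v0")
row.write(1, "v1")
row.write(2, "v2")
old_reader = row.read(1)   # a reader that started before the second write
new_reader = row.read(2)
row.gc(oldest_active_snapshot=1)
```

A reader holding an older snapshot keeps seeing its version while a writer adds a newer one, which is why reads are never blocked; garbage collection only removes versions older than every active snapshot.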

12

Durability: Logging and Snapshots

Every write saved to transaction log on disk:
/var/lib/memsql/data/logs
Periodic compaction into a snapshot file:
/var/lib/memsql/data/snapshots

On restart data is loaded into RAM
Latest two snapshots are kept by default
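A minimal Python sketch of the log-plus-snapshot pattern described here, assuming a simple key-value store with JSON files. The `DurableStore` class, the file names, and the file formats are assumptions for illustration and are unrelated to MemSQL's actual on-disk layout; the sketch also keeps only one snapshot rather than the latest two.

```python
import json
import os
import tempfile


class DurableStore:
    """Toy in-memory store with a write-ahead log and periodic snapshots."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "log.jsonl")        # hypothetical name
        self.snap_path = os.path.join(directory, "snapshot.json")   # hypothetical name
        self.data = {}
        self._recover()

    def _recover(self):
        """On restart: load the snapshot, then replay the log on top of it."""
        if os.path.exists(self.snap_path):
            with open(self.snap_path) as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        """Append to the log and sync it before applying the write in memory."""
        with open(self.log_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[key] = value

    def snapshot(self):
        """Compact current state into a snapshot file, then truncate the log."""
        tmp = self.snap_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snap_path)  # swap the new snapshot into place
        open(self.log_path, "w").close()


with tempfile.TemporaryDirectory() as d:
    store = DurableStore(d)
    store.put("a", 1)
    store.snapshot()
    store.put("b", 2)
    recovered = DurableStore(d).data  # simulate a restart
```

Recovery reads the snapshot first and then replays only the writes logged after it, so restart time depends on the snapshot size plus the length of the remaining log.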

13

Durability: High Availability

Leaves are paired up
Partitions replicated async
Automatically fails over
Uses 2X space

Leaf 1 Leaf 2 Leaf 4 Leaf 3

Agg 1 Agg 2
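The pairing and failover behavior can be sketched as a small Python model. Everything here is a simplifying assumption: the `Cluster` class, the partition naming, and the promotion rule are hypothetical, and asynchronous replication lag is ignored.

```python
class Cluster:
    """Toy model of paired leaves: each leaf's partitions are replicated on its partner."""

    def __init__(self, pairs, partitions_per_leaf=2):
        self.partner = {}
        for a, b in pairs:
            self.partner[a] = b
            self.partner[b] = a
        self.online = set(self.partner)
        # master[partition] = leaf currently serving reads and writes for it
        self.master = {}
        for leaf in self.partner:
            for i in range(partitions_per_leaf):
                self.master["%s_p%d" % (leaf, i)] = leaf

    def fail(self, leaf):
        """Mark a leaf offline and promote its partner's replicas to master."""
        self.online.discard(leaf)
        partner = self.partner[leaf]
        if partner not in self.online:
            raise RuntimeError("both leaves of a pair are down")
        for part, owner in self.master.items():
            if owner == leaf:
                self.master[part] = partner


cluster = Cluster([("leaf1", "leaf2"), ("leaf3", "leaf4")])
cluster.fail("leaf1")
```

Each leaf stores its own partitions plus a copy of its partner's, which is where the 2X space cost comes from.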

14

Minimum System Requirements

▪ 8 GB RAM (32+ recommended)
▪ 4X 64-bit Intel CPU cores (8+ recommended)
▪ 2X RAM free disk space (for backups & logs)
▪ 10 GB swap space
▪ Hyperthreading OFF (physical CPU cores)
▪ Modern Linux (CentOS 6+, Ubuntu 12+)
▪ 1 Gbps switched network
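The numeric minimums above can be turned into a quick pre-install check. This Python sketch takes the host's figures as arguments rather than probing the machine; the `check_minimums` function and its messages are hypothetical, and the OS, CPU vendor, and network requirements are not covered.

```python
def check_minimums(ram_gb, physical_cores, free_disk_gb, swap_gb):
    """Return a list of unmet minimums from the slide (empty list means OK)."""
    problems = []
    if ram_gb < 8:
        problems.append("RAM below 8 GB")
    if physical_cores < 4:
        problems.append("fewer than 4 physical cores")
    if free_disk_gb < 2 * ram_gb:
        problems.append("free disk below 2X RAM")
    if swap_gb < 10:
        problems.append("swap below 10 GB")
    return problems


# Example host figures (made up for illustration)
ok_host = check_minimums(ram_gb=32, physical_cores=8, free_disk_gb=64, swap_gb=10)
small_host = check_minimums(ram_gb=32, physical_cores=8, free_disk_gb=40, swap_gb=0)
```

Note that the disk requirement scales with RAM, so a host with more memory needs proportionally more free disk.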

15

Licensing

▪ Community Edition
• Free Forever, Unlimited Scale
• Full SQL features

▪ Enterprise Edition
• Subscription basis, by RAM capacity
• No limit on disk storage
• Enterprise support
• Replication / High Availability

16

17

MemSQL “Cluster in a box”

▪ AWS m4.2xlarge: 8 cores, 32GB RAM

23

MemSQL Speed Test

24

MemSQL Web Console

Thank You
www.memsql.com