Acunu & OCaml: Experience Report, CUFP

Post on 21-Nov-2014

Tom Wilkie, Founder & VP Engineering

tom@acunu.com, @tom_wilkie

Acunu & OCaml: Experience Report

What do we do?

1990: Old hardware; BTree file systems; RAID; small databases; BTree indexes

What do we do?

2010: New hardware; BTree file systems; RAID; write-optimised indexes; distributed, shared-nothing databases

What do we do?

2011: New hardware; Castle; distributed, shared-nothing databases

What does this have to do with Functional Programming?

[Diagram: Acunu platform stack]

Big Data Applications; Cross-Cluster Management UI; Amazon S3 compatible ...

Open API

Acunu Storage Core

Management; Deployment; Monitoring

Languages labelled on the diagram: Java, Erlang, C, OCaml, Python, Bash, Perl

Management Stack

HTML5/JavaScript User Interface; Autogenerated OCaml CLI; External Monitoring Tools (Munin etc)

Routerd: enumeration, routing, clustering

Another Routerd on a different machine

Miscd, Alerts, DFSd: Version, Collection, Disk, NamedObjects, Base, Castle

Cassandrad: Keyspace, ColumnFamily

Clusterd: Cassandra, Host, Group, Service, Cassandra_Node

S3d: BigS3, S3_Node, Bucket

Filesystem

Statsd: Report, Stat, Source, Default_Report, Alert_Rule, Alert

Bridges to other systems

Clustering

Failure Detection

Monitoring

Alerting

Routing & Aggregation

Successes / Failures

Prototype “Filesystem”

• CoW BTrees

• Mod List BTrees

• LSM Trees

• Doubling Arrays

• Fractional Cascading

• Stratified DAs

• Multidimensional keys

• Z curve packing

Aim: Investigate algorithms for KV storage
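
As an illustration of the "Z curve packing" item in the list above, here is a minimal sketch of a Z-order (Morton) key for two-dimensional keys. It is not taken from the prototype: the function names `z_encode` and `z_decode`, the two-dimensional restriction, and the 16-bit default width are all assumptions made for the example. The idea is only that interleaving the bits of each dimension yields a single integer key, so that points close together in both dimensions tend to land close together in a one-dimensional KV store.

```python
def z_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return key


def z_decode(key, bits=16):
    """Invert z_encode: split the even and odd bit positions back into (x, y)."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i)) & 1) << i
        y |= ((key >> (2 * i + 1)) & 1) << i
    return x, y


# The four cells of a 2x2 grid are visited in the characteristic "Z" order.
cells = sorted([(0, 0), (1, 0), (0, 1), (1, 1)], key=lambda p: z_encode(*p))
print(cells)
```

A range query over a rectangle then becomes one or more range scans over the interleaved key, which is what makes the packing attractive for an ordered KV store.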

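Doubling Array

Inserts: 2, then 9; the two entries merge into the sorted run [2 9]

Doubling Array

Inserts: 11, then 8; they merge into [8 11], which then merges with [2 9] into [2 8 9 11]

etc...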

Similar to log-structured merge trees (LSM), cache-oblivious lookahead array (COLA), ...
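
The insert sequence on the Doubling Array slides can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions, not the OCaml prototype or Castle: the class name `DoublingArray`, its methods, and the use of Python lists for levels are all invented for the example, and it ignores duplicates, deletes and on-disk layout. Level i is either empty or a sorted run of 2^i keys; an insert carries upward like binary addition, merging runs as it goes.

```python
from bisect import bisect_left
from heapq import merge


class DoublingArray:
    """Toy doubling array: levels[i] is [] or a sorted list of 2**i keys."""

    def __init__(self):
        self.levels = []

    def insert(self, key):
        run = [key]
        i = 0
        # Carry the run upward, merging with each occupied level on the way.
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = run      # free slot: the run settles here
                return
            run = list(merge(self.levels[i], run))  # sequential merge of two sorted runs
            self.levels[i] = []
            i += 1

    def contains(self, key):
        # One binary search per non-empty level.
        for level in self.levels:
            j = bisect_left(level, key)
            if j < len(level) and level[j] == key:
                return True
        return False


da = DoublingArray()
for k in (2, 9, 11, 8):       # the same keys as on the slides
    da.insert(k)
print(da.levels)              # [[], [], [2, 8, 9, 11]]
```

Every merge reads and writes whole sorted runs, which is why the writes are sequential; the price is that a lookup may have to consult every level.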

B = “block size”, say 8KB at 100 bytes/entry ~= 100 entries

                         Update                         Range Query (Size Z)

Log Structured B-Tree    O(log_B N) random IOs          O(Z/B) random IOs

Doubling Array           O((log N)/B) sequential IOs    O(Z/B) sequential IOs

B-Tree: ~ log(2^30)/log 100 = 5 IOs/update; 8KB @ 100MB/s, w/ 8ms seek = 100 IOs/s; 100 / 5 = 20 updates/s

Doubling Array: ~ log(2^30)/100 = 0.2 IOs/update; 8KB @ 100MB/s = 13k IOs/s; 13k / 0.2 = 65k updates/s
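
The back-of-envelope figures above can be reproduced with a short calculation. This is a sketch under assumptions: N = 2^30 entries and B = 100 entries per block come from the slide, the natural logarithm is assumed for the doubling array term because it is the base that yields roughly 0.2, and the variable names are invented for the example. The slide rounds aggressively (13k and 100 IOs/s), so the unrounded results come out near, but not exactly at, 20 and 65k updates/s.

```python
import math

N = 2 ** 30             # number of entries, as on the slide
B = 100                 # entries per block (8KB at ~100 bytes/entry, rounded)
BLOCK_BYTES = 8 * 1024  # 8KB block
BANDWIDTH = 100e6       # 100MB/s sequential throughput
SEEK_S = 0.008          # 8ms seek per random IO

# IOs per update
btree_ios = math.log(N) / math.log(B)  # log_B N, random IOs
da_ios = math.log(N) / B               # (log N)/B, sequential IOs (natural log assumed)

# IOs per second the disk can deliver
transfer_s = BLOCK_BYTES / BANDWIDTH
sequential_iops = 1 / transfer_s             # no seek between blocks
random_iops = 1 / (SEEK_S + transfer_s)      # one seek per block

btree_updates = random_iops / btree_ios
da_updates = sequential_iops / da_ios

print(f"B-Tree:         {btree_ios:.1f} IOs/update, {random_iops:.0f} IOs/s, {btree_updates:.0f} updates/s")
print(f"Doubling Array: {da_ios:.2f} IOs/update, {sequential_iops:.0f} IOs/s, {da_updates:.0f} updates/s")
```

Whatever the rounding, the gap between the two structures is three orders of magnitude, and it comes almost entirely from replacing seeks with sequential transfers.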

BTree Disk Trace

[Plot: block index against time (s)]

Doubling Array Disk Trace

[Plot: block index against time (secs)]

OCaml Prototype Performance

[Plot: insertion rate (kvps/s) against # inserted kvps]

The Dark Side...

Java Prototype Performance

[Plot: insert rate (keys/s) against time (s)]

What about Castle?

Castle Performance

One more thing...

SNAPSHOTS*

* And clones!

I’ll explain how....

http://bit.ly/rduBia

“Castle: Re-inventing Storage For Big Data”

London, 27th September

Questions?

tom@acunu.com

@tom_wilkie

http://www.acunu.com

http://bitbucket.org/acunu

http://github.com/acunu

References

[LSM] Patrick O'Neil, Edward Cheng, Dieter Gawlick, Elizabeth O'Neil. The Log-Structured Merge-Tree (LSM-Tree).

http://staff.ustc.edu.cn/~jpq/paper/flash/1996-The%20Log-Structured%20Merge-Tree%20%28LSM-Tree%29.pdf

[COLA] Michael A. Bender et al. Cache-Oblivious Streaming B-trees.

http://www.cs.sunysb.edu/~bender/newpub/BenderFaFi07.pdf

[DSST] J. R. Driscoll, N. Sarnak, D. D. Sleator, R. E. Tarjan. Making Data Structures Persistent. Journal of Computer and System Sciences, Vol. 38, No. 1, 1989.

http://www.cs.cmu.edu/~sleator/papers/making-data-structures-persistent.pdf

Andy Twigg, Andrew Byde, Grzegorz Miłoś, Tim Moreton, John Wilkes, Tom Wilkie. Stratified B-trees and versioned dictionaries. HotStorage’11.

http://www.usenix.org/event/hotstorage11/tech/final_files/Twigg.pdf

[RDA] Joep Aerts, Jan Korst, Sebastian Egner. Random duplicate storage strategies for load balancing in multimedia servers. 2000.

http://www.win.tue.nl/~joep/IPL.ps

Apache, Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.
