membase meetup chicago - january 2011

Post on 17-Dec-2014

1.600 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

 

TRANSCRIPT

Chicago

What is Membase?

Membase is a distributed database

3

Membase Servers

In the data center

Web application server

Application user

On the administrator console

Web application serverWeb application server

Five minutes or less to a working cluster• Downloads for Linux and Windows• Start with a single node• One button press joins nodes to a clusterEasy to develop against• Just SET and GET – no schema required• Drop it in. 10,000+ existing applications

already “speak membase” (via memcached)• Practically every language and application

framework is supported, out of the boxEasy to manage• One-click failover and cluster rebalancing• Graphical and programmatic interfaces• Configurable alerting

Membase is Simple, Fast, Elastic

4

Membase is Simple, Fast, Elastic

5

Predictable• “Never keep an application waiting”• Quasi-deterministic latency and throughputLow latency• Built-in Memcached technology

High throughput• Multi-threaded• Low lock contention• Asynchronous wherever possible• Automatic write de-duplication

Membase is Simple, Fast, Elastic

6

Zero-downtime elasticity• Spread I/O and data across commodity

servers (or VMs) • Consistent performance with linear cost• Dynamic rebalancing of a live clusterAll nodes are created equal• No special case nodes• Any node can replace any other node, online• Clone to growExtensible• Filtered TAP interface provides hook points

for external systems (e.g. full-text search, backup, warehouse)

• Data bucket – engine API for specialized container types

Built-in Memcached Caching Layer

7

Memcached

Membase Database

Membase Cache

Membase Database

Memcached Mode Membase Mode

Fact: Membase development team has also contributed over half of the code to the Memcached project.

Use Cases

Ad targeting

9

eventsprofiles, campaigns

profiles, real time campaign statistics

40 milliseconds to come up with an answer.

2

3

1

Search and Gaming Portal

10

Database

Membase Architecture

Clustering

• Underlying cluster functionality based on erlang OTP

• Have a custom, vector clock based way of storing and propagating...– Cluster topology– vBucket mapping

• Collect statistics from many nodes of the cluster– Identify hot keys, resource

utilization

12

TAP

• A generic, scalable method of streaming mutations from a given server– As data operations arrive, they can be sent to arbitrary TAP

receivers

• Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data

• Three modes of operation

13

Data setDataMutations

Data setDataMutations

Data set

Membase data flow – under the hood

14

SET request arrives at KEY’s master server

Listener-Sender

Master server for KEY Replica Server 2 for KEYReplica Server 1 for KEY

3 3

1SET acknowledgement returned to application2

DiskDisk Disk

RAM

mem

base

sto

rage

eng

ine

DiskDisk Disk

4

A Membase Node

15

Membase Node

REST Interfacememcached operations(any client)

memcached smart client operations

16

ns_servermembase(memcached + membase engine)

moxi ns_server

vbucketmigrator

TAP

A Membase Node: Component View

ns_servermembase(memcached + membase engine)

moxi ns_server

vbucketmigratorTAP

memcached operationswith tap commands

memcached operations

Client

port 11211 memcached operations

moxi + Client

port 11210 memcached operations REST/comet

cluster topology and vbucket map

Clients, nodes and other nodes

17

Data buckets are secure membase “slices”

18

Membase data servers

In the data center

Web application server

Application user

On the administrator console

Bucket 1Bucket 2

Aggregate Cluster Memory and Disk Capacity

vBucket mapping

19

Disk > Memory

Buc

ket C

onfig

urat

ion

mem_high_wat

mem_low_wat

memory quota

20

Dataset may have many items infrequently accessed. However, memcached has different behavior (LRU) than wanted with membase.

Still, traditional (most) RDBMS implementations are not 100% correct for us either. The speed of a miss is very, very important.

(Intermission)

22

Thanks!

Membase Demo

Key-Value Patterns

Key-Value

25

Key-Value

25

Items have:KeyValueExpirationFlagsCAS (more on this later)

Operations include:Get/SetIncrement/DecrementAppend/Prepend

Key-Value

25

Key-Value

Image courtesy http://www.flickr.com/photos/brenda-starr/3509344100/sizes/m/in/photostream/

(with a replica )25

Membase Datatypes

26

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

“Any customer can have a car painted any colour that he wants so long as it is black.”

Membase Datatypes

• byte[]– Does your data have

1s and 0s?

26

“Any customer can have a car painted any colour that he wants so long as it is black.”

• Items do have flags– Many clients use flags

– Data type options• Google protobuf• Thrift• Avro

Transactions

• Lock == slow me down• CAS operations

– Optimistic locking• Very useful with complex

datatypes– Imagine two clients trying to

update a complex item• You’re likely using CAS

already... if you use a CPU

27

User 1

Fail!

User 2Success

Common Use: Sessions

• Web user sessions– Highly read, less writes in many case– Protocol advantage of memcached

• Options already for PHP, Ruby and Java

• Application state– Not necessarily “entity” style things– May be appropriate for a “cache” pool

28

Common Use (cache): Rate Limiting

• Want to provide API calls into the system– Twitter search– Google search services

• Use the atomic increment– Set an item with a unique ID– Upon API request,

increment and check• HTTP 420: go away and come

back later

29

Your Users

Your App

¡Ouch!

Looking Ahead

Beyond key-value • Indexing/Range Queries• Advanced Data Structures• Sub-object direct manipulation

Validation and In-flight transformation• Block mutations failing validation• Enrich or transform objects

Connectors (Integrate easily with other systems)• Solr• Hadoop

Frequent Requests: New Access Methods

31

Feedback: your turn!

• Querying and indexing?

• Use cases you can share?

32

NodeCode - What is it?

Method for extending & customizing Membase

Separate code modules

Defined interface to datapath and cluster manager

Notification on events• Synchronous• Asynchronous

33

Simple• Packaged modules for easy install and enable• Library of “off the shelf” modules• Module monitoring• Straight forward development and debuggingFast• Low latency/high-throughput• Per-bucket process isolation• Don’t break data manager performance/correctnessElastic• Automatically migrate and instantiate on rebalance• Provide support for migration of internal data• Leverage native Membase engine for internal data storage

NodeCode – Drivers

34

Block-level architecture

35

36

Q&AMatt Ingenthron

matt@membase.comTwitter: ingenthrIRC: #membase

top related