Dynamo: Amazon’s Highly Available Key-value Store

DeCandia et al.

Introduction

Briefly about Dynamo

Developed by Amazon

Used for primary key access to data

«Reliability at massive scale»

Tight control over tradeoffs

Data is stored using consistent hashing

Performed «without any downtime during the busy holiday shopping season»

Each service running Dynamo runs a separate instance

Background

Assumptions and Requirements

1. Data is uniquely identified by a primary key

2. Full ACID guarantees are not required: consistency (the C) is relaxed for the sake of availability, and isolation is not guaranteed

3. Dynamo must function on commodity hardware

4. Dynamo should only be used internally – non-hostile environment

Key principles for design

The call stack for a client request usually has more than one level

Designed as an eventually consistent system

Writes are never rejected

Incremental scalability

Symmetry

Decentralization

Heterogeneity

Related work

Dynamo compared to other systems

This section compares Dynamo to several other systems (in terms of system requirements)

Most important:

◦ Always writeable

◦ Key/value access

At the 99.9th percentile, read and write operations should complete within «a few hundred milliseconds»

System Architecture

The Dynamo interface (1)

Two operations

◦ get(key)

◦ put(key, context, object)

All keys are hashed to a 128-bit number; the range of hash values forms a «ring»

Dynamo nodes are spread out on this ring, and responsible for a part of the ring

The context contains metadata about the object
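The ring idea above can be sketched in a few lines. This is only an illustration, not Dynamo's actual code: MD5 is assumed as the 128-bit hash, and the names `ring_position`, `Ring` and `owner` are hypothetical. It also leaves out replication and any further refinements of node placement.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # an MD5 digest is 128 bits wide (assumed hash function)


def ring_position(name: str) -> int:
    """Hash a key or a node name to a position on the 128-bit ring."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical helper: maps each key to the node responsible for it."""

    def __init__(self, node_names):
        # Sorted (position, node) pairs; a node owns the arc that ends at its position.
        self._points = sorted((ring_position(n), n) for n in node_names)
        self._positions = [pos for pos, _ in self._points]

    def owner(self, key: str) -> str:
        """Return the first node met when walking clockwise from the key's position."""
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._points[idx % len(self._points)][1]  # wrap past the last node
```

For example, `Ring(["node-a", "node-b", "node-c"]).owner("cart:42")` returns one of the three node names, and removing a node only reassigns the keys that node owned.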

The Dynamo interface (2)

Some important notation

◦ 𝑁 is the number of replicas to store in the system

◦ 𝑆 is the total number of nodes in the system

A vector clock of length 𝑁 keeps track of versioning

Object 1 is the ancestor of object 2 iff the entire vector clock of 1 is less-than-or-equal to the clock of 2
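The ancestor rule can be written as an entry-wise comparison. A minimal sketch, assuming a clock is represented as a dictionary from node name to counter (missing entries count as 0); the function names and the node names `sx`, `sy`, `sz` are made up for illustration.

```python
def is_ancestor(clock_1: dict, clock_2: dict) -> bool:
    """True if every counter in clock_1 is <= the matching counter in clock_2."""
    return all(count <= clock_2.get(node, 0) for node, count in clock_1.items())


def concurrent(clock_1: dict, clock_2: dict) -> bool:
    """True if neither clock is an ancestor of the other (divergent versions)."""
    return not is_ancestor(clock_1, clock_2) and not is_ancestor(clock_2, clock_1)


# Example clocks (hypothetical values):
d1 = {"sx": 1}
d2 = {"sx": 2}            # d1 is an ancestor of d2
d3 = {"sx": 2, "sy": 1}   # written via node sy on top of d2
d4 = {"sx": 2, "sz": 1}   # written via node sz on top of d2; conflicts with d3
```

Here `d3` and `d4` both descend from `d2`, but neither is an ancestor of the other, so both versions would have to be kept and reconciled later.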

Calling get() and put() (1)

All nodes are able to accept get and put for all keys

Two strategies for selecting a node

◦ Generic load balancer – picks a node based on load; that node forwards the request to the right position in the ring

◦ Partition-aware client library – the client routes each request directly to the right node

A read/write is successful when a certain number of nodes have responded

◦ 𝑊 is the minimum number of nodes that must accept a write

◦ 𝑅 is the minimum number of nodes that must respond before a read is answered. If the responses contain several object versions, they are all returned to the caller
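The role of 𝑊 and 𝑅 can be illustrated with a small in-memory simulation. This is a simplified sketch, not Dynamo's implementation: the `Replica` class and the `put`/`get` helpers are hypothetical, each replica keeps a single value per key, and the `context` argument and vector clocks are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one storage node."""
    up: bool = True
    store: dict = field(default_factory=dict)


def put(replicas, key, value, w):
    """Write to every reachable replica; succeed if at least w of them accepted."""
    acks = 0
    for rep in replicas:
        if rep.up:
            rep.store[key] = value
            acks += 1
    return acks >= w


def get(replicas, key, r):
    """Read from every reachable replica; return all distinct versions seen."""
    responders = [rep for rep in replicas if rep.up]
    if len(responders) < r:
        raise RuntimeError("fewer than R replicas responded")
    return {rep.store[key] for rep in responders if key in rep.store}
```

If one replica is down during a write and a different one is down during the next read, `get` can return two versions of the same key. That is the case in which all versions are handed back to the caller.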

Implementation

Implemented in Java

A coordinator handles read and write requests

◦ Related to the (previously mentioned) preference list

“Typical” (N,R,W) values are (3,2,2)
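One way to see why (3,2,2) is a sensible setting: when 𝑅 + 𝑊 > 𝑁, every set of 𝑅 readers shares at least one node with every set of 𝑊 writers, so a read always reaches a replica that saw the latest successful write. The brute-force check below is only an illustration; `quorums_overlap` is a hypothetical helper name.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every group of r nodes intersects every group of w nodes (out of n)."""
    nodes = range(n)
    return all(
        set(readers) & set(writers)
        for readers in combinations(nodes, r)
        for writers in combinations(nodes, w)
    )


typical = quorums_overlap(3, 2, 2)   # R + W = 4 > 3
fast_but_weak = quorums_overlap(3, 1, 1)  # R + W = 2 <= 3
```

With (3,2,2) the check holds, while (3,1,1) trades that overlap for lower latency.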

Lessons Learned

Some key takeaways

Use an internal buffer

Spread the nodes evenly across the Dynamo ring -> removes the need for a load balancer / coordinator

Divergent versions are not a problem in practice

Give foreground read / write requests priority over background tasks

Each application can (and should!) fine-tune its (N,R,W) settings

Dynamo has been very successful