Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013


DESCRIPTION

Amazon DynamoDB is a fully managed, zero-admin, high-speed NoSQL database service built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods, and the service places no limit on storage. This session also covers Amazon DynamoDB's design principles and history.

TRANSCRIPT

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

DAT101 - Production NoSQL in an Hour: Introduction to Amazon DynamoDB

“Amazon DynamoDB is the result of everything we’ve learned from building large-scale, non-relational databases for Amazon.com and building highly scalable and reliable cloud computing services at AWS.”

What is Amazon DynamoDB?

Design Philosophy


Flexible Data Model

Access and Query Model

• Two primary key options
– Hash key: key lookups: “Give me the status for user ‘abc’”
– Composite key (hash with range): “Give me all the status updates for user ‘abc’ that occurred within the past 24 hours”
• Support for multiple data types – string, number, binary, or sets of strings, numbers, or binaries
• Supports both strong and eventual consistency
– Choose your consistency level when you make the API call
– Different parts of your app can make different choices
• Local secondary indexes
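The two access patterns above can be sketched with a minimal in-memory model. The table and helper names here are hypothetical, not the DynamoDB API; the point is only the difference between a single-item hash-key lookup and a range query under a composite key.

```python
from collections import defaultdict

# Hypothetical in-memory model of the two primary key options; the table
# and helper names are ours, not the DynamoDB API.
hash_table = {"user#abc": {"status": "online"}}   # hash key -> single item
composite_table = defaultdict(dict)               # hash key -> {range key: item}

def put_update(user, ts, text):
    composite_table[user][ts] = {"text": text}

def status_for(user):
    # Hash key only: "Give me the status for user abc"
    return hash_table[user]["status"]

def updates_since(user, cutoff_ts):
    # Hash with range: "all the status updates for 'abc' in the past 24 hours"
    return [item for ts, item in sorted(composite_table[user].items())
            if ts >= cutoff_ts]

now = 1_000_000
put_update("user#abc", now - 90_000, "yesterday")    # older than 24h (86,400 s)
put_update("user#abc", now - 3_600, "an hour ago")   # within 24h
print(status_for("user#abc"))                        # online
print(len(updates_since("user#abc", now - 86_400)))  # 1
```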

High Availability and Durability

I want to build a production-ready database…

This used to be the only way…

You choose:
• Memory
• CPU
• Hard drive specs
• Software
• …

…to get the database performance you want:
• Throughput rate
• Latency
• …


Provisioned Throughput Model

Tell us the performance you want

Let us handle the rest

Provisioned Throughput Model

Every DynamoDB table has:

• Provisioned write capacity

• Provisioned read capacity

• No limit on storage

Provisioned Throughput Model

Change your throughput capacity as needed

Pay for throughput capacity and storage used
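As a back-of-envelope sketch of how provisioned capacity is sized, using DynamoDB's published unit definitions (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, with eventually consistent reads costing half; one write capacity unit covers one write per second of an item up to 1 KB). The helper names are ours:

```python
import math

# Back-of-envelope sizing for provisioned throughput, based on DynamoDB's
# published capacity unit definitions (helper names are illustrative).
def rcus_needed(reads_per_sec, item_kb, strongly_consistent=True):
    per_read = math.ceil(item_kb / 4)          # 4 KB per read capacity unit
    units = reads_per_sec * per_read
    return units if strongly_consistent else math.ceil(units / 2)

def wcus_needed(writes_per_sec, item_kb):
    return writes_per_sec * math.ceil(item_kb)  # 1 KB per write capacity unit

# 500 eventually consistent reads/sec of 6 KB items, 100 writes/sec of 2 KB items
print(rcus_needed(500, 6, strongly_consistent=False))  # 500
print(wcus_needed(100, 2))                             # 200
```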

Seamless Scalability

Change scale with the click of a button

Capacity Forecasting is Hard

When you run your own database, you need to:
• Try to forecast the scale you need
• Invest time and money learning how to scale your database
• React quickly if you get it wrong

Timid Forecasting: plan for a lot more capacity than you probably need

Benefits:
• Safety – you know you’re ready

Risks:
• Buy too much capacity
• Lose development resources to scale testing/planning
• Do more work than necessary

Aggressive Forecasting: cut it close! Plan for less capacity and hope you don’t need more…

Benefits:
• Lower costs if all goes well

Risks:
• Last-minute scaling emergencies
• How does your database behave at an unexpected scale?

Typical Holiday Season Traffic at Amazon

[Chart: provisioned capacity vs. actual traffic over the year – actual traffic accounts for about 24% of provisioned capacity, leaving roughly 76% unused outside the peak]

Reduce Costs by Matching Capacity to Your Needs

[Chart: actual traffic vs. the capacity we can provision with DynamoDB (tracking the traffic curve) vs. the flat peak capacity we needed before DynamoDB]

Reduce Forecasting Risk by Using DynamoDB

What does DynamoDB handle for me?

Focus on building your app, not on running your database

Try it out!

aws.amazon.com/dynamodb


David R. Albrecht, Senior Engineer in Operations, Crittercism

November 13, 2013

Production NoSQL in an hour

Mobile application performance management

HTTP Requests

600 million devices

None of this adds differentiating business value.

#import "Crittercism.h"
[Crittercism enableWithAppID:@"<YOUR_CRITTERCISM_APP_ID>"];
[Crittercism setUsername:(NSString *)username];

Metadata: session id via usernames

we tried a lot of things

most of them failed

Our first attempt: sharded MongoDB on EC2

[Diagram: shards “orange”, “apple”, and “durian”, each replicated across AZ 1 and AZ 2]

Each shard:
• 2x m2.4xlarge, EBS optimized
• Gross: 2x 3200 GB
• Net: 1.6 TB, RAID 10

Cost:
• EBS standard: $704/mo
• EC2 compute: $2650/mo
• Price floor: $1.45/GB-mo

But storage capacity wasn’t the problem!

Second attempt: Redis ring

[Diagram: master/slave pairs arranged in a ring]

Each shard:
• 2x m2.4xlarge
• Gross: 2x 64 GB RAM
• Net: 64 GB RAM
• O(10k) IOPS performance

Cost:
• EC2 compute: $2650/mo
• Price floor: $41.45/GB-mo, but it’s an ops nightmare.

Consistent hashing: Karger et al.
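A minimal consistent-hash ring in the spirit of Karger et al. can be sketched as follows. The class and node names are illustrative, not Crittercism's implementation: each node is hashed to many points on a ring, a key is owned by the first node point clockwise from the key's hash, and adding or removing a node only remaps the keys adjacent to its points.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring (Karger-style) with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.ring = []  # sorted list of (point, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}:{i}"), node))
        self.ring.sort()
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _hash(s):
        # Any uniform hash works; md5 keeps the sketch deterministic.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # First ring point clockwise from the key's hash (wrapping around).
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["shard-a", "shard-b", "shard-c"])
print(ring.node_for("user#abc") in {"shard-a", "shard-b", "shard-c"})  # True
```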

Lesson: db scaling is 2D

[Chart: IOPS vs. storage capacity – RAM offers high IOPS with low capacity, HDD the reverse, SSD in between]

A horizontally scalable, tabular, indexed database with user-defined consistency semantics.

Benefit: Pay only for consumed capacity

Benefit: load spike insurance

Benefit: application-appropriate scaling – provision IOPS and storage capacity independently

Benefit: no operational burden

Lessons learned

• Database scaling is a 2D problem
• Don’t try to roll your own sharding scheme
• Dynamo works for us.

david@crittercism.com


100 Billion (with a B) Requests a Day with Amazon DynamoDB

Valentino Volonghi, AdRoll Chief Architect

November 13, 2013

Pixel “fires” → Serve ad? → Ad served

If you can’t reply in 100ms… it doesn’t matter anymore! But you really only get 40ms!

Latency budget (ms): network 40, buffer 20, processing 40

Big picture slide

Data must be available all over the world!

7/2011 - ~50GB/day

4/2013 - ~5TB/day

10/2013 - ~20TB/day

What were our requirements?

Key-Value Store Requirements

• <10ms random key lookups with 100-byte values
• 5–10B items stored
• Scale up and down without a performance hit
• ~100% uptime – this is money for us
• Consistent and sustained read/write throughput

Why DynamoDB instead of…

• HBase: hbck like rain, really hard to manage
• Cassandra: still immature when we needed it
• Redis: limited by available memory, no clustering
• Riak: great product, but not fast enough for us
• MongoDB: inconsistent write throughput

But the real reason…

They all require people to manage them!

And they all are hard to run in the cloud!

DynamoDB by Our Numbers

• 4 regions in use with live traffic replication

• 120B+ key fetches worldwide per day

• 1.5TB of data stored per region

• 30B+ items stored in each region

• <3ms uniform query latency, <10ms at the 99.95th percentile

What did we learn after all?

Batch operations as much as possible!

Hash-key-only table (key → value):
• Query with GetItem – update with UpdateItem
• Low write throughput
• Key splitting when exceeding max item size
• Write contention

Hash-and-range-key table (key → many values):
• Query with Query – update with BatchWriteItem
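The key-splitting idea, spreading a hot hash key across suffixed keys so writes land on different partitions and stop contending, can be sketched in a few lines. The in-memory `table` dict stands in for a DynamoDB table, and the names are hypothetical:

```python
import random

# Hypothetical sketch of key splitting to relieve write contention on a
# hot hash key: each write goes to one of N suffixed keys ("counter.0" ..
# "counter.N-1"); a read sums all the splits back together.
N_SPLITS = 8
table = {}  # stands in for a DynamoDB table

def add(key, amount):
    split = f"{key}.{random.randrange(N_SPLITS)}"
    table[split] = table.get(split, 0) + amount

def total(key):
    return sum(table.get(f"{key}.{i}", 0) for i in range(N_SPLITS))

for _ in range(1000):
    add("impressions", 1)
print(total("impressions"))  # 1000
```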

Properly balance your structures!

• Evenly distribute keys in the hash range
• All values should be about the same size
• Cache reads for a few seconds
• Buffer writes when necessary
• Use exponential back-off for retries

Tips for Optimum Performance
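The exponential back-off tip can be sketched as follows. This is a generic "full jitter" strategy, not code from the talk: the retry ceiling doubles each attempt, capped, and the actual wait is drawn at random below it so throttled clients don't retry in lockstep.

```python
import random

def backoff_delays(attempts, base=0.05, cap=2.0, seed=None):
    """Return the waits (seconds) before each retry: full-jitter back-off."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 0.05, 0.1, 0.2, ... up to cap
        delays.append(rng.uniform(0, ceiling))     # jitter: anywhere below ceiling
    return delays

delays = backoff_delays(6, seed=42)
print(all(0 <= d <= 2.0 for d in delays))  # True
```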

What do you mean you don’t care about the money? Why do we pay so much for snacks again?

[Chart: monthly spend on office snacks vs. DynamoDB]

We have this huge database, pretty much always available, and we barely know it’s there.

Please give us your feedback on this presentation. As a thank you, we will select prize winners daily for completed surveys!

DAT101
