Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc. DAT101 - Production NoSQL in an Hour: Introduction to Amazon DynamoDB

Upload: amazon-web-services

Post on 29-Nov-2014

Category:

Technology


DESCRIPTION

Amazon DynamoDB is a fully-managed, zero-admin, high-speed NoSQL database service. Amazon DynamoDB was built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods. The service has no limit on storage. You also learn about Amazon DynamoDB's design principles and history.

TRANSCRIPT

Page 1: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

DAT101 - Production NoSQL in an Hour: Introduction to Amazon DynamoDB

Page 2: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Introduction to Amazon DynamoDB

Page 3: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

“Amazon DynamoDB is the result of everything we’ve learned from building large-scale, non-relational databases for Amazon.com and building highly scalable and reliable cloud computing services at AWS.”

Page 4: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

What is Amazon DynamoDB?

Page 5: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Design Philosophy

Page 6: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Design Philosophy

Page 7: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Design Philosophy

Page 8: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Design Philosophy

Page 9: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Design Philosophy

Page 10: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Flexible Data Model

Page 11: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Access and Query Model

• Two primary key options

– Hash key: key lookups, e.g. “Give me the status for user ‘abc’”

– Composite key (hash with range): “Give me all the status updates for user ‘abc’ that occurred within the past 24 hours”

• Support for multiple data types

– String, number, binary… or sets of strings, numbers, or binaries

• Supports both strong and eventual consistency

– Choose your consistency level when you make the API call

– Different parts of your app can make different choices

• Local Secondary Indexes
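
A minimal sketch of the two key options and the per-call consistency choice, written as plain request dictionaries in the shape the current low-level DynamoDB API accepts for GetItem and Query. The table names (`UserStatus`, `StatusUpdates`) and attribute names (`user_id`, `posted_at`) are hypothetical, and the expression-style parameters postdate this 2013 talk, so treat this as an illustration rather than what the presenters showed.

```python
import time

# Hypothetical tables: one keyed by hash only, one by hash + range.
STATUS_TABLE = "UserStatus"        # hash key: user_id
UPDATES_TABLE = "StatusUpdates"    # hash key: user_id, range key: posted_at


def get_status_request(user_id, strongly_consistent=False):
    """Hash-key lookup: 'Give me the status for user abc'."""
    return {
        "TableName": STATUS_TABLE,
        "Key": {"user_id": {"S": user_id}},
        # Consistency is chosen per call, not per table.
        "ConsistentRead": strongly_consistent,
    }


def recent_updates_request(user_id, now=None, window_seconds=24 * 3600):
    """Composite-key query: all updates for a user in the past 24 hours."""
    now = int(time.time()) if now is None else now
    return {
        "TableName": UPDATES_TABLE,
        "KeyConditionExpression": "user_id = :u AND posted_at >= :since",
        "ExpressionAttributeValues": {
            ":u": {"S": user_id},
            # Numbers travel as strings in the low-level wire format.
            ":since": {"N": str(now - window_seconds)},
        },
        "ConsistentRead": False,  # eventual consistency is enough here
    }
```

With boto3, these dictionaries could be passed as keyword arguments, for example `client.get_item(**get_status_request("abc", True))` or `client.query(**recent_updates_request("abc"))`.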

Page 12: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

High Availability and Durability

Page 13: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

I want to build a production-ready database…

Page 14: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

This used to be the only way…

You Choose:

• Memory

• CPU

• Hard drive specs

• Software

• …

To get the database performance you want:

• Throughput rate

• Latency

• …

Page 16: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Provisioned Throughput Model

Tell us the performance you want

Let us handle the rest

Page 17: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Provisioned Throughput Model

Every DynamoDB table has:

• Provisioned write capacity

• Provisioned read capacity

• No limit on storage
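
The slides do not say how provisioned capacity is measured. Under the definitions in the current DynamoDB documentation (an assumption relative to this talk), one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, or two eventually consistent reads, and one write capacity unit covers one write per second of an item up to 1 KB. A rough sizing sketch under those assumptions:

```python
import math

# Unit sizes assumed from the current DynamoDB documentation,
# not stated on the slides.
READ_UNIT_BYTES = 4 * 1024
WRITE_UNIT_BYTES = 1024


def read_capacity_units(item_bytes, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units for a steady read rate."""
    per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    total = per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_bytes, writes_per_second):
    """Estimate write capacity units for a steady write rate."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_second
```

For example, 1,000 strongly consistent reads per second of 100-byte items would need about 1,000 read units, and half that if eventual consistency is acceptable.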

Page 18: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Provisioned Throughput Model

Page 19: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Provisioned Throughput Model

Change your throughput capacity as needed

Pay for throughput capacity and storage used
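
Changing throughput corresponds to the UpdateTable operation. The sketch below only builds the request body; the table name and capacity values are hypothetical.

```python
def update_throughput_request(table_name, read_units, write_units):
    """Build an UpdateTable request that changes provisioned capacity."""
    if read_units < 1 or write_units < 1:
        raise ValueError("capacity units must be at least 1")
    return {
        "TableName": table_name,
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical example: scale up ahead of an expected traffic peak.
peak_request = update_throughput_request("StatusUpdates", 5000, 1000)
```

With boto3 this could be sent as `client.update_table(**peak_request)`.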

Page 20: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Seamless Scalability

Change scale with the click of a button

Page 21: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Capacity Forecasting is Hard

When you run your own database, you need to:

• Try to forecast the scale you need

• Invest time and money learning how to scale your database

• React quickly if you get it wrong

Page 22: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Timid Forecasting:

Plan for a lot more capacity than you probably need

Benefits:

• Safety – you know you’re ready

Risks:

• Buy too much capacity

• Lose development resources to scale testing/planning

• Do more work than necessary

Page 23: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Aggressive Forecasting:

Cut it close! Plan for less capacity. Hope you don’t need more…

Benefits:

• Lower costs if all goes well

Risks:

• Last-minute scaling emergencies

• How does your database behave at an unexpected scale?

Page 24: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Typical Holiday Season Traffic at Amazon

[Chart: capacity vs. actual traffic]

Page 25: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

[Chart: 76% / 24% split, labeled “Unused Capacity”]

Page 26: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Reduce Costs by Matching Capacity to Your Needs

[Chart: actual traffic; capacity we can provision with DynamoDB; capacity we needed before DynamoDB]
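
To make the cost argument concrete, here is a small sketch comparing fixed peak provisioning with capacity that tracks traffic. The traffic numbers and the 20% headroom are invented for illustration and are not taken from the slides.

```python
def unused_fraction(traffic, capacity):
    """Share of provisioned capacity that was never used."""
    total_capacity = sum(capacity)
    return (total_capacity - sum(traffic)) / total_capacity


# Hypothetical traffic per period, with a holiday-style peak.
traffic = [20, 25, 30, 90, 100, 40]

# Before: provision for the peak all season long.
fixed = [max(traffic)] * len(traffic)

# After: adjust capacity each period, keeping 20% headroom (assumed).
HEADROOM = 1.2
matched = [t * HEADROOM for t in traffic]

fixed_waste = unused_fraction(traffic, fixed)
matched_waste = unused_fraction(traffic, matched)
```

With these made-up numbers, roughly half of the fixed capacity goes unused, while the matched plan wastes only the headroom.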

Page 27: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Reduce Forecasting Risk by Using DynamoDB

Page 29: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

What does DynamoDB handle for me?

Page 30: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Focus on building your app, not running your database

Page 31: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Try it out!

aws.amazon.com/dynamodb

Page 32: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

David R. Albrecht, Senior Engineer in Operations, Crittercism

November 13, 2013

Production NoSQL in an hour

Page 33: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Mobile application performance management

Page 34: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 35: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

HTTP Requests

Page 36: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

600 million devices

Page 37: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 38: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 39: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 40: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 41: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

None of this adds differentiating

business value.

Page 42: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

#import "Crittercism.h"

[Crittercism enableWithAppID: @"<YOUR_CRITTERCISM_APP_ID>"];

[Crittercism setUsername:(NSString *)username];

Metadata: session id via usernames

Page 43: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013
Page 44: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

we tried a lot of things

most of them failed

Page 45: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Our first attempt: sharded MongoDB on EC2

[Diagram: shards “orange”, “apple”, and “durian” across AZ 1 and AZ 2]

Each shard:

• 2x m2.4xlarge, EBS-optimized

• Gross: 2x 3200 GB

• Net: 1.6 TB, RAID 10

Cost:

• EBS standard: $704/mo

• EC2 compute: $2650/mo

• Price floor: $1.45/GB-mo

Page 46: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

But storage capacity wasn’t the problem!

Page 47: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Second attempt: Redis ring

Each shard:

• 2x m2.4xlarge (master / slave)

• Gross: 2x 64 GB RAM

• Net: 64 GB RAM

• O(10k) IOPS performance

Cost:

• EC2 compute: $2650/mo

• Price floor: $41.45/GB-mo, but it is an ops nightmare.

Consistent hashing: Karger et al.
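
For readers unfamiliar with the technique named on this slide, here is a minimal consistent-hashing ring. It is a generic sketch, not Crittercism's implementation; the virtual-node count, the MD5 hash, and the node names are assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        ring = []
        for node in nodes:
            # Each node owns many points so load spreads more evenly.
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._owners = [node for _, node in ring]

    @staticmethod
    def _hash(key):
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16)

    def node_for(self, key):
        """Return the node owning the first ring point after the key's hash."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._owners[idx % len(self._owners)]  # wrap around the ring
```

The useful property is that adding a node only moves the keys the new node takes over; every other key stays where it was. Rebalancing, replication, and failover are still left to the operator, which is where the operational pain described here comes from.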

Page 48: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Lesson: DB scaling is 2D

[Chart: IOPS vs. capacity, with RAM, SSD, and HDD plotted]

Page 49: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

A horizontally-scalable, tabular, indexed database with user-defined consistency semantics.

Page 50: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Benefit: Pay only for consumed capacity

Page 51: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Benefit: load spike insurance

Page 52: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Benefit: application-appropriate scaling

[Chart: IOPS vs. capacity]

Page 53: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Benefit: no operational burden

Page 54: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Lessons learned

• Database scaling is a 2D problem

• Don't try to roll your own sharding scheme

• DynamoDB works for us.

Page 56: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

100 Billion (with a B) Requests a Day with Amazon DynamoDB

Valentino Volonghi, AdRoll Chief Architect

November 13, 2013

Page 57: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Pixel “fires”

Page 58: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Pixel “fires”

Serve ad?

Page 59: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Pixel “fires”

Serve ad?

Ad served

Page 60: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

If you can’t reply in 100 ms… it doesn’t matter anymore!

But you really only get 40 ms!

• Network: 40 ms

• Buffer: 20 ms

• Processing: 40 ms

Page 61: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Big picture slide

Data must be available all over the world!

Page 62: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

7/2011 - ~50GB/day

4/2013 - ~5TB/day

10/2013 - ~20TB/day

Page 63: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

What were our requirements?

Page 64: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Key-Value Store Requirements

• <10 ms random key lookup with 100-byte values

• 5-10B items stored

• Scale up and down without performance hit

• ~100% uptime, this is money for us

• Consistent and sustained read/write throughput

Page 65: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Why DynamoDB instead of…

• HBase: hbck like rain, really hard to manage

• Cassandra: still immature when we needed it

• Redis: limited by available memory, no clustering

• Riak: great product, not fast enough for us

• MongoDB: inconsistent write throughput

Page 66: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

But the real reason…

They all require people to manage them!

And they all are hard to run in the cloud!

Page 67: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

DynamoDB by Our Numbers

• 4 regions in use with live traffic replication

• 120B+ key fetches worldwide per day

• 1.5TB of data stored per region

• 30B+ items stored in each region

• <3 ms uniform query latency, <10 ms for 99.95% of queries
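
A quick back-of-the-envelope check of what these figures imply. It uses only the numbers on this slide, treats the "+" figures as lower bounds, and assumes decimal terabytes.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the slide (lower bounds where it says "+").
fetches_per_day = 120_000_000_000        # worldwide, all regions
bytes_per_region = 1_500_000_000_000     # 1.5 TB, decimal units assumed
items_per_region = 30_000_000_000

# Average worldwide read rate, ignoring daily peaks.
avg_fetches_per_second = fetches_per_day / SECONDS_PER_DAY

# Average stored size per item within one region.
avg_item_bytes = bytes_per_region / items_per_region
```

That works out to roughly 1.4 million key fetches per second on average worldwide, and about 50 bytes per stored item.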

Page 68: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

What did we learn after all?

Page 69: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Batch operations as much as possible!

Page 70: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

HashKey + KeyValue

• Query with GetItem – Update with UpdateItem

• Low write throughput – Key splitting when exceeding max size – Write contention

HashAndRangeKey + KeyValue

• Query with Query – Update with BatchWriteItem (batched puts)
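
The batched puts described on this slide map onto DynamoDB's BatchWriteItem operation, which accepts at most 25 put or delete requests per call. A sketch of the chunking step follows; the table and attribute names are hypothetical and no network calls are made.

```python
MAX_BATCH_SIZE = 25  # BatchWriteItem limit on requests per call


def batch_write_requests(table_name, items, batch_size=MAX_BATCH_SIZE):
    """Yield BatchWriteItem request bodies, at most batch_size puts each."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 25")
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        yield {
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        }


# Hypothetical items using a hash key (user_id) and a range key (segment).
items = [
    {"user_id": {"S": f"u{i}"}, "segment": {"S": "s1"}, "value": {"S": "x"}}
    for i in range(60)
]
requests = list(batch_write_requests("Profiles", items))
```

A real caller would also need to resend anything the service returns under `UnprocessedItems`, since a batch can partially succeed.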

Page 71: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Properly balance your structures!

Page 72: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Tips for Optimum Performance

• Evenly distribute keys in hash range

• All values should be about the same size

• Cache reads for a few seconds

• Buffer writes, when necessary

• Exponential back-off retries
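
Two of these tips, exponential back-off retries and caching reads for a few seconds, can be sketched in a few lines. This is an illustration under stated assumptions, not AdRoll's code: the `Throttled` exception, the delay values, and the 5-second TTL are all hypothetical.

```python
import random
import time


class Throttled(Exception):
    """Hypothetical stand-in for a throughput-exceeded error."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep):
    """Retry operation on Throttled, doubling the delay cap each attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Throttled:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))  # jitter avoids synchronized retries


class TTLCache:
    """Cache read results for a few seconds to absorb repeated lookups."""

    def __init__(self, ttl_seconds=5.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (stored_at, value)

    def get(self, key, loader):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = loader(key)  # cache miss or expired entry: reload
        self._entries[key] = (now, value)
        return value
```

The `sleep` and `clock` parameters exist so the behavior can be exercised without real waiting.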

Page 73: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

What do you mean you don’t care about the money?

Page 74: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Why do we pay so much for snacks again?

Snacks DynamoDB

Page 75: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

We have this huge database

Pretty much always available

And we barely know it’s there

Page 76: Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013

Please give us your feedback on this presentation

As a thank you, we will select prize winners daily for completed surveys!

DAT101