Production NoSQL in an Hour: Introduction to Amazon DynamoDB (DAT101) | AWS re:Invent 2013


DESCRIPTION

Amazon DynamoDB is a fully managed, zero-admin, high-speed NoSQL database service built to support applications at any scale. With the click of a button, you can scale your database capacity from a few hundred I/Os per second to hundreds of thousands of I/Os per second or more. You can dynamically scale your database to keep up with your application's requirements while minimizing costs during low-traffic periods, and the service places no limit on storage. This session also covers Amazon DynamoDB's design principles and history.

TRANSCRIPT

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

DAT101 - Production NoSQL in an Hour: Introduction to Amazon DynamoDB

“Amazon DynamoDB is the result of everything we’ve learned from building large-scale, non-relational databases for Amazon.com and building highly scalable and reliable cloud computing services at AWS.”

What is Amazon DynamoDB?

Design Philosophy


Flexible Data Model

Access and Query Model

• Two primary key options
– Hash key: key lookups: “Give me the status for user ‘abc’”
– Composite key (hash with range): “Give me all the status updates for user ‘abc’ that occurred within the past 24 hours”
• Support for multiple data types – string, number, binary, or sets of strings, numbers, or binaries
• Supports both strong and eventual consistency
– Choose your consistency level when you make the API call
– Different parts of your app can make different choices
• Local secondary indexes
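The two access patterns above can be sketched with a minimal in-memory model. The table and helper names here are hypothetical, not the DynamoDB API; the point is only the difference between a single-item hash-key lookup and a range query under a composite key.

```python
from collections import defaultdict

# Hypothetical in-memory model of the two primary key options; the table
# and helper names are ours, not the DynamoDB API.
hash_table = {"user#abc": {"status": "online"}}   # hash key -> single item
composite_table = defaultdict(dict)               # hash key -> {range key: item}

def put_update(user, ts, text):
    composite_table[user][ts] = {"text": text}

def status_for(user):
    # Hash key only: "Give me the status for user abc"
    return hash_table[user]["status"]

def updates_since(user, cutoff_ts):
    # Hash with range: "all the status updates for 'abc' in the past 24 hours"
    return [item for ts, item in sorted(composite_table[user].items())
            if ts >= cutoff_ts]

now = 1_000_000
put_update("user#abc", now - 90_000, "yesterday")    # older than 24h (86,400 s)
put_update("user#abc", now - 3_600, "an hour ago")   # within 24h
print(status_for("user#abc"))                        # online
print(len(updates_since("user#abc", now - 86_400)))  # 1
```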

High Availability and Durability

I want to build a production-ready database…

This used to be the only way…

You choose:
• Memory
• CPU
• Hard drive specs
• Software
• …

…to get the database performance you want:
• Throughput rate
• Latency
• …


Provisioned Throughput Model

Tell us the performance you want

Let us handle the rest

Provisioned Throughput Model

Every DynamoDB table has:

• Provisioned write capacity

• Provisioned read capacity

• No limit on storage

Provisioned Throughput Model

Change your throughput capacity as needed

Pay for throughput capacity and storage used
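As a back-of-envelope sketch of how provisioned capacity is sized, using DynamoDB's published unit definitions (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, with eventually consistent reads costing half; one write capacity unit covers one write per second of an item up to 1 KB). The helper names are ours:

```python
import math

# Back-of-envelope sizing for provisioned throughput, based on DynamoDB's
# published capacity unit definitions (helper names are illustrative).
def rcus_needed(reads_per_sec, item_kb, strongly_consistent=True):
    per_read = math.ceil(item_kb / 4)          # 4 KB per read capacity unit
    units = reads_per_sec * per_read
    return units if strongly_consistent else math.ceil(units / 2)

def wcus_needed(writes_per_sec, item_kb):
    return writes_per_sec * math.ceil(item_kb)  # 1 KB per write capacity unit

# 500 eventually consistent reads/sec of 6 KB items, 100 writes/sec of 2 KB items
print(rcus_needed(500, 6, strongly_consistent=False))  # 500
print(wcus_needed(100, 2))                             # 200
```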

Seamless Scalability

Change scale with the click of a button

Capacity Forecasting is Hard

When you run your own database, you need to:
• Try to forecast the scale you need
• Invest time and money learning how to scale your database
• React quickly if you get it wrong

Timid Forecasting: plan for a lot more capacity than you probably need

Benefits:
• Safety – you know you’re ready

Risks:
• Buy too much capacity
• Lose development resources to scale testing/planning
• Do more work than necessary

Aggressive Forecasting: cut it close! Plan for less capacity and hope you don’t need more…

Benefits:
• Lower costs if all goes well

Risks:
• Last-minute scaling emergencies
• How does your database behave at an unexpected scale?

Typical Holiday Season Traffic at Amazon

[Chart: provisioned capacity vs. actual traffic over the year – actual traffic accounts for about 24% of provisioned capacity, leaving roughly 76% unused outside the peak]

Reduce Costs by Matching Capacity to Your Needs

[Chart: actual traffic vs. the capacity we can provision with DynamoDB (tracking the traffic curve) vs. the flat peak capacity we needed before DynamoDB]

Reduce Forecasting Risk by Using DynamoDB

What does DynamoDB handle for me?

Focus on building your app, not on running your database

Try it out!

aws.amazon.com/dynamodb


David R. Albrecht, Senior Engineer in Operations, Crittercism

November 13, 2013

Production NoSQL in an hour

Mobile application performance management

HTTP Requests

600 million devices

None of this adds differentiating business value.

#import "Crittercism.h"
[Crittercism enableWithAppID:@"<YOUR_CRITTERCISM_APP_ID>"];
[Crittercism setUsername:(NSString *)username];

Metadata: session id via usernames

we tried a lot of things

most of them failed

Our first attempt: sharded MongoDB on EC2

[Diagram: shards “orange”, “apple”, and “durian”, each replicated across AZ 1 and AZ 2]

Each shard:
• 2x m2.4xlarge, EBS optimized
• Gross: 2x 3200 GB
• Net: 1.6 TB, RAID 10

Cost:
• EBS standard: $704/mo
• EC2 compute: $2650/mo
• Price floor: $1.45/GB-mo

But storage capacity wasn’t the problem!

Second attempt: Redis ring

[Diagram: master/slave pairs arranged in a ring]

Each shard:
• 2x m2.4xlarge
• Gross: 2x 64 GB RAM
• Net: 64 GB RAM
• O(10k) IOPS performance

Cost:
• EC2 compute: $2650/mo
• Price floor: $41.45/GB-mo, but it’s an ops nightmare.

Consistent hashing: Karger et al.
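A minimal consistent-hash ring in the spirit of Karger et al. can be sketched as follows. The class and node names are illustrative, not Crittercism's implementation: each node is hashed to many points on a ring, a key is owned by the first node point clockwise from the key's hash, and adding or removing a node only remaps the keys adjacent to its points.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring (Karger-style) with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self.ring = []  # sorted list of (point, node)
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}:{i}"), node))
        self.ring.sort()
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _hash(s):
        # Any uniform hash works; md5 keeps the sketch deterministic.
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # First ring point clockwise from the key's hash (wrapping around).
        i = bisect.bisect(self.points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["shard-a", "shard-b", "shard-c"])
print(ring.node_for("user#abc") in {"shard-a", "shard-b", "shard-c"})  # True
```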

Lesson: db scaling is 2D

[Chart: IOPS vs. storage capacity – RAM offers high IOPS with low capacity, HDD the reverse, SSD in between]

A horizontally scalable, tabular, indexed database with user-defined consistency semantics.

Benefit: Pay only for consumed capacity

Benefit: load spike insurance

Benefit: application-appropriate scaling – provision IOPS and storage capacity independently

Benefit: no operational burden

Lessons learned

• Database scaling is a 2D problem
• Don’t try to roll your own sharding scheme
• Dynamo works for us.

david@crittercism.com


100 Billion (with a B) Requests a Day with Amazon DynamoDB

Valentino Volonghi, AdRoll Chief Architect

November 13, 2013

Pixel “fires” → Serve ad? → Ad served

If you can’t reply in 100ms… it doesn’t matter anymore! But you really only get 40ms!

Latency budget (ms): network 40, buffer 20, processing 40

Big picture slide

Data must be available all over the world!

7/2011 - ~50GB/day

4/2013 - ~5TB/day

10/2013 - ~20TB/day

What were our requirements?

Key-Value Store Requirements

• <10ms random key lookups with 100-byte values
• 5–10B items stored
• Scale up and down without a performance hit
• ~100% uptime – this is money for us
• Consistent and sustained read/write throughput

Why DynamoDB instead of…

• HBase: hbck like rain, really hard to manage
• Cassandra: still immature when we needed it
• Redis: limited by available memory, no clustering
• Riak: great product, but not fast enough for us
• MongoDB: inconsistent write throughput

But the real reason…

They all require people to manage them!

And they all are hard to run in the cloud!

DynamoDB by Our Numbers

• 4 regions in use with live traffic replication

• 120B+ key fetches worldwide per day

• 1.5TB of data stored per region

• 30B+ items stored in each region

• <3ms uniform query latency, <10ms at the 99.95th percentile

What did we learn after all?

Batch operations as much as possible!

Hash-key-only table (key → value):
• Query with GetItem – update with UpdateItem
• Low write throughput
• Key splitting when exceeding max item size
• Write contention

Hash-and-range-key table (key → many values):
• Query with Query – update with BatchWriteItem
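The key-splitting idea, spreading a hot hash key across suffixed keys so writes land on different partitions and stop contending, can be sketched in a few lines. The in-memory `table` dict stands in for a DynamoDB table, and the names are hypothetical:

```python
import random

# Hypothetical sketch of key splitting to relieve write contention on a
# hot hash key: each write goes to one of N suffixed keys ("counter.0" ..
# "counter.N-1"); a read sums all the splits back together.
N_SPLITS = 8
table = {}  # stands in for a DynamoDB table

def add(key, amount):
    split = f"{key}.{random.randrange(N_SPLITS)}"
    table[split] = table.get(split, 0) + amount

def total(key):
    return sum(table.get(f"{key}.{i}", 0) for i in range(N_SPLITS))

for _ in range(1000):
    add("impressions", 1)
print(total("impressions"))  # 1000
```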

Properly balance your structures!

• Evenly distribute keys in the hash range
• All values should be about the same size
• Cache reads for a few seconds
• Buffer writes when necessary
• Use exponential back-off for retries

Tips for Optimum Performance
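The exponential back-off tip can be sketched as follows. This is a generic "full jitter" strategy, not code from the talk: the retry ceiling doubles each attempt, capped, and the actual wait is drawn at random below it so throttled clients don't retry in lockstep.

```python
import random

def backoff_delays(attempts, base=0.05, cap=2.0, seed=None):
    """Return the waits (seconds) before each retry: full-jitter back-off."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))  # 0.05, 0.1, 0.2, ... up to cap
        delays.append(rng.uniform(0, ceiling))     # jitter: anywhere below ceiling
    return delays

delays = backoff_delays(6, seed=42)
print(all(0 <= d <= 2.0 for d in delays))  # True
```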

What do you mean you don’t care about the money? Why do we pay so much for snacks again?

[Chart: monthly spend on office snacks vs. DynamoDB]

We have this huge database, pretty much always available, and we barely know it’s there.

Please give us your feedback on this presentation. As a thank you, we will select prize winners daily for completed surveys!

DAT101
