2011 july-gtug-high-replication-datastore

Ikai Lan plus.ikailan.com NYC GTUG July 27, 2011 High Replication Datastore Wednesday, July 27, 2011

Upload: ikailan

Post on 05-Dec-2014


DESCRIPTION

Slides from my N

TRANSCRIPT

Page 1: 2011 july-gtug-high-replication-datastore

Ikai Lan plus.ikailan.com

NYC GTUG July 27, 2011

High Replication Datastore

Wednesday, July 27, 2011

Page 2: 2011 july-gtug-high-replication-datastore

About the speaker

• Ikai Lan

• Developer Relations at Google based out of San Francisco, CA

• Twitter: @ikai

• Google+: plus.ikailan.com


Page 3: 2011 july-gtug-high-replication-datastore

Agenda

• What is App Engine?

• What is High Replication datastore?

• Underneath the hood


Page 4: 2011 july-gtug-high-replication-datastore

What is App Engine?


Page 5: 2011 july-gtug-high-replication-datastore

Infrastructure

Platform

Software

Source: Gartner AADI Summit Dec 2009


Page 9: 2011 july-gtug-high-replication-datastore

SDK & “The Cloud”

Hardware

Networking

Operating system

Application runtime

Java, Python, Go

Static file serving


Page 10: 2011 july-gtug-high-replication-datastore

Scales dynamically

App Server


Page 11: 2011 july-gtug-high-replication-datastore

Scales dynamically

App Server

App Server

App Server


Page 12: 2011 july-gtug-high-replication-datastore

Customer: WebFilings

Disruptive multi-tenant App Engine application adopted by Fortune 500 companies.


Page 13: 2011 july-gtug-high-replication-datastore

Customer: The Royal Wedding

Peaked at 32,000 requests per second with no disruption!


Page 14: 2011 july-gtug-high-replication-datastore

>200K Apps

>100K Developers

>1.5B daily pageviews


Page 15: 2011 july-gtug-high-replication-datastore

App Engine Datastore

Schemaless, non-relational datastore built on top of Google’s Bigtable technology

Enables rapid development and scalability


Page 16: 2011 july-gtug-high-replication-datastore

High Replication

• strongly consistent

• multi datacenter

• High reliability

• consistent performance

• no data loss


Page 17: 2011 july-gtug-high-replication-datastore

How do I use HR?

• Create a new application! Just remember the rules

• Fetch by key and ancestor queries exhibit strongly consistent behavior

• Queries without an ancestor exhibit eventually consistent behavior


Page 18: 2011 july-gtug-high-replication-datastore

Strong vs. Eventual

• Strong consistency means immediately after the datastore tells us the data has been committed, a subsequent read will return the data written

• Eventual consistency means that some time after the datastore tells us the data has been committed, a read will return the written data; an immediate read may or may not return it
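The difference between the two definitions can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyReplicatedStore`, `strongRead`, `eventualRead`, and `replicate` are hypothetical names, not part of the App Engine SDK, and the explicit `replicate()` call merely stands in for the "some time" that passes in the real datastore.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model for illustration; not the App Engine API.
class ToyReplicatedStore {
    // State that reflects every acknowledged write.
    private final Map<String, Integer> committed = new HashMap<>();
    // A view that only catches up when replicate() runs.
    private final Map<String, Integer> laggingView = new HashMap<>();

    // The write is acknowledged as soon as this method returns.
    void put(String key, int value) {
        committed.put(key, value);
    }

    // Strong read: always sees the acknowledged write.
    Integer strongRead(String key) {
        return committed.get(key);
    }

    // Eventual read: may miss recent writes until the view catches up.
    Integer eventualRead(String key) {
        return laggingView.get(key);
    }

    // Stands in for the passage of "some time" after a commit (assumption).
    void replicate() {
        laggingView.putAll(committed);
    }
}
```

In this model, a `strongRead` issued right after `put` sees the new value, while an `eventualRead` of the same key returns `null` until `replicate()` has run.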


Page 19: 2011 july-gtug-high-replication-datastore

This is strongly consistent

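import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

Entity item = new Entity("Item");
item.setProperty("data", 123);

Key key = datastore.put(item);

// This exhibits strong consistency.
// It should return the item we just saved.
// Note: get() throws the checked EntityNotFoundException if the key is missing.
Entity result = datastore.get(key);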


Page 20: 2011 july-gtug-high-replication-datastore

This is strongly consistent

// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);


Page 21: 2011 july-gtug-high-replication-datastore

This is eventually consistent

Entity item = new Entity("Item");
item.setProperty("data", 123);
datastore.put(item);

// Not an ancestor query
Query eventuallyConsistentQuery = new Query("Item");
eventuallyConsistentQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits eventual consistency.
// It will likely return an empty list.
List<Entity> results = datastore.prepare(eventuallyConsistentQuery).asList(opts);


Page 22: 2011 july-gtug-high-replication-datastore

Why?

• Reads are transactional

• On a read, we try to determine if we have the latest version of some data

• If not, we catch up the data on the node to the latest version


Page 23: 2011 july-gtug-high-replication-datastore

To understand this ...

• We need some understanding of Paxos ...

• ... which necessitates some understanding of transactions

• ... which necessitates some understanding of entity groups


Page 24: 2011 july-gtug-high-replication-datastore

Entity Groups

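[Diagram: entity hierarchies built from Blog, Entry, User, and Comment entities (one Blog with an Entry; one User with a Blog, two Entry entities, and three Comment entities). The topmost entity of a hierarchy is labeled "Entity group root".]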
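One way to picture an entity group: every entity's key records its parent, and the entity at the top of that chain is the group root. The sketch below models only that idea. `ToyKey`, `root`, and `sameEntityGroup` are hypothetical names rather than SDK classes, and the User > Blog > Entry > Comment nesting used in the usage note is an assumption about how the slide's entities relate.

```java
// Hypothetical stand-in for a datastore key; not the SDK's Key class.
class ToyKey {
    final String kind;
    final String name;
    final ToyKey parent; // null for an entity group root

    ToyKey(String kind, String name, ToyKey parent) {
        this.kind = kind;
        this.name = name;
        this.parent = parent;
    }

    // Walk up the ancestor path until we reach the entity with no parent.
    ToyKey root() {
        ToyKey current = this;
        while (current.parent != null) {
            current = current.parent;
        }
        return current;
    }

    // Two entities share an entity group when they share the same root.
    boolean sameEntityGroup(ToyKey other) {
        return root() == other.root();
    }
}
```

For example, with `user = new ToyKey("User", "ikai", null)` and a `Comment` key created under an `Entry` under a `Blog` under that user, `comment.root()` is `user`, so the comment and the entry are in the same entity group, while a key under a different `User` is not.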


Page 25: 2011 july-gtug-high-replication-datastore

Entity groups

// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);


Page 27: 2011 july-gtug-high-replication-datastore

Optimistic locking

Datastore

• Client A reads data. Its current version is 11

• Client B reads data. Its current version is 11

• Each client modifies the data and increments the version to 12

• Client B tries to save data. Success!

• Client A tries to save data. Datastore version is higher than or equal to my version - FAIL
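The version check in this scenario can be sketched in a few lines of plain Java. This is a minimal illustration, not the datastore's implementation: `VersionedRecord` and its methods are hypothetical names, and a single in-process object stands in for the datastore.

```java
// Hypothetical versioned record; illustrates the optimistic check only.
class VersionedRecord {
    private String data;
    private int version;

    VersionedRecord(String data, int version) {
        this.data = data;
        this.version = version;
    }

    synchronized int readVersion() {
        return version;
    }

    synchronized String readData() {
        return data;
    }

    // The caller passes the version it incremented locally.
    // If the stored version is already higher than or equal to it,
    // someone else saved first and this save is rejected.
    synchronized boolean save(int newVersion, String newData) {
        if (version >= newVersion) {
            return false;
        }
        data = newData;
        version = newVersion;
        return true;
    }
}
```

If clients A and B both read version 11 and each call `save(12, ...)`, whichever call arrives first succeeds and the second returns `false`. The losing client would typically re-read and retry.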


Page 28: 2011 july-gtug-high-replication-datastore

Transactional reads

// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);


Page 29: 2011 july-gtug-high-replication-datastore

Transactional reads

Datastore

• Blog Entry: Version 11

• Comment (Parent: Entry): Version 11

• Comment (Parent: Entry): Version 12, still being committed

Client B: transactionally writing data

Client A reads data. Version 12 has not finished committing - Datastore returns version 11
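The behavior on this slide, where a reader sees the last fully committed version while a write is still in flight, can be modeled as follows. This is a sketch under assumptions: `ToyEntityGroup`, `beginWrite`, and `finishCommit` are hypothetical names, and a single pending write stands in for the real commit process.

```java
// Hypothetical model of one entity group; not the datastore's implementation.
class ToyEntityGroup {
    private int committedVersion;
    private String committedData;
    private Integer pendingVersion; // null when no write is in flight
    private String pendingData;

    ToyEntityGroup(int version, String data) {
        this.committedVersion = version;
        this.committedData = data;
    }

    // A writer starts a commit; readers do not see it yet.
    void beginWrite(String newData) {
        pendingVersion = committedVersion + 1;
        pendingData = newData;
    }

    // The commit completes and becomes visible to readers.
    void finishCommit() {
        if (pendingVersion == null) {
            return; // nothing in flight
        }
        committedVersion = pendingVersion;
        committedData = pendingData;
        pendingVersion = null;
        pendingData = null;
    }

    // Reads only ever return fully committed state.
    int readVersion() {
        return committedVersion;
    }

    String readData() {
        return committedData;
    }
}
```

Starting at version 11, a `beginWrite` by client B leaves `readVersion()` at 11 for client A. Only after `finishCommit()` does a read return version 12.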


Page 30: 2011 july-gtug-high-replication-datastore

Paxos simplified

[Diagram: a Datastore Client and four nodes: Node A, Node B, Node C, Node D]

Datastore Client: Give me the newest data

Node: Is my data up to date?

1. If the data is up to date, return it

2. If the data is NOT up to date, "catch up" the data by applying the jobs in the journal and return the latest data
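The two numbered steps can be sketched for a single node. This covers only the read-time catch-up described on the slide and does not implement Paxos consensus between nodes. `ToyReplica`, `appendToJournal`, and `isUpToDate` are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical replica that applies journal entries lazily at read time.
class ToyReplica {
    // Each journal entry is a {key, value} pair waiting to be applied.
    private final List<String[]> journal = new ArrayList<>();
    private final Map<String, String> data = new HashMap<>();
    private int applied = 0; // number of journal entries already applied

    // A write is recorded in the journal but not yet applied to the data.
    void appendToJournal(String key, String value) {
        journal.add(new String[] {key, value});
    }

    boolean isUpToDate() {
        return applied == journal.size();
    }

    // Step 1: if up to date, return the data.
    // Step 2: otherwise apply the outstanding journal entries first.
    String read(String key) {
        while (applied < journal.size()) {
            String[] entry = journal.get(applied);
            data.put(entry[0], entry[1]);
            applied++;
        }
        return data.get(key);
    }
}
```

After two journal entries for the same key, the first `read` applies both in order and returns the later value; subsequent reads find the replica already up to date.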


Page 31: 2011 july-gtug-high-replication-datastore

More reading

• My example was grossly oversimplified

• More details can be found here:

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf


Page 32: 2011 july-gtug-high-replication-datastore

Contradictory advice

• Entity groups must be as big as possible to cover as much related data as you can

• Entity groups must be small enough such that your write rate per entity group never goes above one write/second
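The second rule lends itself to a quick back-of-the-envelope check. The helper below is a rough sketch: `EntityGroupSizing` and `minEntityGroups` are hypothetical names, and the limit of one write per second per entity group is taken directly from the slide as a rule of thumb, not as a measured figure.

```java
// Hypothetical helper for a rough sizing estimate.
class EntityGroupSizing {
    // The slide's guideline: about one write per second per entity group.
    static final double WRITES_PER_SECOND_PER_GROUP = 1.0;

    // Smallest number of entity groups that keeps each group at or
    // below the guideline, assuming writes are spread evenly.
    static int minEntityGroups(double expectedWritesPerSecond) {
        if (expectedWritesPerSecond <= 0) {
            return 1;
        }
        return (int) Math.ceil(expectedWritesPerSecond / WRITES_PER_SECOND_PER_GROUP);
    }
}
```

For example, data expected to receive 25 writes per second would need to be spread across at least 25 entity groups under this rule, which is the tension with the first bullet: larger groups give more strongly consistent reads but a lower write ceiling.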


Page 33: 2011 july-gtug-high-replication-datastore

Summary

• Remember the rules of strong consistency and eventual consistency

• Group your data into entity groups when possible and use ancestor queries


Page 34: 2011 july-gtug-high-replication-datastore

Questions?

• Twitter: @ikai

• Google+: plus.ikailan.com
