2011 july-gtug-high-replication-datastore
DESCRIPTION
Slides from my N
TRANSCRIPT
Ikai Lan plus.ikailan.com
NYC GTUG
July 27, 2011
High Replication Datastore
Wednesday, July 27, 2011
About the speaker
• Ikai Lan
• Developer Relations at Google based out of San Francisco, CA
• Twitter: @ikai
• Google+: plus.ikailan.com
Agenda
• What is App Engine?
• What is High Replication datastore?
• Underneath the hood
What is App Engine?
Infrastructure
Platform
Software
Source: Gartner AADI Summit Dec 2009
SDK & “The Cloud”
Hardware
Networking
Operating system
Application runtime
Java, Python, Go
Static file serving
Scales dynamically
App Server
Scales dynamically
App Server
App Server
App Server
Customer: WebFilings
Disruptive multi-tenant App Engine application adopted by Fortune 500 companies.
Customer: The Royal Wedding
Peaked at 32,000 requests per second with no disruption!
>200K Apps
>100K Developers
>1.5B daily pageviews
App Engine Datastore
Schemaless, non-relational datastore built on top of Google’s Bigtable technology
Enables rapid development and scalability
High Replication
• strongly consistent
• multi datacenter
• high reliability
• consistent performance
• no data loss
How do I use HR?
• Create a new application! Just remember the rules:
• Fetch by key and ancestor queries exhibit strongly consistent behavior
• Queries without an ancestor exhibit eventually consistent behavior
Strong vs. Eventual
• Strong consistency means that immediately after the datastore tells us the data has been committed, a subsequent read will return the data written
• Eventual consistency means that some time after the datastore tells us the data has been committed, a read will return the written data; an immediate read may or may not
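The difference can be illustrated with a toy in-memory model. This is only a sketch, not the App Engine API: `ToyStore`, `getByKey`, `queryByValue` and `applyPendingIndexUpdates` are hypothetical names, and modeling the delay as a query index that is refreshed after the commit is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model, not the App Engine API.
class ToyStore {
    // Updated as part of the commit: lookups by key see writes immediately.
    private final Map<String, Integer> byKey = new HashMap<>();
    // Assumed to be refreshed some time after the commit.
    private final Map<String, Integer> queryIndex = new HashMap<>();
    private final List<String> pendingIndexUpdates = new ArrayList<>();

    void put(String key, int value) {
        byKey.put(key, value);
        pendingIndexUpdates.add(key); // the index update is deferred
    }

    // Strongly consistent: returns the value as soon as put() has returned.
    Integer getByKey(String key) {
        return byKey.get(key);
    }

    // Eventually consistent: only sees writes the index has caught up on.
    List<String> queryByValue(int value) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Integer> e : queryIndex.entrySet()) {
            if (e.getValue() == value) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }

    // Stands in for "some time later".
    void applyPendingIndexUpdates() {
        for (String key : pendingIndexUpdates) {
            queryIndex.put(key, byKey.get(key));
        }
        pendingIndexUpdates.clear();
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("item1", 123);
        System.out.println(store.getByKey("item1")); // prints 123
        System.out.println(store.queryByValue(123)); // prints []
        store.applyPendingIndexUpdates();
        System.out.println(store.queryByValue(123)); // prints [item1]
    }
}
```

In this model the read by key plays the role of a fetch by key, and `queryByValue` plays the role of a query without an ancestor.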
This is strongly consistent
import com.google.appengine.api.datastore.*;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();

Entity item = new Entity("Item");
item.setProperty("data", 123);
Key key = datastore.put(item);

// This exhibits strong consistency.
// It should return the item we just saved.
// (get() throws the checked EntityNotFoundException if the key does not exist.)
Entity result = datastore.get(key);
This is strongly consistent
// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);
This is eventually consistent
Entity item = new Entity("Item");
item.setProperty("data", 123);
datastore.put(item);

// Not an ancestor query
Query eventuallyConsistentQuery = new Query("Item");
eventuallyConsistentQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits eventual consistency.
// It will likely return an empty list.
List<Entity> results = datastore.prepare(eventuallyConsistentQuery).asList(opts);
Why?
• Reads are transactional
• On a read, we try to determine if we have the latest version of some data
• If not, we catch up the data on the node to the latest version
To understand this ...
• We need some understanding of Paxos ...
• ... which necessitates some understanding of transactions
• ... which necessitates some understanding of entity groups
Entity Groups
[Diagram: a tree of entities labeled User, Blog, Entry and Comment; one entity is marked as the entity group root]
Entity groups

// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);
Optimistic locking
[Diagram: Client A and Client B working against the Datastore]
• Client B reads data. Its current version is 11
• Client A reads data. Its current version is 11
• Client B modifies data and increments the version to 12
• Client A modifies data and increments the version to 12
• Client B tries to save data. Success!
• Client A tries to save data. The datastore version is higher than or equal to its version - FAIL
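The sequence on this slide can be replayed with a small in-memory sketch. It is not the datastore's implementation: `VersionedRecord` and its methods are hypothetical, and the check simply mirrors the rule on the slide (reject the save when the stored version is higher than or equal to the version the client is trying to write).

```java
import java.util.ConcurrentModificationException;

// Hypothetical in-memory stand-in for one versioned piece of data.
class VersionedRecord {
    private int version;
    private String data;

    VersionedRecord(int version, String data) {
        this.version = version;
        this.data = data;
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized String getData() {
        return data;
    }

    // The caller passes the version it read plus one. The save is rejected
    // if the stored version is already higher than or equal to that value.
    synchronized void save(int newVersion, String newData) {
        if (version >= newVersion) {
            throw new ConcurrentModificationException(
                    "stored version " + version + " >= " + newVersion);
        }
        version = newVersion;
        data = newData;
    }

    public static void main(String[] args) {
        VersionedRecord record = new VersionedRecord(11, "original");
        int versionSeenByB = record.getVersion(); // 11
        int versionSeenByA = record.getVersion(); // 11

        record.save(versionSeenByB + 1, "from B"); // succeeds, version is now 12
        try {
            record.save(versionSeenByA + 1, "from A"); // 12 >= 12, rejected
        } catch (ConcurrentModificationException e) {
            System.out.println("Client A must re-read and retry");
        }
    }
}
```

Nothing is locked while the clients work on their copies; the conflict is only detected at save time, which is why the losing client has to re-read and try again.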
Transactional reads

// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);

// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);

Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);

FetchOptions opts = FetchOptions.Builder.withDefaults();

// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);
Transactional reads
[Diagram: the Datastore holding one entity group, with Client A reading and Client B writing]
• Blog Entry: Version 11
• Comment (Parent: Entry): Version 11
• Comment (Parent: Entry): Version 12, still being committed
• Client B is transactionally writing data
• Client A reads data. Version 12 has not finished committing - Datastore returns version 11
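A minimal sketch of the behavior in this diagram, assuming a single entity group with at most one commit in flight. `SnapshotGroup`, `beginCommit` and `finishCommit` are hypothetical names used only for illustration; the real datastore is considerably more involved.

```java
// Hypothetical model of one entity group with a write that is mid-commit.
class SnapshotGroup {
    private int committedVersion;
    private String committedData;
    private Integer pendingVersion; // null when no commit is in flight
    private String pendingData;

    SnapshotGroup(int version, String data) {
        this.committedVersion = version;
        this.committedData = data;
    }

    // A writer starts committing a new version; readers cannot see it yet.
    void beginCommit(int version, String data) {
        pendingVersion = version;
        pendingData = data;
    }

    // The commit completes; from now on readers see the new version.
    void finishCommit() {
        if (pendingVersion == null) {
            return;
        }
        committedVersion = pendingVersion;
        committedData = pendingData;
        pendingVersion = null;
        pendingData = null;
    }

    // Reads only ever return the last fully committed version.
    int readVersion() {
        return committedVersion;
    }

    String readData() {
        return committedData;
    }

    public static void main(String[] args) {
        SnapshotGroup group = new SnapshotGroup(11, "comment v11");
        group.beginCommit(12, "comment v12");    // Client B is still committing
        System.out.println(group.readVersion()); // prints 11 (Client A's read)
        group.finishCommit();
        System.out.println(group.readVersion()); // prints 12
    }
}
```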
Paxos simplified
[Diagram: a Datastore Client talking to Node A, Node B, Node C and Node D]
Give me the newest data
Is my data up to date?
1. If the data is up to date, return it
2. If the data is NOT up to date, "catch up" the data by applying the jobs in the journal and return the latest data
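The two steps above can be sketched in plain Java. This models only the "catch up on read" idea, not Paxos itself: there is no voting between nodes, and the journal is assumed to be a single shared list that every replica can see. `CatchUpReplica` and `Job` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical replica that catches up from a journal before answering a read.
class CatchUpReplica {
    // One journal entry: "set key to value".
    static final class Job {
        final String key;
        final String value;

        Job(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, String> data = new HashMap<>();
    private int appliedJobs = 0; // how many journal entries this replica has applied

    int getAppliedJobs() {
        return appliedJobs;
    }

    String read(String key, List<Job> journal) {
        // "Is my data up to date?" It is if every journal entry has been applied.
        // If not, apply the missing jobs in order before answering.
        while (appliedJobs < journal.size()) {
            Job job = journal.get(appliedJobs);
            data.put(job.key, job.value);
            appliedJobs++;
        }
        return data.get(key);
    }

    public static void main(String[] args) {
        List<Job> journal = new ArrayList<>();
        CatchUpReplica nodeA = new CatchUpReplica();

        journal.add(new Job("entry", "version 11"));
        journal.add(new Job("entry", "version 12"));

        // The replica has applied nothing yet, so the read catches up first.
        System.out.println(nodeA.read("entry", journal)); // prints version 12
    }
}
```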
More reading
• My example was grossly oversimplified
• More details can be found here:
http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
Contradictory advice
• Entity groups must be as big as possible to cover as much related data as you can
• Entity groups must be small enough such that your write rate per entity group never goes above one write/second
Summary
• Remember the rules of strong consistency and eventual consistency
• Group your data into entity groups when possible and use ancestor queries
Questions?
• Twitter: @ikai
• Google+: plus.ikailan.com