couchconf_app development with indexes and queries
2
What we’ll talk about
• Working with JSON documents
• Runtime-driven patterns for
  • linking between documents
  • embedding data
  • fetching multiple documents
• Considerations with multi-datacenter deployments
4
Couchbase Server is a Document Database
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
5
Document Database
• Easy to distribute data
• Makes sense to application programmers
This synergy between the programming model and the distribution model is very valuable. It allows the database to use its knowledge of how the application programmer clusters the data to help performance across the cluster.
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
6
JSON Documents
• Maps more closely to external API
• CRUD Operations, lightweight schema
• Stored under an identifier key
{ "fields" : ["with basic types", 3.14159, true], "like" : "your favorite language" }
client.set("mydocumentid", myDocument);
mySavedDocument = client.get("mydocumentid");
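The set/get round trip above can be sketched without a server. This is a minimal illustration only: `FakeClient` is a hypothetical in-memory stand-in, not the Couchbase SDK, and it assumes documents are serialized to JSON on write and parsed on read.

```ruby
require "json"

# Hypothetical in-memory stand-in for a key/value client; not the Couchbase SDK.
class FakeClient
  def initialize
    @store = {}
  end

  # Serialize the document to JSON and store it under the key.
  def set(key, doc)
    @store[key] = JSON.generate(doc)
  end

  # Parse the stored JSON back into Ruby values (nil if the key is absent).
  def get(key)
    raw = @store[key]
    raw && JSON.parse(raw)
  end
end

client = FakeClient.new
my_document = {
  "fields" => ["with basic types", 3.14159, true],
  "like"   => "your favorite language"
}
client.set("mydocumentid", my_document)
my_saved_document = client.get("mydocumentid")
```

The saved document comes back as ordinary hashes, arrays, numbers, and booleans, which is the sense in which the JSON maps onto "your favorite language".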
7
Meta + Document Body
Metadata (identifier, expiration, etc.); the "id" is the unique ID:
{ "id" : "beer_Enlightened_Black_Ale", ... }
Document (user data, can be anything):
{ "brewery": "New Belgium Brewing", "name": "1554 Enlightened Black Ale", "abv": 5.5, "description": "Born of a flood...", "category": "Belgian and French Ale", "style": "Other Belgian-Style Ales", "updated": "2010-07-22 20:00:20" }
The "updated" value is in a “vintage” date format from an SQL dump >_<
8
No More ALTER table
• No more ALTER TABLE
• More productive developers!
• Emergent schema: the next session is about views
• This session is about working directly with documents from interactive application code
• Schema driven by code
10
Realtime Interactive CRUD
• Requirement: high-performance with high-concurrency and dynamic scale
• Key/value API is the interactive low-latency path
11
Runtime Driven Schema
• What’s in the database looks more like your code
• Thinking about throughput, latency, update and read patterns is the new data modeling
• Data flows get more attention than data at rest
• When should I split a data structure into multiple documents?
• Generally, the more useful your document is as a standalone entity, the better
• Documents that grow without bound are bad
13
Let’s Add Comments and Ratings to the Beer
• Challenge: linking items together
• Whether to grow an existing item or store independent documents
• No transactionality between documents!
{ "brewery": "New Belgium Brewing", "name": "1554 Enlightened Black Ale", "abv": 5.5, "description": "Born of a flood...", "category": "Belgian and French Ale", "style": "Other Belgian-Style Ales", "updated": "2010-07-22 20:00:20"}
I give that a 5!
good w/ burgers
tastes like college!
14
Let’s Add Comments and Ratings to the Beer
• We’ll put comments in their own document
• And add the ratings to the beer document itself
{ "type": "comment", "about_id": "beer_Enlightened_Black_Ale", "user_id": 525, "text": "tastes like college!", "updated": "2010-07-22 20:00:20"}
I give that a 5!
{ "brewery": "New Belgium Brewing", "name": "1554 Enlightened Black Ale", "abv": 5.5, "description": "Born of a flood...", "category": "Belgian and French Ale", "style": "Other Belgian-Style Ales", "updated": "2010-07-22 20:00:20", "ratings" : { "525" : 5, "30" : 4, "1044" : 2 }, "comment_ids" : [ "f1e62", "6ad8c" ] }
15
Do it: save the comment document
• Set at the id “f1e62”
• Create a new document

client.set("f1e62", {
  "type": "comment",
  "about_id": "beer_Enlightened_Black_Ale",
  "user_id": 525,
  "text": "tastes like college!",
  "updated": "2010-07-22 20:00:20"
});

Resulting metadata: { "id": "f1e62" }
16
Link between comments and beers
Beer document ("comment_ids" is the link to comments):
{ "brewery": "New Belgium Brewing", "name": "1554 Enlightened Black Ale", "abv": 5.5, "description": "Born of a flood...", "category": "Belgian and French Ale", "style": "Other Belgian-Style Ales", "updated": "2010-07-22 20:00:20", "ratings" : { "525" : 5, "30" : 4, "1044" : 2 }, "comment_ids" : [ "f1e62", "6ad8c" ] }
Comment document, stored under { "id": "f1e62" } ("about_id" is the link to beer):
{ "type": "comment", "about_id": "beer_Enlightened_Black_Ale", "user_id": 525, "text": "tastes like college!", "updated": "2010-07-22 20:00:20" }
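The two-way link can be sketched as two separate writes. This is an illustration under stated assumptions: a plain Hash stands in for the bucket, and `add_comment` is a hypothetical helper, not part of any SDK. Saving the comment before linking it is one possible ordering; because there are no transactions between documents, a failure between the two writes would then leave an unlinked comment rather than a beer pointing at a comment that does not exist.

```ruby
# Plain Hash standing in for the bucket (an assumption, not the Couchbase SDK).
store = {
  "beer_Enlightened_Black_Ale" => {
    "name" => "1554 Enlightened Black Ale",
    "comment_ids" => ["6ad8c"]
  }
}

# Hypothetical helper: save the comment first, then link it from the beer.
def add_comment(store, beer_id, comment_id, user_id, text)
  store[comment_id] = {
    "type" => "comment",
    "about_id" => beer_id,                      # link to beer
    "user_id" => user_id,
    "text" => text
  }
  store[beer_id]["comment_ids"] << comment_id   # link to comment
end

add_comment(store, "beer_Enlightened_Black_Ale", "f1e62", 525, "tastes like college!")
```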
17
How to: look up comments from a beer
• SERIALIZED LOOP
beer = client.get("beer:A_cold_one");
comments = [];
beer.comment_ids.each { |id| comments.push(client.get(id)); }

• FAST MULTI-KEY LOOKUP
beer = client.get("beer:A_cold_one");
comments = client.multiGet(beer.comment_ids)

• ASYNC VIEW QUERY
comments = client.query("myapp", "by_comment_on", {:key => "beer:A_cold_one"});

Figure: http://www.ibm.com/developerworks/webservices/library/ws-sdoarch/
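The difference between the serialized loop and the multi-key lookup is mostly a matter of round trips. The sketch below counts them with a hypothetical in-memory `CountingClient` (not the Couchbase SDK). It assumes a batch costs a single round trip, which is a simplification: in a real cluster a multi-key fetch may contact several nodes.

```ruby
# Hypothetical in-memory client that counts simulated network round trips.
class CountingClient
  attr_reader :round_trips

  def initialize(data)
    @data = data
    @round_trips = 0
  end

  # One round trip per key.
  def get(key)
    @round_trips += 1
    @data[key]
  end

  # One round trip for the whole batch of keys (simplifying assumption).
  def multi_get(keys)
    @round_trips += 1
    keys.map { |k| @data[k] }
  end
end

data = {
  "beer:A_cold_one" => { "comment_ids" => ["f1e62", "6ad8c"] },
  "f1e62" => { "text" => "tastes like college!" },
  "6ad8c" => { "text" => "good w/ burgers" }
}

# Serialized loop: one get for the beer, then one get per comment.
looping = CountingClient.new(data)
beer = looping.get("beer:A_cold_one")
loop_comments = beer["comment_ids"].map { |id| looping.get(id) }

# Multi-key lookup: one get for the beer, then one batch for all comments.
batched = CountingClient.new(data)
beer = batched.get("beer:A_cold_one")
batch_comments = batched.multi_get(beer["comment_ids"])
```

Both approaches return the same comments; the loop's cost grows with the number of comment ids, while the batched version stays at two requests in this model.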
18
How to: add a rating to a beer
• Other users are rating beers too, so we use a CAS update
  – we don’t want to accidentally overwrite another user’s rating that is being saved at the same time as ours
• Best practice is to use a lambda so the client can retry

cb.cas("mykey") do |doc|
  doc["ratings"][current_user.id] = my_rating
  doc
end
[Diagram: Actor 1 and Actor 2 both update the same document on Couchbase Server. One write succeeds; the other gets a CAS mismatch and retries.]
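What the retry does can be sketched with an in-memory store. `CasStore` and its methods are hypothetical names for illustration, not the Couchbase SDK; the assumption is that every write bumps a per-document CAS value and that a write carrying a stale value is rejected.

```ruby
# Hypothetical in-memory store with compare-and-swap; not the Couchbase SDK.
class CasStore
  class Mismatch < StandardError; end

  def initialize
    @docs = {}   # key => [cas_value, document]
  end

  def set(key, doc)
    cas = (@docs[key]&.first || 0) + 1
    @docs[key] = [cas, doc]
  end

  # Return the current CAS value and a deep copy of the document.
  def get_with_cas(key)
    cas, doc = @docs.fetch(key)
    [cas, Marshal.load(Marshal.dump(doc))]
  end

  # Write only if nobody changed the document since expected_cas was read.
  def replace(key, doc, expected_cas)
    raise Mismatch unless @docs.fetch(key).first == expected_cas
    @docs[key] = [expected_cas + 1, doc]
  end

  # Read, apply the block, write back; re-run the block on a CAS mismatch.
  def cas_update(key, max_tries: 5)
    max_tries.times do
      cas, doc = get_with_cas(key)
      begin
        return replace(key, yield(doc), cas)
      rescue Mismatch
        # Someone else wrote first: loop around, re-read, and try again.
      end
    end
    raise Mismatch, "gave up after #{max_tries} tries"
  end
end

store = CasStore.new
store.set("beer", { "ratings" => { "30" => 4 } })

attempts = 0
store.cas_update("beer") do |doc|
  attempts += 1
  if attempts == 1
    # Simulate Actor 2 saving a rating between our read and our write.
    cas, other = store.get_with_cas("beer")
    other["ratings"]["1044"] = 2
    store.replace("beer", other, cas)
  end
  doc["ratings"]["525"] = 5
  doc
end
```

Because the update is passed as a block, the retry re-applies it to the freshly read document, so both ratings survive. That is why the block form is preferable to computing the new document once up front.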
19
Object Graph With Shared Interactive Updates
• Challenge: higher level data structures
• Objects shared across multiple users
• Mixed object sets (updating some private and some shared objects)

Figure: http://www.ibm.com/developerworks/webservices/library/ws-sdoarch/
20
Get With Lock (GETL)
• Often referred to as “GETL”
• Pessimistic concurrency control
• Locks have a short TTL
• Locks released with CAS operations
• Useful when working with object graphs
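The bullets above can be sketched as a small in-memory model. `LockingStore` is hypothetical and not the Couchbase SDK; it assumes a lock is identified by a token handed out at read time, expires after its TTL, and is released by a write that presents the token. It simplifies the real behaviour by not checking whether the document changed after an expired lock.

```ruby
# Hypothetical in-memory sketch of get-with-lock; not the Couchbase SDK.
class LockingStore
  class Locked < StandardError; end

  # The clock is injectable so the TTL can be exercised without sleeping.
  def initialize(clock: -> { Time.now.to_f })
    @clock = clock
    @docs = {}    # key => document
    @locks = {}   # key => { token:, expires_at: }
    @next_token = 0
  end

  def set(key, doc)
    @docs[key] = doc
  end

  # Return the document plus a token, and lock the key for ttl seconds.
  def get_and_lock(key, ttl)
    raise Locked if locked?(key)
    @next_token += 1
    @locks[key] = { token: @next_token, expires_at: @clock.call + ttl }
    [@docs[key], @next_token]
  end

  # Writing with the matching token stores the document and releases the lock.
  def replace(key, doc, token)
    raise Locked if locked?(key) && @locks[key][:token] != token
    @docs[key] = doc
    @locks.delete(key)
  end

  private

  # A lock counts only until its TTL runs out.
  def locked?(key)
    lock = @locks[key]
    !lock.nil? && lock[:expires_at] > @clock.call
  end
end

now = 0.0
store = LockingStore.new(clock: -> { now })
store.set("graph", { "nodes" => 1 })

doc, token = store.get_and_lock("graph", 15)
blocked = begin
  store.get_and_lock("graph", 15)   # a second actor is turned away
  false
rescue LockingStore::Locked
  true
end

store.replace("graph", doc.merge("nodes" => 2), token)  # write releases the lock
store.get_and_lock("graph", 15)                         # lock it again
now += 16                                               # let the TTL expire
doc_after_expiry, = store.get_and_lock("graph", 15)     # succeeds again
```

The short TTL is what keeps a crashed client from holding a lock indefinitely: once it expires, the next actor can lock the document again.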
22
You Want Datacenter Affinity
• No ACID across documents, need resilient code
• Locks and counters should be per-datacenter
• Impacts operations like INCR, DECR, and CAS
[Diagram: US, Europe, and Asia data centers, linked by replication.]
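One way to read "counters should be per-datacenter" is to give each datacenter its own counter key and sum the keys when a total is needed. The sketch below is an assumption about how that could look, not a prescribed pattern: plain Hashes stand in for each cluster, the `name::datacenter` key scheme is made up for illustration, and `replicate` is a hypothetical stand-in for cross-datacenter replication.

```ruby
# Plain Hashes stand in for each datacenter's cluster (an assumption).
# Each datacenter increments only the counter key it owns.
def incr_local(cluster, dc, name)
  key = "#{name}::#{dc}"
  cluster[key] = cluster.fetch(key, 0) + 1
end

# Hypothetical replication step: copy every key a datacenter owns to the others.
def replicate(clusters)
  clusters.each do |dc, cluster|
    owned = cluster.select { |key, _| key.end_with?("::#{dc}") }
    clusters.each_value { |other| other.merge!(owned) }
  end
end

# The global total is the sum of the per-datacenter counters.
def total(cluster, name)
  cluster.select { |key, _| key.start_with?("#{name}::") }.values.sum
end

clusters = { "us" => {}, "eu" => {}, "asia" => {} }
3.times { incr_local(clusters["us"], "us", "beer_views") }
2.times { incr_local(clusters["eu"], "eu", "beer_views") }
replicate(clusters)

us_total = total(clusters["us"], "beer_views")
asia_total = total(clusters["asia"], "beer_views")
```

Since no two datacenters ever write the same key, replication never has to reconcile two concurrent increments of one counter, which is the conflict the slide warns about for INCR, DECR, and CAS.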
23
Conclusion and Next Session Summary
• JSON documents
• Runtime-driven schema
NEXT UP: Views
• See inside the data
• Practical patterns