nosql - apacheconarchive.apachecon.com/.../b_1000_eifrem_nosql.pdf · 2011-11-11 · analogously,...
TRANSCRIPT
![Page 1: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/1.jpg)
NOSQLthe past, present and future
Emil Eifrem
@emileifrem
#neo4j
1Thursday, November 10, 2011
![Page 2: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/2.jpg)
So what’s the plan?
๏The title page already told you, but out of order(*). Actual order:
•The Present
•The Past
•And The Future of NOSQL
๏Then lunch.
2
(*) Out of order? Well, the title page did inorder traversal (left node, root, right node), and since most folks aren’t graph geeks I’m going to go with the more intuitive preorder traversal (root, left, right) in the actual presentation.
2Thursday, November 10, 2011
![Page 3: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/3.jpg)
Anyway...
NOSQL: The Present
3
3Thursday, November 10, 2011
![Page 4: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/4.jpg)
First off: the name
4
๏WE ALL HATES IT, M’KAY?
4Thursday, November 10, 2011
![Page 5: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/5.jpg)
NOSQL is NOT...
๏ NO to SQL
๏ NEVER SQL
5
5Thursday, November 10, 2011
![Page 6: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/6.jpg)
Not Only SQL
6
NOSQL is simply
6Thursday, November 10, 2011
![Page 7: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/7.jpg)
7
Four trends
NOSQL - Why now?
7Thursday, November 10, 2011
![Page 8: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/8.jpg)
Source: IDC 2007
Trend 1:data set size
200740
8Thursday, November 10, 2011
![Page 9: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/9.jpg)
200740
2010
988
Trend 1:data set size
Source: IDC 20079Thursday, November 10, 2011
![Page 10: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/10.jpg)
Trend 2: Connectedness
101990
Info
rmat
ion
conn
ectiv
ity
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
10Thursday, November 10, 2011
![Page 11: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/11.jpg)
Trend 2: Connectedness
11
Text documents
1990
Info
rmat
ion
conn
ectiv
ity
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
11Thursday, November 10, 2011
![Page 12: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/12.jpg)
Trend 2: Connectedness
12
Text documents
1990
Info
rmat
ion
conn
ectiv
ity
Hypertext
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
12Thursday, November 10, 2011
![Page 13: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/13.jpg)
Trend 2: Connectedness
13
Text documents
1990
Info
rmat
ion
conn
ectiv
ity Folksonomies
Tagging
User-generated content
Wikis
RSS
Blogs
Hypertext
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
13Thursday, November 10, 2011
![Page 14: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/14.jpg)
Trend 2: Connectedness
14
Text documents
1990
Info
rmat
ion
conn
ectiv
ity Folksonomies
Tagging
User-generated content
Wikis
RSS
Blogs
Hypertext
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
Ontologies
RDF
GiantGlobal
Graph (GGG)
14Thursday, November 10, 2011
![Page 15: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/15.jpg)
Trend 2: Connectedness
15
Text documents
1990
Info
rmat
ion
conn
ectiv
ity Folksonomies
Tagging
User-generated content
Wikis
RSS
Blogs
Hypertext
2000 2010 2020
web 1.0 web 2.0 “web 3.0”
Ontologies
RDF
GiantGlobal
Graph (GGG)
15Thursday, November 10, 2011
![Page 16: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/16.jpg)
Trend 3: Semi-structure
16
๏ Individualization of content
• In the salary lists of the 1970s, all elements had exactly one job
• In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15?
๏All encompassing “entire world views”
• Store more data about each entity
๏Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
16Thursday, November 10, 2011
![Page 17: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/17.jpg)
Aside: RDBMS performance
17Data complexity
Perfo
rman
ce
Relational database
Requirement of application
17Thursday, November 10, 2011
![Page 18: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/18.jpg)
Aside: RDBMS performance
18Data complexity
Perfo
rman
ce
Relational database
Requirement of application
18Thursday, November 10, 2011
![Page 19: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/19.jpg)
Aside: RDBMS performance
19Data complexity
Perfo
rman
ce
Salary List Relational database
Requirement of application
19Thursday, November 10, 2011
![Page 20: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/20.jpg)
Aside: RDBMS performance
20Data complexity
Perfo
rman
ce
Majority ofWebapps
Salary List Relational database
Requirement of application
20Thursday, November 10, 2011
![Page 21: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/21.jpg)
Aside: RDBMS performance
21Data complexity
Perfo
rman
ce
Majority ofWebapps
Social network
Location-based services
Salary List
}custom
Relational database
Requirement of application
21Thursday, November 10, 2011
![Page 22: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/22.jpg)
Trend 4: Architecture
22
DB
Application
1980s: Application (<-- note lack of s)
22Thursday, November 10, 2011
![Page 23: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/23.jpg)
Trend 4: Architecture
23
DB
Application
1990s: Database as integration hub
Application Application
23Thursday, November 10, 2011
![Page 24: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/24.jpg)
DBDB DB
Trend 4: Architecture
24
Service
2000s: (moving towards) Decoupled serviceswith their own backend
Service Service
24Thursday, November 10, 2011
![Page 25: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/25.jpg)
Why NOSQL Now?
๏Trend 1: Size
๏Trend 2: Connectedness
๏Trend 3: Semi-structure
๏Trend 4: Architecture
25
25Thursday, November 10, 2011
![Page 26: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/26.jpg)
26
Four product categories
NOSQL
26Thursday, November 10, 2011
![Page 27: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/27.jpg)
Category 1: Key-Value stores
27
๏Lineage:
• “Dynamo: Amazon’s Highly Available Key-Value Store” (2007)
๏Data model:
•Global key-value mapping
•Think: Globally available HashMap/Dict/etc
๏Examples:
• Project Voldemort
•Riak
• 27Thursday, November 10, 2011
![Page 28: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/28.jpg)
Category II: ColumnFamily (BigTable) stores
28
๏Lineage:
• “Bigtable: A Distributed Storage System for Structured Data” (2006)
๏Data model:
•A big table, with column families
๏Examples:
•HBase
•HyperTable
•Cassandra28Thursday, November 10, 2011
![Page 29: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/29.jpg)
Category III: Document databases
29
๏Lineage:
• Lotus Notes
๏Data model:
•Collections of documents
•A document is a key-value collection
๏Examples:
•CouchDB
•MongoDB
29Thursday, November 10, 2011
![Page 30: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/30.jpg)
Document db: An example
30
๏How would we model a blogging software?
๏One stab:
•Represent each Blog as a Collection of Post documents
•Represent Comments as nested documents in the Post documents
30Thursday, November 10, 2011
![Page 31: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/31.jpg)
Document db: Creating a blog post
31
import com.mongodb.Mongo;import com.mongodb.DB;import com.mongodb.DBCollection;import com.mongodb.BasicDBObject;import com.mongodb.DBObject;// ...Mongo mongo = new Mongo( "localhost" ); // Connect to MongoDB// ...DB blogs = mongo.getDB( "blogs" ); // Access the blogs databaseDBCollection myBlog = blogs.getCollection( "Thobe’s blog" );
DBObject blogPost = new BasicDBObject();blogPost.put( "title", "ApacheCon 2011" );blogPost.put( "pub_date", new Date() );blogPost.put( "body", "Publishing a post about ApacheCon in my
MongoDB blog!" );blogPost.put( "tags", Arrays.asList( "conference", "names" ) );blogPost.put( "comments", new ArrayList() );
myBlog.insert( blogPost );
31Thursday, November 10, 2011
![Page 32: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/32.jpg)
Retrieving posts// ...import com.mongodb.DBCursor;// ...
public Object getAllPosts( String blogName ) {DBCollection blog = db.getCollection( blogName );return renderPosts( blog.find() );
}
private Object renderPosts( DBCursor cursor ) {// order by publication date (descending)cursor = cursor.sort( new BasicDBObject( "pub_date", -1 ) );// ...
}
32
32Thursday, November 10, 2011
![Page 33: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/33.jpg)
Category IV: Graph databases
33
๏Lineage:
• Euler and graph theory
๏Data model:
•Nodes with properties
•Typed relationships with properties
๏Examples:
• InfiniteGraph
•Neo4j
33Thursday, November 10, 2011
![Page 34: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/34.jpg)
Property Graph model
34
34Thursday, November 10, 2011
![Page 35: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/35.jpg)
Property Graph model
35
LIVES WITHLOVES
OWNSDRIVES
LOVES
35Thursday, November 10, 2011
![Page 36: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/36.jpg)
Property Graph model
36
LIVES WITHLOVES
OWNSDRIVES
LOVESname: “James”age: 32twitter: “@spam”
name: “Mary”age: 35
brand: “Volvo”model: “V70”
property type: “car”
36Thursday, November 10, 2011
![Page 37: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/37.jpg)
Graphs are whiteboard friendly
37Image credits: Tobias Ivarsson
An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database.With a Graph Database the model from the whiteboard is implemented directly.
37Thursday, November 10, 2011
![Page 38: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/38.jpg)
Graphs are whiteboard friendly
38
thobe
Wardrobe Strength
Joe project blog
Hello Joe
Neo4j performance analysis
Modularizing Jython
Image credits: Tobias Ivarsson
An application domain model outlined on a whiteboard or piece of paper would be translated to an ER-diagram, then normalized to fit a Relational Database.With a Graph Database the model from the whiteboard is implemented directly.
38Thursday, November 10, 2011
![Page 39: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/39.jpg)
Graph db: Creating a social graph
39
GraphDatabaseService graphDb = new EmbeddedGraphDatabase(GRAPH_STORAGE_LOCATION );
Transaction tx = graphDb.beginTx();try {
Node mrAnderson = graphDb.createNode();mrAnderson.setProperty( "name", "Thomas Anderson" );mrAnderson.setProperty( "age", 29 );
Node morpheus = graphDb.createNode();morpheus.setProperty( "name", "Morpheus" );morpheus.setProperty( "rank", "Captain" );
Relationship friendship = mrAnderson.createRelationshipTo(morpheus, SocialGraphTypes.FRIENDSHIP );
tx.success();} finally {
tx.finish();}
39Thursday, November 10, 2011
![Page 40: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/40.jpg)
Graph db: How do I know this person?Node me = ...Node you = ...
PathFinder shortestPathFinder = GraphAlgoFactory.shortestPath(Traversals.expanderForTypes(
SocialGraphTypes.FRIENDSHIP, Direction.BOTH ),/* maximum depth: */ 4 );
Path shortestPath = shortestPathFinder.findSinglePath(me, you);
for ( Node friend : shortestPath.nodes() ) {System.out.println( friend.getProperty( "name" ) );
}
40
40Thursday, November 10, 2011
![Page 41: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/41.jpg)
Four emerging NOSQL categories
๏ Key-Value stores
๏ ColumnFamiy stores
๏ Document databases
๏ Graph databases
41
41Thursday, November 10, 2011
![Page 42: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/42.jpg)
Scaling to size vs. Scaling to complexity
42
Coping with Size
Coping with Complexity
Key/Value stores
ColumnFamily stores
Document databases
Graph databases
My subjective view: > 90% of use cases
100+ billion of nodesand relationships
42Thursday, November 10, 2011
![Page 43: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/43.jpg)
43
a brief excursion into the past
NOSQL
43Thursday, November 10, 2011
![Page 44: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/44.jpg)
44
2006 2007 2008 2009 2010 2011 2012 2013 2014 ... 20192005
NOSQL
1976 1977 1978 1979 1980 1981 1982 1983 1984 ... 19891975
SQL
Graph theory researched
Bigtable paper
Dynamo paper
Cambrian explosion of
projects
Oracle announces a NOSQL db
“NOSQL” coined $1.5B
“NOSQL” market?
Only four left?
Only the “Big
Four” left
$1.5BRDBMSmarket
Oracle takes VC funding
Cambrian explosion of
vendors
Oracle starts selling
products
All databases still
in-house
Oracle incorporated
“SEQUEL” invented
RDBMS theory
researched
44Thursday, November 10, 2011
![Page 45: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/45.jpg)
45
The Future
NOSQL
45Thursday, November 10, 2011
![Page 46: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/46.jpg)
Four trends
46
๏ More ACIDity
• Mongo adding durable logging storage in 1.7
• Tunable consistency in Apache Cassandra
• Roger Bodamer
‣ uptime ( CP + average developer )>=uptime ( AP + average developer )http://www.slideshare.net/iammutex/q-con-sf10rogerbodamer
• Makes sense - why push the burden to the developer when eventually consistency is not needed in most scenarios?
46Thursday, November 10, 2011
![Page 47: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/47.jpg)
47
๏ More query languages
• In the past year, many prominent NOSQL databases have invested heavily in query languages
•Cassandra: CQL
•Couchbase: UnQL
•Neo4j: Cypher
•Mongo’s had it from the get go? <--- One reason for their popularity?
Four trends
47Thursday, November 10, 2011
![Page 48: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/48.jpg)
48
๏ More schemas?
•Analogously, why push the full burden of schema freedom to the developer?
•Over time, I believe we will see more schema-like support in most NOSQL stores
•At least in document databases and graph databases, who have the richest models
•Granted, we haven’t really seen that yet
Four trends
48Thursday, November 10, 2011
![Page 49: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/49.jpg)
49
๏ Polyglot persistence will drive middleware support
•The era of the One Size Fits All Database is over
• Ergo, any given system will typically work at runtime with multiple databases
•That’s all fine and dandy, except it’s not because it’s a pain
•This trend will demand a lot of middleware support
Four trends
49Thursday, November 10, 2011
![Page 50: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/50.jpg)
Middleware support?๏Lemme tell you the story about Mike and his restaurant site
50
50Thursday, November 10, 2011
![Page 51: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/51.jpg)
Domain & data model
51
Restaurant@Entitypublic class Restaurant {
@Id @GeneratedValue private Long id; private String name; private String city; private String state; private String zipCode;
UserAccount@Entity@Table(name = "user_account")public class UserAccount { @Id @GeneratedValue private Long id; private String userName; private String firstName; private String lastName; @Temporal(TemporalType.TIMESTAMP) private Date birthDate; @ManyToMany(cascade = CascadeType.ALL) private Set<Restaurant> favorites;
51Thursday, November 10, 2011
![Page 52: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/52.jpg)
Step 1: Buildsing a web site
52
MySQL
Tomcat
One box
52Thursday, November 10, 2011
![Page 53: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/53.jpg)
Step II: Whoa, ppl are actually using it?
53
MySQL
Tomcat
Two boxes
53Thursday, November 10, 2011
![Page 54: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/54.jpg)
Step III: That’s a LOT of pages served...
54
MySQL
Tomcat n boxesTomcat Tomcat
1 box
54Thursday, November 10, 2011
![Page 55: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/55.jpg)
Step IV: Our DB is completely overwhelmed...
55
MySQL (m)
Tomcat n boxesTomcat Tomcat
MySQL (s) n boxes
55Thursday, November 10, 2011
![Page 56: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/56.jpg)
Step V: Our DBs are STILL overwhelmed
?56
56Thursday, November 10, 2011
![Page 57: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/57.jpg)
What does the site look like now?
57
57Thursday, November 10, 2011
![Page 58: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/58.jpg)
Step V: Our DBs are STILL overwhelmed๏Turns out the problem is due to joins
๏A while back Mike introduced a new feature
•Recommend restaurants based on the user’s friends (and friends of friends)
•Whoa, recommendations aren’t just simple get and put!
•They’re killing us with joins
๏What about sharding?
๏What about SSDs?58
58Thursday, November 10, 2011
![Page 59: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/59.jpg)
Polyglot persistence (Not Only SQL)๏How did we get into this situation?
๏Well, data sets are increasingly less uniform
•Parts of Mike’s data fits well in an RDBMS
•But parts of it is graph-shaped
‣It fits much better in a graph database!
‣And I’m sure that there is or will be very key-value-esque parts of the dataset
๏Simple, just store some of it in a graph db and some of it in MySQL!But what does the code look like? 59
59Thursday, November 10, 2011
![Page 60: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/60.jpg)
We were here
60
Restaurant@Entitypublic class Restaurant {
@Id @GeneratedValue private Long id; private String name; private String city; private String state; private String zipCode;
UserAccount@Entity@Table(name = "user_account")public class UserAccount { @Id @GeneratedValue private Long id; private String userName; private String firstName; private String lastName; @Temporal(TemporalType.TIMESTAMP) private Date birthDate; @ManyToMany(cascade = CascadeType.ALL) private Set<Restaurant> favorites;
60Thursday, November 10, 2011
![Page 61: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/61.jpg)
Then we added recommendations
61
Recommendation
@RelationshipEntitypublic class Recommendation { @StartNode private UserAccount user; @EndNode private Restaurant restaurant; private int stars; private String comment;
Restaurant@Entity@NodeEntity(partial = true)public class Restaurant { @Id @GeneratedValue private Long id; private String name; private String city; private String state; private String zipCode;
UserAccount@Entity@Table(name = "user_account")@NodeEntity(partial = true)public class UserAccount { @Id @GeneratedValue private Long id; private String userName; private String firstName; private String lastName; @Temporal(TemporalType.TIMESTAMP) private Date birthDate; @ManyToMany(cascade = CascadeType.ALL) private Set<Restaurant> favorites;
@GraphProperty String nickname; @RelatedTo(type = "friends", elementClass = UserAccount.class) Set<UserAccount> friends; @RelatedToVia(type = "recommends", elementClass = Recommendation.class) Iterable<Recommendation> recommendations;
61Thursday, November 10, 2011
![Page 62: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/62.jpg)
Then we added recommendations
62
Recommendation
@RelationshipEntitypublic class Recommendation { @StartNode private UserAccount user; @EndNode private Restaurant restaurant; private int stars; private String comment;
Restaurant@Entity@NodeEntity(partial = true)public class Restaurant { @Id @GeneratedValue private Long id; private String name; private String city; private String state; private String zipCode;
UserAccount@Entity@Table(name = "user_account")@NodeEntity(partial = true)public class UserAccount { @Id @GeneratedValue private Long id; private String userName; private String firstName; private String lastName; @Temporal(TemporalType.TIMESTAMP) private Date birthDate; @ManyToMany(cascade = CascadeType.ALL) private Set<Restaurant> favorites;
@GraphProperty String nickname; @RelatedTo(type = "friends", elementClass = UserAccount.class) Set<UserAccount> friends; @RelatedToVia(type = "recommends", elementClass = Recommendation.class) Iterable<Recommendation> recommendations;
62Thursday, November 10, 2011
![Page 63: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/63.jpg)
And we’re back to talking about Middleware๏This example was from the Spring Data project
• Specifically Spring Data Neo4j
•Available at: http://www.springsource.org/spring-data/neo4j
• Spring Data Neo4j 2.0 will be released RSN, RC out NOW
๏But this really should be available in any middleware stack
•Ask Redhat / JBoss
•Ask David & the brogrammers
•Ask Sun / Oracle
•Ask Microsoft 63
63Thursday, November 10, 2011
![Page 64: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/64.jpg)
Four NOSQL trends
๏ More ACIDity
๏ More query languages
๏ More schemas
๏ More middleware support
64
64Thursday, November 10, 2011
![Page 65: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/65.jpg)
65
Conclusion
65Thursday, November 10, 2011
![Page 66: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/66.jpg)
Ait, what’s your point?๏There’s an explosion of ‘nosql’ databases out there
• Some are immature and experimental
• Some are coming out of years of battle-hardened production
๏NOSQL is about finding the right tool for the job
• Sometimes that’s an RDBMS
•But increasingly commonly a NOSQL db is the perfect fit
๏We will have heterogenous data backends in the future
•Now the rest of the stack needs to step up and help developers cope with that 66
66Thursday, November 10, 2011
![Page 67: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/67.jpg)
Not Only SQLis here
67
Key takeaway
67Thursday, November 10, 2011
![Page 68: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/68.jpg)
Not Only SQLis here to stay
68
Key takeaway
68Thursday, November 10, 2011
![Page 69: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/69.jpg)
Not Only SQLis FUN - dl & play
around now!69
Key takeaway
69Thursday, November 10, 2011
![Page 70: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/70.jpg)
70
Image credits: Lost! Sorry... :(
Questions?
70Thursday, November 10, 2011
![Page 71: NOSQL - ApacheConarchive.apachecon.com/.../B_1000_Eifrem_NoSQL.pdf · 2011-11-11 · Analogously, why push the full burden of schema freedom to the developer? • Over time, I believe](https://reader034.vdocuments.mx/reader034/viewer/2022042412/5f2c61c74047990b2c76b78b/html5/thumbnails/71.jpg)
http://neotechnology.com
71Thursday, November 10, 2011