Download - CCB12 Navigating the NoSQL Ladnscape
![Page 1: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/1.jpg)
1
Navigating the NoSQL Landscape
Tugdual “tug” GrallTechnology Evangelist@tgrall
![Page 2: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/2.jpg)
2
What we’ll talk about
• Why RDBMS are not enough?
• What are the different NoSQL taxonomies?
• Which “NoSQL” is right for me?
![Page 3: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/3.jpg)
3
Growth is the New Reality
• Instagram gained nearly 1 million users overnight when they expanded to Android
![Page 4: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/4.jpg)
4
Does it work with RDBMS backend?
Application Scales OutJust add more commodity web servers
Database Scales UpGet a bigger, more complex server
Note – Relational database technology is great for what it is great for, but it is not great for this.
![Page 5: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/5.jpg)
5
Some alternative to scale out your RDBMS
Scale out your RDBMS• Run many SQL Servers• Data are sharded
(most of the time using client code)• Memcached for faster response time
![Page 6: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/6.jpg)
6
Scale Out with RDBMS
Is this a good approach to scale?• Lot of components to deploy• Scale by Hand
– Caching– Sharding/Replication
Learn From Others This Scenario Costs Time and Money. Scaling SQL is potentially disastrous when going Viral: Very risky time for major code changes and migrations...You have no Time when skyrocketing up!
![Page 7: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/7.jpg)
7
Lacking market solutions, users forced to invent
DynamoOctober 2007
CassandraAugust 2008
VoldemortFebruary 2009
BigtableNovember 2006
Very few organizations want to (fewer can) build and maintain database software technology.But every organization building interactive web applications needs this technology.
• No schema required before inserting data• No schema change required to change data format• Auto-sharding without application participation• Distributed queries• Integrated main memory caching• Data synchronization (mobile, multi-datacenter)
![Page 8: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/8.jpg)
8
Other
All of these
Costs
High latency/low performance
Inability to scale out data
Lack of flexibility/rigid schemas
11%
12%
16%
29%
35%
49%
Source: Couchbase NoSQL Survey, December 2011, n=1351
What is the biggest data management problem driving your use of NoSQL in the coming year?
Survey: Schema inflexibility #1 adoption driver
![Page 9: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/9.jpg)
9
NoSQL database matches application logic tier architectureData layer now scales with linear cost and constant performance.
Application Scales OutJust add more commodity web servers
Database Scales OutJust add more commodity data servers
Scaling out flattens the cost and performance curves.
NoSQL Database Servers
![Page 10: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/10.jpg)
10
NOSQL TAXONOMY
![Page 11: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/11.jpg)
11
The Key-Value Store – the foundation of NoSQL
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
![Page 12: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/12.jpg)
12
Memcached – the NoSQL precursor
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Memcached
In-memory onlyLimited set of operationsBlob Storage: Set, Add, Replace, CASRetrieval: GetStructured Data: Append, Increment
“Simple and fast.”
Challenges: cold cache, disruptive elasticity
![Page 13: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/13.jpg)
13
NoSQL catalog
Key-Value
Memcached
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 14: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/14.jpg)
14
Redis – More “Structured Data” commands
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
“Data Structures”BlobListSet
Hash…
Redis
Disk Persistence (eventual consistency on the disk)Vast set of operationsBlob Storage: Set, Add, Replace, CASRetrieval: Get, Pub-SubStructured Data: Strings, Hashes, Lists, Sets,Sorted listsExample operations for a SetAdd, count, subtract sets, intersection, is member, atomic move from one set to another
![Page 15: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/15.jpg)
15
NoSQL catalog
Key-Value
Memcached Redis
Data Structure
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 16: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/16.jpg)
16
Membase – From key-value cache to database
Disk-based with built-in memcached cacheCache refill on restartMemcached compatible (drop in replacement)Highly-available (data replication)Add or remove capacity to live cluster
“Simple, fast, elastic.”
MembaseKey
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
![Page 17: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/17.jpg)
17
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 18: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/18.jpg)
18
Couchbase – document-oriented database
Key
{ “string” : “string”, “string” : value, “string” : { “string” : “string”, “string” : value }, “string” : [ array ]}
Auto-shardingDisk-based with built-in memcached cacheCache refill on restartMemcached compatible (drop in replace)Highly-available (data replication)Add or remove capacity to live cluster
When values are JSON objects (“documents”):Create indices, views and query against the views
JSONOBJECT
(“DOCUMENT”)
Couchbase
![Page 19: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/19.jpg)
19
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure Document
Couchbase
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 20: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/20.jpg)
20
MongoDB – Document-oriented database
Key
{ “string” : “string”, “string” : value, “string” : { “string” : “string”, “string” : value }, “string” : [ array ]}
Disk-based with in-memory “caching”BSON (“binary JSON”) format and wire protocolMaster-slave replicationAuto-shardingValues are BSON objectsSupports ad hoc queries – best when indexed
BSONOBJECT
(“DOCUMENT”)
MongoDB
![Page 21: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/21.jpg)
21
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure Document
MongoDB
Couchbase
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 22: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/22.jpg)
22
Cassandra – Column overlays
Disk-based systemClustered External caching required for low-latency reads“Columns” are overlaid on the dataNot all rows must have all columnsSupports efficient queries on columnsRestart required when adding columns
CassandraKey
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Column 1
Column 2
Column 3 (not present)
![Page 23: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/23.jpg)
23
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure Document Column
MongoDB
Couchbase Cassandra
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
![Page 24: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/24.jpg)
24
Neo4j – Graph database
Disk-based systemExternal caching required for low-latency readsNodes, relationships and pathsProperties on nodesDelete, Insert, Traverse, etc.
Neo4j
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
Key
101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101
101100101000100010011101101100101000100010011101101100101000100010011101
OpaqueBinaryValue
![Page 25: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/25.jpg)
25
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure Document Column Graph
MongoDB
Couchbase Cassandra
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
Neo4j
![Page 26: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/26.jpg)
26
NoSQL catalog
Key-Value
Memcached
Membase
Redis
Data Structure Document Column Graph
MongoDB
Couchbase Cassandra
Cach
e(m
emor
y on
ly)
Dat
abas
e(m
emor
y/di
sk)
Neo4j
HBase InfiniteGraph
Coherence
![Page 27: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/27.jpg)
27
![Page 28: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/28.jpg)
28
What about Hadoop?
![Page 29: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/29.jpg)
29
Hadoop : Big Data Swiss Army Knife
• Oozie: Workflow, coordination• Sqoop : Data connector to import/export data• Hive : SQL-Like interface• Pig : High level programming language• Mahout : Machine learning library• Whirr : Hadoop management tools for cloud services• Flume : Aggregator• Map Reduce : Framework to process large volume of data• HBase : Key Value data store• Zookeeper : Centralized configuration management• HDFS : Distributed file system
![Page 30: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/30.jpg)
30
So what? Hadoop & Couchbase
click streamevents
profiles, campaigns
profiles, real time campaign statistics
40 milliseconds to respond with the decision.
2
3
1
![Page 31: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/31.jpg)
31
WHICH ONE IS RIGHT FOR ME ?
![Page 32: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/32.jpg)
32
Other
All of these
Costs
High latency/low performance
Inability to scale out data
Lack of flexibility/rigid schemas
11%
12%
16%
29%
35%
49%
Source: Couchbase NoSQL Survey, December 2011, n=1351
What is the biggest data management problem driving your use of NoSQL in the coming year?
Survey: Schema inflexibility #1 adoption driver
![Page 33: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/33.jpg)
33
Lack of Flexibility / Rigid Schema
• Aggregate Data Models (Martin Fowler)– Flexible Data Structure– Optimized Access– Easy to distribute data
o::1001
{uid: ji22jd,customer: Ann,line_items: [ { sku: 0321293533, quan: 3, unit_price: 48.0 }, { sku: 0321601912, quan: 1, unit_price: 39.0 }, { sku: 0131495054, quan: 1, unit_price: 51.0 } ],payment: { type: Amex, expiry: 04/2001,
last5: 12345 }}
http://martinfowler.com/bliki/AggregateOrientedDatabase.html
![Page 34: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/34.jpg)
34
Use Cases
Key Value • Session Management• User Profile/Preferences• Shopping Cart
Document • Event Logging• Content Management • Web Analytics• E-Commerce Application
Columns • Event Logging• Content Management• Counters
Graph • Connected Data / Social Networks• Routing, Dispatch• Recommendations based on Social Graph
Thanks to Martin Fowler
![Page 35: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/35.jpg)
35
Production Environment
US DATA CENTER
EMEA DC
APAC DC
![Page 36: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/36.jpg)
36
Scale out your data
• Modify cluster topology should be simple– Add, Remove, Configure Nodes on a running system
• What is the impact of topology changes?– Sharding, Caching of the data– Availability of the service during cluster changes
• More hardware = More failures– Availability, reliability of the system: failover support
![Page 37: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/37.jpg)
37
Add Nodes
Two servers added to cluster One-click operation
Docs automatically rebalanced across cluster Even distribution of
docs Minimum doc
movement Cluster map updated
App database calls now distributed over larger # of servers
User Configured Replica Count = 1
Read/Write/Update Read/Write/Update
Doc 7
Doc 9
Doc 3
Active Docs
Replica Docs
Doc 6
COUCHBASE CLIENT LIBRARY
CLUSTER MAP
APP SERVER 1
COUCHBASE CLIENT LIBRARY
CLUSTER MAP
APP SERVER 2
Doc 4
Doc 2
Doc 5
SERVER 1
Doc 6
Doc 4
SERVER 2
Doc 7
Doc 1
SERVER 3
Doc 3
Doc 9
Doc 7
Doc 8 Doc 6
Doc 3
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
DOC
Doc 9
Doc 5
DOC
DOC
DOC
Doc 1
Doc 8 Doc 2
Replica Docs Replica Docs Replica Docs
Active Docs Active Docs Active Docs
SERVER 4 SERVER 5
Active Docs Active Docs
Replica Docs Replica Docs
COUCHBASE SERVER CLUSTER
![Page 38: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/38.jpg)
38
Fail Over Node
App servers happily accessing docs on Server 3
Server fails App server requests to server 3 fail Cluster detects server has failed
Promotes replicas of docs to active Updates cluster map
App server requests for docs now go to appropriate server
Typically rebalance would follow
User Configured Replica Count = 1
Doc 7
Doc 9
Doc 3
Active Docs
Replica Docs
Doc 6
COUCHBASE CLIENT LIBRARY
CLUSTER MAP
APP SERVER 1
COUCHBASE CLIENT LIBRARY
CLUSTER MAP
APP SERVER 2
Doc 4
Doc 2
Doc 5
SERVER 1
Doc 6
Doc 4
SERVER 2
Doc 7
Doc 1
SERVER 3
Doc 3
Doc 9
Doc 7 Doc 8
Doc 6
Doc 3
DOC
DOC
DOCDOC
DOC
DOC
DOC DOC
DOC
DOC
DOC DOC
DOC
DOC
DOC
Doc 9
Doc 5DOC
DOC
DOC
Doc 1
Doc 8
Doc 2
Replica Docs Replica Docs Replica Docs
Active Docs Active Docs Active Docs
SERVER 4 SERVER 5
Active Docs Active Docs
Replica Docs Replica Docs
COUCHBASE SERVER CLUSTER
![Page 39: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/39.jpg)
39
Performance
• What is my working set?• How cache is working?
– Put your data in RAM
• How to design my data model?– Aggregate Model– Easy to change
![Page 40: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/40.jpg)
40
![Page 41: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/41.jpg)
41
Management and Monitoring
• Do not forget about Operations!– Service Reliability Engineering Team will thank you!
• Manage your cluster easily:– Command Line, Administration Console to change cluster
toplogy
• Monitor “your NoSQL”– Analyze the overall status of your cluster– View and fix bottlenecks
![Page 42: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/42.jpg)
42
Conclusion
• One Size Does Not Fit All
• Overview of the the NoSQL types
• Choose the right solution– Developer Productivity– Large Scale Data
![Page 43: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/43.jpg)
43
QUESTIONS?
![Page 44: CCB12 Navigating the NoSQL Ladnscape](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554f5066b4c905423f8b51d5/html5/thumbnails/44.jpg)
44
Couchbase automatically distributes data across commodity servers. Built-in caching enables apps to read and write data with sub-millisecond latency. And with no schema to manage,
Couchbase effortlessly accommodates changing data management requirements.
Simple. Fast. Elastic. NoSQL.