Couchbase Live Europe 2015: Couchbase 101
©2014 Couchbase Inc.
Agenda
Where does Couchbase fit in?
Key Concepts
Operations
Cluster-wide operations
Look at a Live Cluster
Big Data = Operational + Analytic (NoSQL + Hadoop)
Online
Web/Mobile/IoT apps
Millions of customers/consumers
Offline, batch-oriented
Analytics apps
Hundreds of business analysts
Couchbase meets today’s & tomorrow’s requirements
Flexible data model
Consistent performance at scale
High availability
Easy, affordable scalability
24x365
Enterprises use Couchbase to enable key objectives
360 Degree Customer View
Profile Management
Catalog
Fraud Detection
Content Management
Internet of Things
Digital Communication
Real Time Big Data
Mobile Applications
Personalization
Couchbase can act as a Key-Value Store or a Document Store
2014-06-23-10:15am : 75F
2014-06-23-11:30am : 77F
2014-06-23-02:00pm : 82F
0001:
{firstname: “Dipti”,
lastname: “Borkar”,
language: “English”,
time_zone: “PST”,
zip: 94403
}
Key - UTF-8 string up to 250 bytes
Value - 0 bytes to 20 MB (best practice: < 1 MB)
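The size limits above can be checked on the application side before a write. The following is a minimal sketch, not part of any Couchbase SDK: the helper name `check_document` is hypothetical, and it assumes "MB" means 1024 * 1024 bytes.

```python
# Limits taken from the slide; the MB = 1024 * 1024 bytes reading is an assumption.
MAX_KEY_BYTES = 250
MAX_VALUE_BYTES = 20 * 1024 * 1024
RECOMMENDED_VALUE_BYTES = 1024 * 1024


def check_document(key: str, value: bytes) -> str:
    """Hypothetical helper: return 'ok' or 'large', or raise ValueError."""
    # The key limit is expressed in bytes, so measure the UTF-8 encoding,
    # not the number of characters.
    key_len = len(key.encode("utf-8"))
    if key_len > MAX_KEY_BYTES:
        raise ValueError(f"key is {key_len} bytes, limit is {MAX_KEY_BYTES}")
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(f"value is {len(value)} bytes, limit is {MAX_VALUE_BYTES}")
    # Values at or above the best-practice threshold are allowed but flagged.
    return "large" if len(value) >= RECOMMENDED_VALUE_BYTES else "ok"


status = check_document("0001", b'{"firstname": "Dipti"}')
```

Note that a key made of non-ASCII characters can exceed 250 bytes well before it reaches 250 characters.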
Fundamentals
Document ID or Key
• Similar to primary keys in relational databases
• Documents are partitioned based on the document ID
• ID-based document lookup is extremely fast
• Must be unique
Value
• JSON
• Binary - integers, strings, booleans
• Common binary values include serialized objects, compressed XML, compressed text, encrypted values
Metadata
• CAS Value (unique identifier for concurrency)
• TTL
• Flags (optional client library metadata)
• Revision #
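To illustrate how a CAS value supports concurrency control, here is a minimal in-memory sketch of optimistic locking. It assumes nothing about the real server or SDK: `TinyStore`, `CasMismatch`, and the use of a simple counter as the CAS value are all hypothetical stand-ins.

```python
import itertools


class CasMismatch(Exception):
    """Raised when a writer presents a stale CAS value."""


class TinyStore:
    """Hypothetical in-memory stand-in for a bucket, for illustration only."""

    def __init__(self):
        self._docs = {}                       # key -> (value, cas)
        self._next_cas = itertools.count(1)   # assumption: CAS is a simple counter

    def upsert(self, key, value):
        cas = next(self._next_cas)
        self._docs[key] = (value, cas)
        return cas

    def get(self, key):
        return self._docs[key]                # returns (value, cas)

    def replace(self, key, value, cas):
        _, current_cas = self._docs[key]
        if cas != current_cas:
            # Someone else changed the document since we read it.
            raise CasMismatch(key)
        return self.upsert(key, value)


store = TinyStore()
store.upsert("0001", {"zip": 94403})
value, cas = store.get("0001")
new_cas = store.replace("0001", {"zip": 94404}, cas)   # succeeds, CAS changes
```

A second writer still holding the old `cas` would now get `CasMismatch` and would need to re-read the document before retrying.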
Benefits of JSON
Can represent complex objects and data structures
Very simple notation, lightweight, compact, readable
The most common API return type for integrations
Facebook, Twitter, you name it, return JSON
Native to JavaScript (can be useful)
Can be inserted straight into Couchbase (faster development)
Serialization and deserialization are very fast
Storing and retrieving documents
User/application data is read from / written to Documents
Documents live in Data Buckets
Data Buckets live on Server Nodes
Server Nodes form a Couchbase Cluster
Clients
Servers
Dynamically scalable
Based on hash partitioning
Objects Serialized to JSON and Back
User Object
string uid
string firstname
string lastname
int age
array favorite_colors
string email
add() serializes the User Object to a JSON document, get() deserializes it back:
u::[email protected]
{ "uid": 123456,
"firstname": "John",
"lastname": "Smith",
"age": 22,
"favorite_colors": ["blue", "black"],
"email": "[email protected]"
}
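The round trip from object to JSON document and back can be sketched with the Python standard library. This is an illustration only: a plain dict stands in for the bucket, `to_document` and `from_document` are hypothetical helper names, the `u::` key prefix follows the slide, and the email address is a placeholder because the original is not legible in the source.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class User:
    uid: int            # the slide's JSON example shows a numeric uid
    firstname: str
    lastname: str
    age: int
    favorite_colors: List[str]
    email: str


def to_document(user: User) -> str:
    """Serialize the object to a JSON string (what an add() would store)."""
    return json.dumps(asdict(user))


def from_document(doc: str) -> User:
    """Deserialize a JSON string back into the object (what a get() would return)."""
    return User(**json.loads(doc))


bucket = {}  # assumption: a dict standing in for a Couchbase bucket
user = User(123456, "John", "Smith", 22, ["blue", "black"], "user@example.com")
key = "u::" + user.email
bucket[key] = to_document(user)          # add()
restored = from_document(bucket[key])    # get()
```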
Couchbase provides a complete Data Management solution
High availability cache
Key-value store
Document database
Embedded database
Sync management
Multi-purpose capabilities support a broad range of apps and use cases
Enterprises often start with cache, then broaden usage to other apps and use cases
What makes Couchbase unique?
Enterprises choose Couchbase for several key advantages
Performance & scalability leader: Sub-millisecond latency with high throughput; memory-centric architecture
Multi-purpose: Cache, key-value store, document database, and local/mobile database in a single platform
Simplified administration: Easy to deploy & manage; integrated Admin Console, single-click cluster expansion & rebalance
Always-on availability (24x365): Data replication across nodes, clusters, and data centers
Couchbase Server Architecture
DATA MANAGER: Query Engine, Object-managed Cache, Storage Engine
11210 / 11211: Data access ports
8092: Query API
CLUSTER MANAGER (Erlang/OTP): HTTP REST management API / Web UI; Replication, Rebalance, Shard State Manager
8091: Admin Console
Single Node Operations - Write
[Diagram: the App Server writes a Doc into the Managed Cache; the Doc is then placed on the Replication Queue (memory-to-memory replication to other node) and on the Disk Queue, which writes it to Disk]
Single Node Operations - Read
[Diagram: the App Server issues Get Doc 1; Doc 1 is in the Managed Cache and is returned directly, without touching the Disk, the Disk Queue, or the Replication Queue]
Single Node Operations – Cache Ejection
[Diagram: the App Server writes Doc 2 through Doc 6; they pass through the Managed Cache to the Disk (Disk Queue) and to the other node (Replication Queue); Doc 1 remains on Disk but is ejected from the Managed Cache]
Single Node Operations – Cache Miss
[Diagram: the App Server issues Get Doc 1; Doc 1 is no longer in the Managed Cache (which holds Doc 2 through Doc 6), so it is fetched from Disk into the Managed Cache and returned to the App Server]
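The write, read, cache ejection, and cache miss flows on the Single Node Operations slides can be sketched as a small simulation. This is a toy model under stated assumptions, not the server's implementation: the `Node` class and its method names are hypothetical, ejection is modeled as least-recently-used, and only documents already persisted to disk are eligible for ejection.

```python
from collections import OrderedDict, deque


class Node:
    """Toy model of one server node (hypothetical, for illustration)."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity
        self.cache = OrderedDict()        # managed cache, least recently used first
        self.disk = {}
        self.disk_queue = deque()
        self.replication_queue = deque()
        self.dirty = set()                # docs written but not yet persisted

    def write(self, key, doc):
        # A write lands in the managed cache first, then is queued
        # for replication to another node and for persistence to disk.
        self.cache[key] = doc
        self.cache.move_to_end(key)
        self.dirty.add(key)
        self.replication_queue.append(key)
        self.disk_queue.append(key)

    def flush(self):
        # Drain the disk queue, then eject persisted docs if over capacity.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]
            self.dirty.discard(key)
        self._eject()

    def _eject(self):
        for key in list(self.cache):
            if len(self.cache) <= self.cache_capacity:
                break
            if key not in self.dirty:     # never eject unpersisted data
                del self.cache[key]

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key], "hit"
        doc = self.disk[key]              # cache miss: fetch from disk
        self.cache[key] = doc
        self._eject()
        return doc, "miss"


node = Node(cache_capacity=5)
node.write("doc1", {"n": 1})
node.flush()
for i in range(2, 7):                     # Doc 2 .. Doc 6 push Doc 1 out
    node.write(f"doc{i}", {"n": i})
node.flush()
first_read = node.read("doc1")            # served from disk
second_read = node.read("doc1")           # now served from cache
```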
Auto sharding – Bucket and vBuckets
Each bucket has active and replica data sets
Each data set has 1024 virtual buckets (vBuckets)
Documents get logically mapped to vBuckets
Document IDs always get hashed to the same virtual bucket
Virtual buckets do not have a fixed physical server location
The mapping between virtual buckets and physical servers is called the cluster map
Each virtual bucket contains 1/1024th of the data set
[Diagram: each data bucket is divided into virtual buckets vB 1 … vB 1024]
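The mapping from document ID to vBucket can be sketched as a hash taken modulo 1024. The snippet below uses `zlib.crc32` as a stand-in hash; the exact hash function and bit manipulation used by the real client libraries may differ, and `vbucket_for` is a hypothetical name.

```python
import zlib

NUM_VBUCKETS = 1024  # from the slide: 1024 vBuckets per data set


def vbucket_for(doc_id: str) -> int:
    """Map a document ID to a vBucket number in [0, NUM_VBUCKETS)."""
    # Assumption: CRC32 modulo the vBucket count; the same ID always
    # produces the same vBucket because the hash is deterministic.
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_VBUCKETS


ids = [f"user::{n}" for n in range(10240)]
assignments = [vbucket_for(doc_id) for doc_id in ids]
```

Because the vBucket depends only on the document ID, moving a vBucket to another server never changes which vBucket a document belongs to; only the cluster map changes.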
Cluster Map
[Diagram: Hash function (KEY) maps each key to one of the logical partitions vB1, vB2, vB3, vB4, vB5, … vB1024; the Cluster Map assigns the logical partitions to physical servers A, B, C; adding a node to scale out produces a New Cluster Map]
Multi-Node Operations
[Diagram: App Server 1 and App Server 2, each with a Couchbase client library holding the cluster map, read/write/update against Servers 1 to 3. Active shards: Server 1 holds 5, 2, 9; Server 2 holds 4, 7, 8; Server 3 holds 1, 3, 6. Replica shards: Server 1 holds 4, 1, 8; Server 2 holds 6, 3, 2; Server 3 holds 7, 9, 5.]
• Docs distributed evenly across servers
• Each server stores both active and replica docs
  - For any given doc, only one server holds the active copy at a time
• Client library provides app with simple interface to database
• Cluster map tells the client library which server a doc is on
  - App never needs to know
• App reads, writes, updates docs
• Multiple app servers can access same document at same time
Adding Nodes
[Diagram: Servers 4 and 5 join Servers 1 to 3; some active and replica shards move from the existing servers to the new ones, and the client libraries on App Server 1 and App Server 2 continue to read/write/update using the cluster map]
• Two servers added with one-click operation
• Docs automatically rebalance across cluster
  - Even distribution of docs
  - Minimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
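The two rebalance goals above, even distribution and minimum movement, can be sketched as follows. This is an illustrative algorithm, not Couchbase's actual rebalance logic: `build_map` and `rebalance` are hypothetical names, and replicas are ignored for brevity.

```python
from collections import Counter


def build_map(servers, num_vbuckets=1024):
    """Initial cluster map: vBucket index -> server, assigned round-robin."""
    return [servers[vb % len(servers)] for vb in range(num_vbuckets)]


def rebalance(cluster_map, servers):
    """Return a new map that is even across `servers`, moving as few vBuckets as possible."""
    base, extra = divmod(len(cluster_map), len(servers))
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}
    kept = {s: 0 for s in servers}
    new_map = list(cluster_map)
    to_move = []
    for vb, owner in enumerate(cluster_map):
        # A vBucket stays where it is as long as its owner is under quota.
        if owner in quota and kept[owner] < quota[owner]:
            kept[owner] += 1
        else:
            to_move.append(vb)
    # Hand the surplus vBuckets to servers that are still below quota.
    receivers = [s for s in servers for _ in range(quota[s] - kept[s])]
    for vb, server in zip(to_move, receivers):
        new_map[vb] = server
    return new_map


old_map = build_map(["server1", "server2", "server3"])
new_map = rebalance(old_map, ["server1", "server2", "server3", "server4", "server5"])
moved = sum(1 for a, b in zip(old_map, new_map) if a != b)
counts = Counter(new_map)
```

In this sketch only the vBuckets that the two new servers need are moved; everything else stays on its original server.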
Managing failures
[Diagram: App Server 1 and App Server 2 (Couchbase client library with cluster map) accessing Servers 1 to 5. Active shards: Server 1 holds 5, 2, 9; Server 2 holds 4, 7, 8; Server 3 holds 1, 3, 6. Replica shards: Server 1 holds 4, 1, 8; Server 2 holds 6, 3, 2; Server 3 holds 7, 9, 5. Server 3 fails and replicas of its shards are promoted on other servers.]
• App servers accessing Shards
• Requests to Server 3 fail
• Cluster detects server failed
  o Promotes replicas of Shards to active
  o Updates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
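The failover steps can be sketched with the three-server shard layout from the Multi-Node Operations diagram. This is a simplified illustration, not the cluster manager's algorithm: `fail_over` and `build_cluster_map` are hypothetical names, and the function mutates the dictionaries it is given.

```python
# Shard layout taken from the Multi-Node Operations diagram.
active = {"server1": [5, 2, 9], "server2": [4, 7, 8], "server3": [1, 3, 6]}
replica = {"server1": [4, 1, 8], "server2": [6, 3, 2], "server3": [7, 9, 5]}


def build_cluster_map(active_shards):
    """Cluster map: shard number -> server holding the active copy."""
    return {shard: server
            for server, shards in active_shards.items()
            for shard in shards}


def fail_over(failed, active_shards, replica_shards):
    """Promote replicas of the failed server's active shards, then rebuild the map."""
    lost = active_shards.pop(failed)
    replica_shards.pop(failed, None)      # replicas on the failed server are gone too
    for shard in lost:
        for server, shards in replica_shards.items():
            if shard in shards:
                shards.remove(shard)      # the replica copy becomes the active copy
                active_shards[server].append(shard)
                break
    return build_cluster_map(active_shards)


cluster_map = fail_over("server3", active, replica)
```

After the failover, shards 7, 9, and 5 have no replica left, which is one reason a rebalance typically follows.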
XDCR: Cross Data Center Replication
Application can access both clusters (master – master)
Scales out linearly
Different from intra-cluster replication (“CP” versus “AP”)
XDCR: Flexible topologies
One-one, one-many, many-one
Differently sized and resourced clusters supported
XDCR after Write
[Diagram: the App Server writes Doc 1 to the Managed Cache of a Couchbase Server Node; Doc 1 is placed on the Replication Queue (memory-to-memory replication to other node), on the Disk Queue (to Disk), and on the XDCR Queue (new in 3.0: memory-to-memory replication to remote cluster)]
Indexing and Querying Features
Index and Query
• Distributed indexing and querying
• Secondary indexes of JSON document content
• Flexible querying of indexes
Incremental Map-Reduce
• Distributed simple real-time analytics
• Only considers changes due to updated data
Full Text Search
• Robust integration with ElasticSearch / Solr cluster
• Flexible full text search and faceted search
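The idea that incremental map-reduce "only considers changes due to updated data" can be sketched in plain Python. This is an illustration of the concept only: Couchbase views are defined differently, and `IncrementalCountView` with its count-style reduce is a hypothetical construction.

```python
from collections import Counter


class IncrementalCountView:
    """Keeps a count per emitted key and updates it one document at a time."""

    def __init__(self, map_fn):
        self.map_fn = map_fn      # doc -> iterable of emitted keys
        self.emitted = {}         # doc_id -> keys emitted by the last version
        self.reduced = Counter()  # key -> count (the reduce result)

    def update(self, doc_id, doc):
        # Undo the contribution of the previous version of this document only;
        # all other documents are left untouched.
        for key in self.emitted.pop(doc_id, []):
            self.reduced[key] -= 1
            if self.reduced[key] == 0:
                del self.reduced[key]
        keys = list(self.map_fn(doc))
        self.emitted[doc_id] = keys
        self.reduced.update(keys)


view = IncrementalCountView(lambda doc: doc.get("favorite_colors", []))
view.update("u1", {"favorite_colors": ["blue", "black"]})
view.update("u2", {"favorite_colors": ["blue"]})
before = dict(view.reduced)
view.update("u1", {"favorite_colors": ["red"]})   # only u1 is reprocessed
after = dict(view.reduced)
```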
View processing after write
[Diagram: the App Server writes Doc 1 to the Managed Cache of a Couchbase Server Node; Doc 1 is placed on the Replication Queue (to other node) and on the Disk Queue (to Disk), and is also processed by the View engine]
Couchbase Server Architecture - Views
[Diagram: App Server 1 and App Server 2 (Couchbase client library with cluster map) querying Servers 1 to 3, each of which holds active and replica shards]
• Indexing work is distributed amongst nodes
• Large data set possible
• Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
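The points above describe a scatter-gather pattern: each node indexes only its own data, and a query merges the partial results. A minimal sketch, assuming a sorted list as each node's index and a simple range query; `node_index` and `query` are hypothetical names, not the view engine's API.

```python
import heapq


def node_index(docs, field):
    """Build one node's local index: sorted (value, doc_id) rows for its own docs."""
    return sorted((doc[field], doc_id) for doc_id, doc in docs.items() if field in doc)


def query(node_indexes, low, high):
    """Ask every node for rows in [low, high] and merge the sorted partial results."""
    partials = [
        [row for row in index if low <= row[0] <= high]
        for index in node_indexes
    ]
    # Each partial result is already sorted, so a k-way merge keeps global order.
    return list(heapq.merge(*partials))


# Three nodes, each holding a different subset of the documents.
node1 = {"u1": {"age": 22}, "u2": {"age": 35}}
node2 = {"u3": {"age": 28}}
node3 = {"u4": {"age": 41}, "u5": {"age": 19}}
indexes = [node_index(docs, "age") for docs in (node1, node2, node3)]
result = query(indexes, 20, 40)
```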
Introduction to N1QL – SQL for Documents
Next generation, NoSQL query language
SQL-like: SELECT * FROM, WHERE/LIKE/JOIN/GROUP/etc., CREATE INDEX
Extended for JSON to support nested and hierarchical data structures
Support for views and newly-developed secondary indexes
Query (DQL), Manipulation (DML), Description (DDL)
ODBC/JDBC drivers in development
Built into Couchbase Server:
Single installation package
Multi-threaded, stateless query and indexing components
Leverages high-performance, high-scale Couchbase buckets
Coming in 2015, preview at query.couchbase.com
N1QL Architecture
Single node installation, services defined dynamically
Query service accesses Index and Data to formulate a response
All queries and direct access are topology-aware and dynamically scalable