Introduction to NoSQL – Couchbase 4.5 and Couchbase Mobile 1.2
Cécile Le Pape, Solutions Architect
High-Availability Caching
[Diagram: user requests reach the application layer, which sends read-write requests to the Couchbase distributed cache and sends cache misses and write requests to the RDBMS]
Use Case 1

Data cached in Couchbase:
• Application objects
• Popular search query results
• Session information
• Heavily accessed web landing pages

Application characteristics:
• Speed up RDBMS
• Consistently low response times for document / key lookups
• High availability, 24x7x365
• Replacement for entire caching tier
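
The caching flow on this slide (reads served from the cache, misses falling through to the RDBMS) can be sketched as a cache-aside pattern. This is a minimal illustration only: plain dictionaries stand in for both Couchbase and the RDBMS, and the class and key names are hypothetical, not part of any Couchbase SDK.

```python
class CacheAside:
    """Cache-aside sketch; dicts stand in for the cache and the RDBMS."""

    def __init__(self, database):
        self.database = database      # stand-in for the RDBMS (hypothetical)
        self.cache = {}               # stand-in for the distributed cache
        self.misses = 0

    def read(self, key):
        if key in self.cache:         # cache hit: the RDBMS is not touched
            return self.cache[key]
        self.misses += 1              # cache miss: fall back to the RDBMS
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value   # populate the cache for later reads
        return value

    def write(self, key, value):
        self.database[key] = value    # writes go to the system of record
        self.cache[key] = value       # keep the cached copy in step


store = CacheAside({"landing:home": "<html>home</html>"})
first = store.read("landing:home")    # miss, loaded from the RDBMS stand-in
second = store.read("landing:home")   # hit, served from the cache
```

Only the first read reaches the backing store; repeated reads of a popular key are served from memory, which is the point of the caching tier.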
[Figure: mock search results page listing popular results with their share of clicks]
Use Case 2
Session Store

Application characteristics:
• Extremely fast access to session data using unique session ID
• Easy scalability to handle fast-growing number of users and user-generated data
• Always-on functionality for global user base

Data stored in Couchbase:
• Session values or cookies (stored as key-value pairs)
• Examples include items in a shopping cart, flights selected, search results, etc.
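
A session store of this kind is essentially a key-value lookup by session ID. The sketch below is an in-memory illustration, not Couchbase code. Expiry of idle sessions is an assumption added here (the slide does not mention it), and the names SessionStore, ttl_seconds and the injected clock are hypothetical.

```python
import time


class SessionStore:
    """In-memory stand-in for a key-value session store."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds        # assumed idle lifetime of a session
        self.clock = clock            # injectable so expiry can be tested
        self.items = {}               # session_id -> (expiry_time, data)

    def put(self, session_id, data):
        self.items[session_id] = (self.clock() + self.ttl, data)

    def get(self, session_id):
        entry = self.items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if self.clock() >= expires_at:    # expired: drop it and report a miss
            del self.items[session_id]
            return None
        return data


fake_now = [0.0]                          # controllable clock for the example
sessions = SessionStore(ttl_seconds=1800, clock=lambda: fake_now[0])
sessions.put("sess-42", {"cart": ["sku-1", "sku-2"]})
```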
Use Case 3
Globally Distributed User Profile Store
Application characteristics:
• Extremely fast access to individual profiles
• Always-online system, as multiple applications access user profiles
• Flexibility to add and update user attributes
• Easy scalability to handle fast-growing number of users

Data stored in Couchbase:
• User profile with unique ID
• User settings / preferences
• User's network
• User application state
Use Case 4
Data Aggregation

Application characteristics:
• Flexibility to store any kind of content
• Flexibility to handle schema changes
• Full-text search across data set
• High-speed data ingestion
• Scales horizontally as more content gets added to the system

Data stored in Couchbase:
• Social media feeds: Twitter, Facebook, LinkedIn
• Blogs, news, press articles
• Data service feeds: Hoovers, Reuters
• Data from other systems
Use Case 5
Content and Metadata Store
Sample metadata tags: Nature, Field, Summer, Farm, Sky, Environment, Landscaped, Grass, Green, Blue, Oilseed, Rape, Agriculture, Scenics, Land, Spring, Non-Urban Scene, Environmental, Conservation, Sun, Meadow, Horizon, Season, Cloud, Landscapes, Travel Locations, Pasture, Cultivated Land, Stratosphere, Cloudy Day, Oilseed Rape, Rural Scene, Vibrant Color, No People, Beauty In Nature, Gold, Color Image, Beauty, Idyllic, Multicolored, Yellow, Colors, Cloudscape, Outdoors, Plant, Sunlight, Horizon Over Land
Application characteristics:
• Flexibility to store any kind of content
• Fast access to content metadata (most accessed objects) and content
• Full-text search across data set
• Scales horizontally as more content gets added to the system

Data stored in NoSQL:
• Content metadata
• Content: articles, text
• Landing pages for website
• Digital content: eBooks, magazines, research material
Macro Trends Driving NoSQL Technology
More Data + More Users + Interactive Apps → NoSQL
Why The Digital Economy Needed A New Database Solution
Question: What are the biggest problems with relational databases that are driving adoption of NoSQL?
[Survey chart. Response categories: lack of flexibility / rigid schemas; inability to scale out; performance challenges; cost; all of the above. Values shown on the chart: 49%, 69%, 50%, 47%, 44%, 35%, 29%, 16%, 12%]
Agile Development
Hotel Descriptions
Reviews
User Profiles
Reviews point to users
Hotels point to reviews
{ "ID": 1, "NAME": "Fairmont San Francisco",…}
{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…",…}
{ "REVIEW_ID": 2, "REVIEW": "Nice, but …",…}
{ "USER_ID": 1, "DISPLAY": "Ted’s Trip…",…}
{ "USER_ID": 2, "DISPLAY": "WhatWhat …",…}
Must support flexible schemas to make development agile
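
As a rough illustration of what a flexible schema means for application code: two documents of the same kind can carry different fields, and the reader supplies defaults instead of requiring a schema migration. The second hotel and its extra fields are invented for this sketch.

```python
hotels = [
    {"id": 1, "name": "Fairmont San Francisco"},
    # Hypothetical document that gained new attributes later on:
    {"id": 2, "name": "Hypothetical Inn", "pool": True, "rating": 4.5},
]


def describe(doc):
    # Fields missing from older documents fall back to a default value.
    rating = doc.get("rating", "unrated")
    return f'{doc["name"]} ({rating})'


lines = [describe(doc) for doc in hotels]
```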
Must Dynamically Scale Apps to Support Millions of Users
Scalability
RDBMS Scales Up: Get a bigger, more complex server
[Chart: system cost and application performance against users; won't scale beyond a certain point]
Application Scales Out: Just add more commodity web servers
[Chart: system cost and application performance against users]
Consumers & Employees Demand Highly Responsive Apps
Performance
[Diagram labels: application layer, cache, RDBMS, Couchbase]
Apps Must Now Stay Online 24 x 365
Availability
[Figure: JSON documents available 24/7, contrasted with a mock error page at http://www.mypage.com reading "Well, this is embarrassing. We are having some difficulties and we apologize for the inconvenience."]
NoSQL Considerations

Accessing data
• No standards exist yet
• Typically via SDKs or over HTTP
• Check if the programming language of your choice is supported

Consistency
– Consistent only at the document level
– Most document stores currently don’t support multi-document transactions
– Analyze your application needs

Availability
– Each node stores active and replica data (Couchbase)
– Each node is either a master or slave (MongoDB)

Operations
– Monitoring the system
– Backup and restore the system
– Upgrades and maintenance
– Support

Ease of Scaling
– Ease of adding and reducing capacity
– Single node type
– App availability on topology changes

Indexing and Querying
– Secondary indexes
– Aggregates, grouping
– Basic querying / ad hoc querying
Where is NoSQL a good fit?
• 3rd-party or user-defined structure (Twitter feeds)
• Support for unlimited data growth (viral apps)
• Data with non-homogeneous structure
• Need to quickly and often change data structure
• Variable-length documents
• Sparse data records
• Hierarchical data

Where is NoSQL a good fit?
• Low latency critical (e.g. 1 millisecond)
• High throughput (e.g. 200,000 ops / sec)
• Large number of users
• Unknown demand with sudden growth of users/data
• Predominantly direct document access
• Read / mixed / write heavy workloads
©2014 Couchbase, Inc.
©2015 Couchbase Inc.
Why Couchbase
Key Capabilities
• Multiple data models
• N1QL - SQL-like query language
• Multiple indexes
• SDKs, ODBC / JDBC drivers and frameworks
• Push-button scalability
• Consistent high performance
• Always on 24x7 with HA - DR
• Easy administration with Web UI, REST API and CLI
Combines the Flexibility of JSON, the Power of SQL and the Scale of NoSQL
Couchbase Server Defined
Couchbase Server is the first NoSQL database that enables you to develop with agility and operate at any scale.
• Managed Cache
• Key-Value Store
• Document Database
• Embedded Database
• Sync Management
Digital Economy customers
• 1 billion+ user profiles
• Replication across 7 data centers
• 740 server nodes
• 300K reads, 20K writes / sec, sustained
• 50M unique monthly visitors
• 2.5B monthly page views
• Replaced SQL Server and MongoDB
• 12TB data
• 16M entries every five minutes
• 400K ops/sec on four nodes
• 1 billion+ documents
• 10TB+ data
• Sub-200ms response time
Develop With Agility
The Power Of The Flexible JSON Schema
Ability to store data in multiple ways
• Denormalized single document, as opposed to normalizing data across multiple tables
• Dynamic schema to add new values when needed
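
To make the denormalization point concrete, the sketch below folds rows from two normalized collections into a single hotel document with its reviews embedded. The hotel_id field and the review texts are invented for illustration; how much to embed is a modelling decision that the slide leaves open.

```python
# Normalized form: hotels and reviews held separately, linked by hotel_id
# (hotel_id is a hypothetical foreign key for this sketch).
hotels = {1: {"name": "Fairmont San Francisco"}}
reviews = [
    {"review_id": 1, "hotel_id": 1, "review": "Great stay"},     # invented text
    {"review_id": 2, "hotel_id": 1, "review": "Good location"},  # invented text
]


def denormalize(hotel_id):
    """Build one self-contained document with the reviews embedded."""
    doc = dict(hotels[hotel_id], id=hotel_id)
    doc["reviews"] = [
        {"review_id": r["review_id"], "review": r["review"]}
        for r in reviews
        if r["hotel_id"] == hotel_id
    ]
    return doc


hotel_doc = denormalize(1)
```

A single read by document ID then returns the hotel together with its reviews, with no join at read time.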
Accessing Data From Couchbase
Key access using Document ID
• Operations are extremely fast with consistent low latency
• Reads and writes are evenly distributed across Data Service nodes
• Data is cached in built-in Managed Caching layer and stored in persistent storage layer

Queries using N1QL
• SQL-like: SELECT * FROM WHERE/LIKE/GROUP/etc.
• JOINs
• Powerful extensions (NEST, UNNEST) for JSON to support nested and hierarchical data structures
• Multiple access paths: Views and global secondary indexes
• ODBC/JDBC drivers available

Views using static queries
• Pre-computed complex Map-Reduce queries
• Incrementally updated to power analytics, reporting and dashboards
• Strong for complex custom aggregations
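
The idea of a pre-computed, incrementally updated Map-Reduce view can be sketched in a few lines. Couchbase views themselves are defined with JavaScript map and reduce functions; the Python below only mimics the behaviour, and the SumView class and the document fields are assumptions made for this example.

```python
from collections import defaultdict


def map_doc(doc):
    # Map step: emit (key, value) pairs for documents the view cares about.
    if doc.get("type") == "purchase":
        yield doc["customerID"], doc["amount"]


class SumView:
    """Keeps a running sum per key and adjusts it when a document changes."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.emitted = {}             # doc_id -> rows already applied

    def update(self, doc_id, doc):
        # Undo the rows this document contributed previously ...
        for key, value in self.emitted.pop(doc_id, []):
            self.totals[key] -= value
        # ... then apply the rows from its new version.
        rows = list(map_doc(doc))
        for key, value in rows:
            self.totals[key] += value
        self.emitted[doc_id] = rows


view = SumView()
view.update("p1", {"type": "purchase", "customerID": "c1", "amount": 10.0})
view.update("p2", {"type": "purchase", "customerID": "c1", "amount": 5.0})
view.update("p1", {"type": "purchase", "customerID": "c1", "amount": 7.0})
```

Only the changed document is reprocessed on each update, which is why such a view stays cheap to query for dashboards and reports.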
Application To Database Interaction
N1QL, next generation NoSQL query language
• SQL-like: SELECT * FROM WHERE/LIKE/GROUP/etc.
• JOINs
• Powerful extensions (NEST, UNNEST) for JSON to support nested and hierarchical data structures
• Multiple access paths: Views and global secondary indexes
• ODBC/JDBC drivers available
How A JOIN In N1QL Works
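
The slide diagrams did not survive in this transcript. As far as the 4.x releases go, N1QL joins are key-based (the ON KEYS clause): a field in the left-hand document holds the key of the right-hand document. The sketch below imitates that lookup with a dictionary standing in for a bucket; the document keys and field names are hypothetical.

```python
bucket = {
    "user::1": {"type": "user", "name": "Ted"},
    "user::2": {"type": "user", "name": "Ann"},
    "review::1": {"type": "review", "user_key": "user::1", "stars": 5},
    "review::2": {"type": "review", "user_key": "user::2", "stars": 3},
    "review::3": {"type": "review", "user_key": "user::9", "stars": 1},
}


def lookup_join(bucket, left_type, key_field):
    """Inner join: fetch the right-hand document by the key stored on the left."""
    rows = []
    for doc in bucket.values():
        if doc.get("type") != left_type:
            continue
        right = bucket.get(doc[key_field])   # direct key lookup, no scan
        if right is not None:                # unmatched left rows are dropped
            rows.append({"left": doc, "right": right})
    return rows


joined = lookup_join(bucket, "review", "user_key")
```

Because the right-hand side is fetched by key, each joined row costs one key-value lookup rather than a scan of the other collection.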
Couchbase Global Indexing Service
• Indexes partitioned independently from data
• Scaled independent of data
• ForestDB storage engine
cbq> CREATE INDEX purch_customID on purchases(customerID);
cbq> CREATE INDEX purch_type on purchases(type);
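
Conceptually, each of these statements builds a structure that maps a field value to the keys of the documents holding it. The sketch below shows that idea with in-memory dictionaries; it does not reflect how the Index Service stores data, and the sample purchases are invented.

```python
from collections import defaultdict

purchases = {
    "p1": {"customerID": "c1", "type": "book"},
    "p2": {"customerID": "c2", "type": "book"},
    "p3": {"customerID": "c1", "type": "music"},
}


def build_index(docs, field):
    """Map each value of `field` to the keys of the documents that contain it."""
    index = defaultdict(list)
    for doc_id, doc in docs.items():
        if field in doc:              # documents without the field are skipped
            index[doc[field]].append(doc_id)
    return index


purch_customID = build_index(purchases, "customerID")
purch_type = build_index(purchases, "type")
keys_for_c1 = sorted(purch_customID["c1"])
```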
Operate At Any Scale
Storing And Retrieving Documents
Couchbase Architecture
• Data Service – builds and maintains Distributed Secondary Indexes (MapReduce Views)
• Indexing Engine – builds and maintains Global Secondary Indexes
• Query Engine – plans, coordinates, and executes queries against either Global or Distributed Indexes
• Cluster Manager – configuration, heartbeat, statistics, RESTful management interface
Data Service: Writes And Cache Management
[Diagram: the application server writes DOC 1 to DOC 5 into the managed cache; from there documents pass through the disk queue to disk and feed replication / XDCR / connectors / views / indexing]
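
The write path in this diagram can be approximated as: put the document in the managed cache, queue it for disk, and persist it asynchronously. The sketch below is a toy model with hypothetical names. Acknowledging the write before it reaches disk is an assumption of the sketch.

```python
from collections import deque


class DataNode:
    """Toy model of the write path: managed cache, then disk queue, then disk."""

    def __init__(self):
        self.cache = {}
        self.disk_queue = deque()
        self.disk = {}

    def write(self, key, doc):
        self.cache[key] = doc          # the document lands in memory first
        self.disk_queue.append(key)    # persistence is queued, not awaited
        return "ack"                   # assumed: acknowledged before disk write

    def flush(self):
        # Drain the queue; a real node would do this in the background.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]

    def read(self, key):
        return self.cache.get(key, self.disk.get(key))


node = DataNode()
node.write("DOC 1", {"v": 1})
persisted_before = "DOC 1" in node.disk
node.flush()
persisted_after = "DOC 1" in node.disk
```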
Query Execution Flow
1. Application submits N1QL query
2. Query is parsed, analyzed and plan is created
3. Query Service makes request to Index Service
4. Index Service returns document keys and data
5. If Covering Index, skip step 6
6. If filtering is required, fetch documents from Data Service
7. Apply final logic (e.g. SORT, ORDER BY)
8. Return formatted results to application
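
The eight steps can be traced with a small simulation. This is a sketch under simplifying assumptions: the index is a sorted list of (value, key) pairs on one field, parsing and planning are skipped, and run_query and its parameters are hypothetical names rather than Couchbase APIs.

```python
docs = {
    "p1": {"customerID": "c1", "type": "book", "amount": 12},
    "p2": {"customerID": "c2", "type": "book", "amount": 30},
    "p3": {"customerID": "c1", "type": "music", "amount": 8},
}
# Index on customerID: each entry holds the indexed value and the document key.
index = sorted((doc["customerID"], key) for key, doc in docs.items())


def run_query(customer, fields, order_by=None):
    # Steps 3-4: the index returns matching document keys and indexed data.
    entries = [(value, key) for value, key in index if value == customer]
    # Step 5: the index "covers" the query if it holds every requested field.
    covering = set(fields) <= {"customerID"}
    fetches = 0
    rows = []
    for value, key in entries:
        if covering:
            rows.append({"customerID": value})
        else:
            fetches += 1               # step 6: fetch the full document
            rows.append({f: docs[key][f] for f in fields})
    if order_by:
        rows.sort(key=lambda row: row[order_by])   # step 7: final ordering
    return rows, fetches               # step 8: results go back to the caller


covered_rows, covered_fetches = run_query("c1", ["customerID"])
full_rows, full_fetches = run_query("c1", ["type", "amount"], order_by="amount")
```

The covered query never touches the documents, which is why covering indexes reduce latency.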
Couchbase Clustering Architecture
Auto Sharding – Bucket And vBuckets
• A bucket is a logical, unique key space
• Multiple buckets can exist within a single cluster of nodes
• Each bucket has active and replica data sets (1, 2 or 3 extra copies)
• Each data set has 1024 Virtual Buckets (vBuckets)
• Each vBucket contains 1/1024th portion of the data set
• vBuckets do not have a fixed physical server location
• Mapping between the vBuckets and physical servers is called the cluster map
• Document IDs (keys) always get hashed to the same vBucket
• Couchbase SDKs look up the vBucket -> server mapping
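
The key-to-server lookup described above can be sketched as a two-step mapping: hash the document ID to one of 1024 vBuckets, then look the vBucket up in the cluster map. The sketch uses a plain CRC32 modulo 1024, which is a simplification of whatever the SDKs actually compute. The round-robin cluster map and the server names are assumptions.

```python
import zlib

NUM_VBUCKETS = 1024


def vbucket_for(key):
    # The same key always hashes to the same vBucket (simplified hash).
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_cluster_map(servers):
    # Assumed layout: vBuckets dealt out to servers round-robin.
    return [servers[vb % len(servers)] for vb in range(NUM_VBUCKETS)]


def server_for(key, cluster_map):
    # Client-side lookup: key -> vBucket -> server, with no central router.
    return cluster_map[vbucket_for(key)]


servers = ["server1", "server2", "server3"]
cluster_map = build_cluster_map(servers)
target = server_for("user::Laura930", cluster_map)
```

Because vBuckets, not keys, are assigned to servers, a topology change only rewrites the cluster map; the key-to-vBucket hash never changes.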
Cluster Map
Data Services – Sharding and Replication
[Diagram: five nodes, Couchbase Server 1 to Couchbase Server 5, each holding ACTIVE and REPLICA shards; SHARD1 to SHARD9 are spread across the nodes, with READ/WRITE/UPDATE traffic shown against the cluster]
Application has single logical connection to cluster (client object)
• Multiple nodes added or removed at once
• One-click operation
• Incremental movement of active and replica vBuckets and data
• Client library updated via cluster map
• Fully online operation, no downtime or loss of performance
• Strong consistency enforced at document level
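
To give a feel for "incremental movement of vBuckets", the sketch below rebalances a 1024-entry cluster map when a node is added, moving only as many vBuckets as are needed to even out ownership. The algorithm is an illustrative assumption, not Couchbase's rebalance logic, and it ignores replicas.

```python
NUM_VBUCKETS = 1024


def rebalance(cluster_map, servers):
    """Return a new map with near-equal ownership, moving as few vBuckets as possible."""
    target = {s: NUM_VBUCKETS // len(servers) for s in servers}
    for s in servers[: NUM_VBUCKETS % len(servers)]:
        target[s] += 1                 # spread the remainder over the first nodes
    new_map = list(cluster_map)
    owned = {s: 0 for s in servers}
    to_move = []
    for vb, owner in enumerate(cluster_map):
        if owner in owned and owned[owner] < target[owner]:
            owned[owner] += 1          # this vBucket stays where it is
        else:
            to_move.append(vb)         # owner is over quota or was removed
    for vb in to_move:
        dest = next(s for s in servers if owned[s] < target[s])
        new_map[vb] = dest
        owned[dest] += 1
    return new_map


old_map = ["s1", "s2"] * 512           # two nodes, 512 vBuckets each
new_map = rebalance(old_map, ["s1", "s2", "s3"])
moved = sum(a != b for a, b in zip(old_map, new_map))
```

Adding a third node here moves roughly a third of the vBuckets; the rest keep their owner, and clients simply pick up the new cluster map.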
Modern Architecture – Multi-Dimensional Scaling
MDS is the architecture that enables independent scaling of data, query, and indexing workloads while being managed as one cluster.
XDCR: Cluster Topology Aware
What’s new in Couchbase 4.1
Simplified Development
• Simplified, Familiar and Flexible Query with N1QL
• Full SQL Syntax through N1QL (INSERT/UPDATE/DELETE and MERGE)
• Integrated BI with ODBC/JDBC
• Spatial Queries for Location Aware Applications
• New Frameworks and Languages (LINQ, Spring, Go)

Connected Bigdata Experience
• Surround Big-data: Spark SQL, Spark Streams, Kafka, Sqoop, Elastic, SOLR

Improved Performance
• Faster Queries with Covering Indexes
• Prepared Statements for Low Latency query execution
• Independent Scaling with Multi-dimensional Scaling
• Global Secondary Indexes for Snappy Queries
• Faster Reporting and Interactive Analytics with Views Queries
• …and more

Simplified Security Compliance
• Integrated Enterprise Identity Management with LDAP Integration
• Security Forensics with Admin Auditing

Improved HA & DR
• Improved Data Protection with Lower latency XDCR
• High Performance Global Data Distribution with XDCR Filtering

Easy Admin
• Deployment with High Performance Containers: Docker
• Expanded Public and Private Cloud Support: AWS, Google, Azure, Joyent, Cisco, Verizon
• New Enterprise Platforms: SUSE, Oracle Ent. Linux
What’s new in Couchbase 4.5
Simplified Development
• Simplified N1QL Query Development with Integrated Query Workbench and Powerful Query Shell
• Sub-Document Updates for Improved Performance and Efficiency
• Batch Mutations through N1QL (INSERT, UPDATE, DELETE and MERGE)
• Integrated Full-text Search [Preview]

Improved Performance
• Memory-Optimized Global Indexes for Snappy Queries
• High Performance Read-Your-Own-Write Consistency with N1QL
• Faster Array operations with Powerful Array Indexing
• Extended JOIN Operations for flexible cross document operations
• Faster Queries with Covering Indexes & Prepared Statements
• Improved Compaction Management with Circular Reuse

Improved HA & DR
• High Scale Backup/Restore for the Enterprise
• Last Writer Wins Conflict Resolution with XDCR [Preview]

Simplified Security Compliance
• Role Based Access Control for Admins
• Certificate Based Encryption (X509 Certs)

Easy Dev-Ops
• Docker Support: Deployment with High Performance Containers
• Support for Debian 8 and RedHat Openshift
• Enhanced Clustering for Large Clusters (>100 Nodes)
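
Of the items above, Last Writer Wins conflict resolution lends itself to a short sketch: when two clusters hold different versions of a document, the version with the later timestamp is kept. The code below is a simplified illustration, not the XDCR implementation. The timestamp and origin fields, and the use of origin as a tie-breaker, are assumptions.

```python
def resolve_lww(a, b):
    """Pick the winning version of a document under last-writer-wins.

    Ties on timestamp are broken by the (hypothetical) origin name so that
    both clusters reach the same answer regardless of argument order.
    """
    return max(a, b, key=lambda v: (v["timestamp"], v["origin"]))


local = {"body": {"cart": 2}, "timestamp": 1000, "origin": "dc-east"}
remote = {"body": {"cart": 3}, "timestamp": 1005, "origin": "dc-west"}
winner = resolve_lww(local, remote)
```

A deterministic tie-breaker matters because each cluster resolves the conflict on its own; without one, the two sides could settle on different versions.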
Couchbase Mobile
What Is Couchbase Mobile?
• Faster development cycles
• Less long-term maintenance than traditional solutions
• Enterprise-class mobile/embedded NoSQL database + sync platform
• Fast and consistent access to data
• Removed continual network dependency
What’s new in Couchbase Mobile 1.2

Sync Gateway new features
– POST /{db}/_compact
– POST /{db}/_purge
– POST /{db}/_offline
– POST /{db}/_online
Sync Gateway internally backed up by CBGT

Couchbase Lite
– ForestDB Storage Engine (Developer Preview) - Preview the speed of our new ForestDB storage engine.
– Database Encryption - AES-256 on-disk encryption with your choice of provided storage library: SQLCipher or ForestDB.
– Improved Performance - Sync protocol enhancements, compression optimizations, and lower memory usage
Thank you
Q&A