couchbase 20-xdcr-deep-dive-12012012
TRANSCRIPT
2
XDCR: Cross Data Center Replication
[Figure: internet map with three labeled sites: US data center, Europe data center, Asia data center. Image: http://blog.groosy.com/wp-content/uploads/2011/10/internet-map.jpg]
3
Cross Data Center Replication – The basics
• Replicate your Couchbase data across clusters
• Clusters may be spread across geos
• Configured on a per-bucket basis
• Supports unidirectional and bidirectional operation
• Application can read and write from both clusters (active-active replication)
• Replication throughput scales out linearly
• Different from intra-cluster replication
6
Single node - Couchbase Write Operation with XDCR
[Figure: App Server writing Doc 1 to a Couchbase Server Node. Components shown: Managed Cache, Replication Queue (to other node), Disk Queue, Disk, XDCR Queue (to other cluster).]
7
Internal Data Flow
1. Document written to managed cache
2. Document added to intra-cluster replication queue
3. Document added to disk queue
4. XDCR push replicates to other clusters
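The four steps above can be sketched as a toy model. This is not Couchbase code: the class name `ToyNode` and its methods are hypothetical, and feeding the XDCR queue only after the disk write is an assumption based on the "Efficient" slide, which says only the change written to disk is passed to XDCR.

```python
from collections import deque


class ToyNode:
    """Toy model of one node's write path (hypothetical, for illustration only)."""

    def __init__(self):
        self.managed_cache = {}           # in-memory copy of each document
        self.replication_queue = deque()  # keys waiting for intra-cluster replication
        self.disk_queue = deque()         # keys waiting to be persisted
        self.disk = {}                    # persisted documents
        self.xdcr_queue = deque()         # keys waiting to go to another cluster

    def write(self, key, value):
        self.managed_cache[key] = value      # 1. written to managed cache
        self.replication_queue.append(key)   # 2. added to intra-cluster replication queue
        self.disk_queue.append(key)          # 3. added to disk queue

    def flush_disk(self):
        # Persist queued keys; assumed here: persisted keys then become XDCR candidates.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.managed_cache[key]
            self.xdcr_queue.append(key)

    def push_xdcr(self, remote_cluster):
        # 4. push persisted documents to the other cluster (modeled as a plain dict)
        while self.xdcr_queue:
            key = self.xdcr_queue.popleft()
            remote_cluster[key] = self.disk[key]
```

Usage: call `write("doc1", {...})`, then `flush_disk()`, then `push_xdcr(remote)` with `remote = {}` standing in for the destination cluster.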
8
XDCR in action
[Figure: two three-server Couchbase Server clusters, one in the NYC data center and one in the SF data center. Each server holds active documents in RAM and on disk; Doc 2 and Doc 9 appear in both clusters.]
11
Continuous Reliable Replication
• All data mutations replicated to destination cluster
• Multiple streams round-robin across vBuckets in parallel (32 default)
• Automatic resume after network disruption
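One way to picture "streams round-robin across vBuckets" is a simple assignment of vBucket IDs to a fixed number of streams. This is only an illustration: the function name is hypothetical, the 1024 vBucket count is an assumption not stated in the slides, and the real scheduler may work differently.

```python
def assign_round_robin(num_vbuckets=1024, num_streams=32):
    """Distribute vBucket IDs across streams in round-robin order.

    num_streams=32 mirrors the default mentioned on the slide;
    num_vbuckets=1024 is an assumed value for illustration.
    """
    streams = [[] for _ in range(num_streams)]
    for vbucket_id in range(num_vbuckets):
        streams[vbucket_id % num_streams].append(vbucket_id)
    return streams
```

With these values each stream ends up with the same number of vBuckets, so no single stream carries a disproportionate share of the replication work.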
12
Cluster Topology Aware
• Automatically handles node addition and removal in source and destination clusters
13
Efficient
• Couchbase Server de-duplicates writes to disk
– With multiple updates to the same document only the last version is written to disk
– Only this last change written to disk is passed to XDCR
• Document revisions are compared between clusters prior to transfer
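The two ideas on this slide, de-duplication and revision comparison before transfer, can be sketched as follows. The function names and the `(key, revision, value)` tuple layout are hypothetical and chosen for illustration; they are not Couchbase APIs.

```python
def deduplicate(mutations):
    """Keep only the last mutation per document key.

    `mutations` is an ordered list of (key, revision, value) tuples;
    a later entry for the same key overwrites the earlier one.
    """
    latest = {}
    for key, revision, value in mutations:
        latest[key] = (revision, value)
    return latest


def docs_to_send(local_docs, remote_revisions):
    """Return only documents whose local revision is newer than the remote one.

    `remote_revisions` maps key -> revision known at the destination;
    a missing key is treated as revision 0.
    """
    return {
        key: (revision, value)
        for key, (revision, value) in local_docs.items()
        if revision > remote_revisions.get(key, 0)
    }
```

For example, three queued mutations where document "a" is updated twice collapse to two entries, and a document the destination already holds at the same revision is skipped.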
14
Active-Active Conflict Resolution
• Couchbase Server provides strong consistency at the document level within a cluster
• XDCR provides eventual consistency across clusters
• If a document is mutated on both clusters, both clusters will pick the same “winner”
• In case of conflict, document with the most updates will be considered the “winner”
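A minimal sketch of the "most updates wins" rule is shown below. The `DocVersion` type and `pick_winner` function are hypothetical. The slide states only the update-count rule; the tie-breakers used here (a `cas` value, then the body) are assumptions added so that both clusters reach the same answer regardless of argument order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocVersion:
    rev_count: int  # number of updates applied to this document
    cas: int        # assumed tie-breaker field, not stated on the slide
    body: str


def pick_winner(a, b):
    """Pick the same winner no matter which cluster evaluates the conflict.

    Primary rule (from the slide): the version with the most updates wins.
    Tie-breakers (assumed): higher cas, then lexicographically larger body.
    """
    return max(a, b, key=lambda d: (d.rev_count, d.cas, d.body))
```

Because the comparison key is a total order over the fields, `pick_winner(x, y)` and `pick_winner(y, x)` return equal results, which is the property the slide describes as both clusters picking the same winner.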
25
XDCR in the Cloud
• Server Naming
– Optimal configuration uses a DNS name that resolves to the internal address for intra-cluster communication and the public address for inter-cluster communication
• Security
– XDCR traffic is not encrypted; plan topology accordingly
– Consider 3rd party Amazon VPN solutions
29
Development and Testing
• Test code changes with actual production data without interrupting your production cluster
• Give developers local databases with real data, easy to dispose and recreate
[Figure: Test and Dev, Staging, and Production clusters]
30
Impact of XDCR on the cluster
Your clusters need to be sized for XDCR
• XDCR is CPU intensive
– Configure the number of parallel streams based on your CPU capacity
• You are doubling your I/O usage
– I/O capacity needs to be sized correctly
• You will need more memory, particularly for bidirectional XDCR
– Memory capacity needs to be sized correctly
31
Additional Resources
• Couchbase Server Manual - http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-tasks-xdcr.html
• Getting Started with XDCR blog - http://blog.couchbase.com/cross-data-center-replication-step-step-guide-amazon-aws