dcp - deep dive on the next generation streaming protocol: couchbase connect 2014

DCP: Deep Dive on the Next Generation Streaming Protocol Cihan Biyikoglu | Director of Product Management, Couchbase Mike Wiederhold | Software Engineer, Couchbase

Upload: couchbase

Post on 20-Aug-2015


TRANSCRIPT

Page 1: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

DCP: Deep Dive on the Next Generation Streaming Protocol

Cihan Biyikoglu | Director of Product Management, Couchbase

Mike Wiederhold | Software Engineer, Couchbase

Page 2: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

©2014 Couchbase, Inc. 2

Agenda

Architecture Overview
- DCP within Couchbase Architecture

DCP Benefits
- Rebalance, XDCR, Views and more…

DCP Under the Hood
- DCP Protocol Properties
- How DCP recovers from failures
- and more…

Page 3: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

DCP and Couchbase Server Architecture

Page 4: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Data Sync is the Heart of Any Big Data System!

Fundamental piece of the architecture!

- Data Sync maintains data redundancy for HA & DR
  - Protects against failures: node, rack, region, etc.
- Data Sync maintains indexes!
  - Spatial, Full-text
  - Indexing is key to building faster access paths to query data

DCP and Couchbase Server Architecture

Page 5: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

What is DCP?

DCP is an innovative protocol that drives data sync for Couchbase Server 3.0.

• Increases data sync efficiency with massive data footprints
• Removes slower disk I/O from the data sync path
• Improves latencies: replication for data durability
• In the future, will provide a programmable data sync protocol for external stores outside Couchbase Server

In 3.0, DCP powers many critical components. Let's take a look!

Page 6: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Couchbase Server Architecture – Data Sync with Replication

DCP Replicates Data Among Nodes

[Diagram: APP SERVER 1 and APP SERVER 2, each with a Couchbase client library and a cluster map, issue read/write/update operations to SERVER 1, SERVER 2 and SERVER 3. Each server holds a set of active shards and a set of replica shards; shards 1 through 9 each appear once as active and once as replica.]

Page 7: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Couchbase Server Architecture

DCP Drives Rebuilding of Replicas Under Topology Changes

[Diagram: the cluster of SERVER 1, SERVER 2 and SERVER 3 with SERVER 4 and SERVER 5 added. Every server holds active and replica shards, and APP SERVER 1 and APP SERVER 2 continue to issue read/write/update operations through the Couchbase client library and cluster map.]

Page 8: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Couchbase Server Architecture - Views

DCP Powers View Maintenance

[Diagram: SERVER 1, SERVER 2 and SERVER 3, each holding active and replica shards, accessed by APP SERVER 1 and APP SERVER 2 through the Couchbase client library and cluster map.]

Page 9: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Couchbase Server Architecture

DCP Powers XDCR – Cross Data Center Replication

[Diagram: two three-server clusters (SERVER 1, SERVER 2, SERVER 3), labeled Couchbase Server – San Francisco and Couchbase Server – New York.]

Page 10: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

TAP vs. DCP

Major differences between TAP and DCP:

- TAP guarantees you will see all mutations at least once, but doesn't guarantee any specific order
- TAP doesn't have the ability to restart from anywhere
- De-duplication of items means that we cannot tell when we have a consistent view of the database

                  TAP                                  DCP
Ordering          No ordering guaranteed!              Ordered
Restart-ability   Not really!                          Granular restart-ability
Consistency       No snapshotting capabilities here!   Snapshots give a consistent view of the DB.
Performance       No memory-based support for          All data synchronization components
                  Views & XDCR                         are memory based!

Page 11: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

New Features

Page 12: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

New Capabilities with DCP

DCP allows us to build many features that were not possible with the TAP protocol:

- Faster indexing
- Faster XDCR
- Incremental backups
- Delta recovery for faster rebalance
- Improved durability with "ReplicateTo"
- And many more…

Page 13: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Faster XDCR

XDCR with DCP: up to 4x lower XDCR latency between clusters in 3.0 compared with 2.5.1

*Note: Absolute latency depends on WAN latency and bandwidth

[Chart: 90th percentile replication latency, time (ms), 2.5.1 vs. 3.0; y-axis from 0 to 600]

Major architectural change:
• In 2.x, changes needed to be persisted before the XDCR module replicated them
• In 3.0, DCP allows changes to be replicated from memory

The result is lower latency and higher throughput for XDCR traffic.

*90th percentile replication lag (ms), 5 -> 5 UniDir, 2 buckets x 500M x 1KB, 10K SETs/sec, LAN

Page 14: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Faster Views

Views with DCP: up to 50x faster indexing from 2.5 to 3.0

*Note: Absolute latency depends on disk I/O speed

[Chart: Indexing Under Load, time (ms), 2.5.1 vs. 3.0; y-axis from 0 to 40,000]

Major architectural changes:
• In 2.x, the indexer required mutations to be persisted before they were eligible for indexing
• In 3.0, DCP allows mutations to be indexed as soon as the indexer can consume them through a DCP stream
• Developers no longer need to "observe" for persistence before issuing a stale=false query

*95th percentile indexing latency (ms), 1 bucket x 20M x 2KB, non-DGM, 1 view, 250 mutations/sec/node, 400 queries/sec

Page 15: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Incremental Backup

Couchbase 3.0 supports incremental backups by taking advantage of the ordered property in DCP.

Major advantages:
- Shorter backup times
- Smaller backup size
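
One way to picture how ordering makes incremental backups possible is the sketch below. It is not the Couchbase backup tool; it only assumes that a backup remembers the highest sequence number it has stored per vBucket and later asks for mutations above that mark. The names `Mutation` and `incremental_backup` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    vbucket: int
    seqno: int
    key: str
    value: str


def incremental_backup(mutations, last_backed_up):
    """Return the mutations newer than the previous backup and the updated marks.

    `last_backed_up` maps a vBucket id to the highest seqno already backed up
    (an assumed bookkeeping structure, not a real Couchbase API).
    """
    delta = [m for m in mutations if m.seqno > last_backed_up.get(m.vbucket, 0)]
    marks = dict(last_backed_up)
    for m in delta:
        marks[m.vbucket] = max(marks.get(m.vbucket, 0), m.seqno)
    return delta, marks


history = [
    Mutation(0, 1, "a", "v1"),
    Mutation(0, 2, "b", "v1"),
    Mutation(1, 1, "c", "v1"),
]
# First run has no marks, so it behaves like a full backup.
full, marks = incremental_backup(history, {})

# One more write arrives; the next run only needs that mutation.
history.append(Mutation(0, 3, "a", "v2"))
delta, marks = incremental_backup(history, marks)
```

Because sequence numbers only grow within a vBucket, a single number per vBucket is enough to describe everything a previous backup already contains.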

Page 16: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Delta Recovery

Delta Node Recovery with DCP: 100s of times reduction in time to re-add a node from 2.5 to 3.0

Note: The absolute performance improvement depends on the data size and the number of mutations that need to be caught up

Major advantages:
- In 2.x, adding back a server means completely rebuilding it!
- In 3.x, the cluster manager utilizes the ordering property of DCP to resume replica building from where the added server left off

The net result is a shorter time to add failed nodes back into the cluster, meaning less time spent rebalancing.

Page 17: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Faster Durability with ReplicateTo

Durability with DCP: up to 150x improvement in ReplicateTo latency between 2.5.1 and 3.0

*Note: Absolute latency depends on network bandwidth and latency

[Chart: 95th percentile ReplicateTo=1 latency, time (ms), 2.5.1 vs. 3.0; y-axis from 0 to 700]

Improved latency on ReplicateTo:
• DCP improves replication speed, and durability with replication better protects data!

*95th percentile ReplicateTo=1 latency (ms), 1 bucket x 200M x 1KB, 250 mutations/sec/node

Page 18: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

DCP Deep Dive

Page 19: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Major Architectural Concepts

- Ordering
- Restart-ability
- Consistency (Snapshots)
- Performance (Memory-Based)

Page 20: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Ordered Mutations

To build interesting features, a streaming protocol needs to have a concept of when operations happened.

Couchbase operation ordering at the vBucket level:
- Each mutation is assigned a sequence number
- Sequence numbers increase monotonically
- Sequence numbers are assigned on a per-vBucket basis
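
The three ordering properties can be illustrated with a small sketch. This is a toy model rather than Couchbase code: the `VBucket` class, its `mutate` method and the in-memory `log` list are hypothetical names chosen for the example.

```python
class VBucket:
    """Toy vBucket that stamps every mutation with its own sequence number."""

    def __init__(self, vb_id):
        self.vb_id = vb_id
        self.high_seqno = 0   # highest sequence number assigned so far
        self.log = []         # (seqno, key, value) in assignment order

    def mutate(self, key, value):
        # Sequence numbers only ever increase within one vBucket.
        self.high_seqno += 1
        self.log.append((self.high_seqno, key, value))
        return self.high_seqno


vb0, vb1 = VBucket(0), VBucket(1)
first = vb0.mutate("user::1", "mike")
second = vb0.mutate("user::2", "jeff")
other = vb1.mutate("user::3", "mary")   # vb1 keeps an independent counter
```

Here `first` and `second` are ordered relative to each other, while `other` starts its own sequence because counters are kept per vBucket.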

Page 21: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Version Histories

In a distributed system, failure is expected, and we must be able to detect these failures to properly keep ordering in the system.

Failover logs:
- Each vBucket contains a failover log
- A failover log contains one or more failover entries
- Each failover entry contains a vBucket UUID/sequence number pair
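
As a rough illustration of the failover log described on this slide, the sketch below models it as a list of (vBucket UUID, sequence number) pairs. The class and method names are hypothetical, and the sketch assumes that a new entry is appended whenever a vBucket starts a new history branch, for example when a replica is promoted to active.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FailoverEntry:
    vb_uuid: str   # identifies one branch of the vBucket's history
    seqno: int     # sequence number at which that branch started


@dataclass
class FailoverLog:
    entries: list = field(default_factory=list)

    def new_branch(self, high_seqno):
        """Record that a new history branch starts at `high_seqno`."""
        entry = FailoverEntry(uuid.uuid4().hex, high_seqno)
        self.entries.append(entry)
        return entry


log = FailoverLog()
created = log.new_branch(0)      # vBucket created: first entry at seqno 0
promoted = log.new_branch(100)   # promoted after a failure at seqno 100
```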

Page 22: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

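Version History (Failover Example)

[Diagram: an Active vBucket and a Replica vBucket joined by a DCP replication stream. Both failover logs start with the entry (VB UUID X, Seqno 0). The last seqno advances from 0 to 100, and when the Replica becomes Active the entry (VB UUID Y, Seqno 100) is added to its failover log.]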

Page 23: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Restartability

Fine-grained restartability:
- Ordering allows Couchbase nodes to resume receiving data from any point
- If nodes contain different data histories, we can reconcile the differences using rollbacks

Page 24: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Restartability Example

[Diagram: a node that was down (failover log (X, 0), last seqno 80) rejoins as a Replica of the Active node (failover log (X, 0), (Y, 100); last seqno 150) over a DCP replication stream. The check 0 <= 80 <= 100 passes, so the Replica can resume from seqno 80.]


Page 26: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Restartability Example (With Rollback)

[Diagram: a node that was down (failover log (X, 0), last seqno 110) rejoins as a Replica of the Active node (failover log (X, 0), (Y, 100); last seqno 150) over a DCP replication stream. The check 0 <= 110 <= 100 fails, so the Replica rolls back to last seqno 100. The check 0 <= 100 <= 100 then passes and the stream resumes.]
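
The range checks in the two restartability examples (0 <= 80 <= 100 and 0 <= 110 <= 100) can be sketched as a single function. This is a simplified model of the decision, not the real DCP stream request handling; `resume_point` and its arguments are hypothetical, and the failover log is assumed to be a list of (vb_uuid, start_seqno) pairs ordered oldest first.

```python
def resume_point(failover_log, high_seqno, client_uuid, client_seqno):
    """Decide whether a reconnecting node can resume or must roll back.

    Returns ("resume", seqno) or ("rollback", seqno).
    """
    for i, (vb_uuid, start) in enumerate(failover_log):
        if vb_uuid != client_uuid:
            continue
        # A branch ends where the next entry begins, or at the current
        # high seqno when it is the newest branch.
        if i + 1 < len(failover_log):
            end = failover_log[i + 1][1]
        else:
            end = high_seqno
        if start <= client_seqno <= end:
            return ("resume", client_seqno)
        if client_seqno > end:
            # The client saw mutations that never made it into this history.
            return ("rollback", end)
        return ("rollback", 0)
    # Unknown history: nothing in common, start over.
    return ("rollback", 0)


active_log = [("X", 0), ("Y", 100)]
no_rollback = resume_point(active_log, 150, "X", 80)
needs_rollback = resume_point(active_log, 150, "X", 110)
after_rollback = resume_point(active_log, 150, "X", needs_rollback[1])
```

With the values from the slides, the node at seqno 80 resumes where it left off, while the node at seqno 110 is told to roll back to 100 and can then resume from there.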

Page 27: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Recap

Architecture Overview
- DCP within Couchbase Architecture

DCP Benefits
- Rebalance, XDCR, Views and more…

DCP Under the Hood
- DCP Protocol Properties
- How DCP recovers from failures
- and more…

Page 28: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

More on Couchbase Server 3.0

Today

• Deep Dive – What's New in Couchbase Server 3.0 – Cihan Biyikoglu
  October 6 @ 4:20 pm - 5:05 pm
• A N1QL for Every Query: Extending SQL to a Document Database – Gerald Sangudi
  October 6 @ 3:20 pm - 4:05 pm
• Deep Dive: Tunable Memory in Couchbase Server 3.0 – Chiyoung Seo
  October 6 @ 4:20 pm - 5:05 pm
• Best Practices: Securing a Couchbase Server Deployment – Don Pinto
  October 6 @ 4:20 pm - 5:05 pm
• Deep Dive into DCP: A Streaming Replication Protocol – Mike Wiederhold and Cihan Biyikoglu
  October 6 @ 5:10 pm - 5:55 pm

Tomorrow

• A N1QL for Every Query: Extending SQL to a Document Database – Gerald Sangudi
  October 7 @ 10:00 am - 10:45 am
• Ultra-High Availability and Disaster Recovery with Couchbase Server – Anil Kumar
  October 7 @ 10:50 am - 11:35 am
• Deep Dive: Near Real-Time Map / Reduce with Views in Couchbase Server 3.0 – Sarath Lakshman
  October 7 @ 11:40 am - 12:25 pm

Page 29: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Download Couchbase Server


Page 30: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Questions

Page 31: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Consistency (Snapshots)

Certain types of applications require knowing when they have a consistent view of the database up to a certain point in time. Snapshots allow these applications to know when this property is satisfied.

Why do we need snapshots?
- Couchbase de-duplicates items with the same key both in memory and on disk
- This means replication streams may skip an item if a newer version of it exists later in the stream

Page 32: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Why do we need snapshots?

[Diagram: a Replicator and an Indexer consume the following mutations, numbered by sequence number.]

1  {"_id": "eid5294", "name": "mike", "title": "engineer"}
2  {"_id": "eid9421", "name": "jeff", "title": "manager"}
3  {"_id": "eid8302", "name": "mary", "title": "support"}
4  {"_id": "eid4104", "name": "sally", "title": "engineer"}
5  {"_id": "eid5294", "name": "mike", "title": "support"}

[Index contents shown: "mike engineer", "jeff manager" and "mary support" arrive over a DCP indexing stream; after "Get Latest Seqnos", a second DCP indexing stream delivers "sally engineer" and "mike support".]
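
The effect of de-duplication on the five mutations in this example can be sketched as follows. This is an illustration under simplifying assumptions, not the DCP snapshot implementation: `deduplicate` and `snapshot` are hypothetical helpers, and a snapshot is modeled as a (start, end, items) triple whose items are only consistent once everything up to `end` has been received.

```python
def deduplicate(mutations):
    """Keep only the latest mutation per key, returned in seqno order."""
    latest = {}
    for seqno, key, title in mutations:
        latest[key] = (seqno, key, title)
    return sorted(latest.values())


def snapshot(mutations, start_seqno):
    """Return (start, end, items) for everything after `start_seqno`."""
    items = [m for m in deduplicate(mutations) if m[0] > start_seqno]
    end = max((m[0] for m in mutations), default=start_seqno)
    return start_seqno, end, items


mutations = [
    (1, "eid5294", "engineer"),   # mike, later overwritten by seqno 5
    (2, "eid9421", "manager"),    # jeff
    (3, "eid8302", "support"),    # mary
    (4, "eid4104", "engineer"),   # sally
    (5, "eid5294", "support"),    # mike again
]
start, end, items = snapshot(mutations, 0)
seen_up_to_3 = [key for seqno, key, _ in items if seqno <= 3]
```

Seqno 1 is skipped because seqno 5 replaces it, so a consumer that stops at seqno 3 has jeff and mary but no record of mike at all. Knowing that the snapshot ends at 5 tells the consumer it must read through seqno 5 before treating its view as consistent.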

Page 33: DCP - Deep Dive on the Next Generation Streaming Protocol: Couchbase Connect 2014

Memory-Based Streaming

When items are inserted into Couchbase, we cache an ordered list of the most recent items received.

Benefits:
- Replication is fast because items are sent from memory and do not incur disk I/O
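
A bounded, ordered in-memory buffer is one simple way to picture the cache of recent items. The sketch below is only an illustration: `RecentItems` is a hypothetical class, the fixed capacity is an assumption, and the idea that a caller falls back to another source such as disk when the cache no longer covers the requested point is also an assumption.

```python
from collections import deque


class RecentItems:
    """Ordered cache of the most recent mutations, oldest dropped first."""

    def __init__(self, capacity):
        self._items = deque(maxlen=capacity)

    def append(self, seqno, key, value):
        self._items.append((seqno, key, value))

    def stream_from(self, start_seqno):
        """Return cached items after `start_seqno`.

        Returns None when the cache no longer reaches back that far, in
        which case the caller would need another source (assumed: disk).
        """
        if not self._items:
            return []
        oldest = self._items[0][0]
        if start_seqno + 1 < oldest:
            return None
        return [item for item in self._items if item[0] > start_seqno]


cache = RecentItems(capacity=3)
for seqno in range(1, 6):
    cache.append(seqno, f"key{seqno}", "value")

from_memory = cache.stream_from(3)   # served entirely from the cache
too_old = cache.stream_from(1)       # seqno 2 has already been evicted
```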