Sizing Your Couchbase Cluster: Couchbase Connect 2014

How Many Nodes? Properly Sizing your Couchbase Cluster. Perry Krug | Senior Solutions Architect, Couchbase

Upload: couchbase

Post on 09-Jul-2015

Category:

Data & Analytics


DESCRIPTION

How many nodes? That is the million dollar question that we will answer during this session. Factors like RAM, Disk, CPU and others cross with your specific hardware and workload requirements, resulting in the ideal cluster size. This session will also discuss some specific architecture and deployment considerations as well as the effects of using different Couchbase features.

TRANSCRIPT

Page 1: Sizing Your Couchbase Cluster: Couchbase Connect 2014

How Many Nodes? Properly Sizing your Couchbase Cluster

Perry Krug | Senior Solutions Architect, Couchbase

Page 2: Sizing Your Couchbase Cluster: Couchbase Connect 2014

http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster

Read this article

©2014 Couchbase, Inc. 2

Page 3: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Sizing = performance:

Serve reads out of RAM

Enough IO for writes and disk operations

Mitigate inevitable failures

Size Couchbase Server

[Diagram: Writing Data: the Application Server says "Please store document A" and Couchbase Server replies "OK, I stored document A". Reading Data: the Application Server says "Give me document A" and Couchbase Server replies "Here is document A".]

Page 4: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Scaling out permits matching of aggregate flow rates so queues do not grow

[Diagram: three Application Servers connected over the network to three Couchbase Server nodes]

Page 5: Sizing Your Couchbase Cluster: Couchbase Connect 2014

5 Factors of Sizing

Page 6: Sizing Your Couchbase Cluster: Couchbase Connect 2014

5 Key Factors determine number of nodes needed:

1. RAM

2. Disk

3. CPU

4. Network

5. Data Distribution/Safety

(per-bucket, multiple buckets aggregate)

How many nodes?

[Diagram: Application user, Web application server, Couchbase Servers]

Page 7: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Working set depends on your application

Page 8: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Keep the working set in RAM for best read performance

1. Total RAM:

Managed document cache:

Working set

Metadata

Active+Replicas

Index caching (I/O buffer)

RAM sizing
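
The RAM factors above (working set, metadata, active plus replica copies) can be turned into a rough node count. The following is a minimal sketch, not the official calculator: the function name is hypothetical, and the defaults for per-item metadata overhead, headroom, high water mark and per-node quota are assumptions that you should replace with the values from the Couchbase documentation for your version and from your own hardware.

```python
import math


def ram_nodes_needed(
    item_count,
    avg_key_bytes,
    avg_value_bytes,
    replicas=1,
    working_set_fraction=0.2,        # assumption: share of the dataset that is "hot"
    metadata_bytes_per_item=56,      # assumption: per-item overhead, version dependent
    headroom=0.30,                   # assumption: extra RAM kept free for overhead
    high_water_mark=0.85,            # assumption: usable share of the bucket quota
    per_node_quota_bytes=12 * 1024**3,  # assumption: RAM quota given to each node
):
    """Estimate how many nodes are needed to hold metadata and working set in RAM."""
    copies = 1 + replicas  # active copy plus replicas
    metadata = item_count * (metadata_bytes_per_item + avg_key_bytes) * copies
    dataset = item_count * avg_value_bytes * copies
    working_set = dataset * working_set_fraction
    cluster_quota = (metadata + working_set) * (1 + headroom) / high_water_mark
    return math.ceil(cluster_quota / per_node_quota_bytes)


if __name__ == "__main__":
    nodes = ram_nodes_needed(
        item_count=50_000_000, avg_key_bytes=40, avg_value_bytes=2048
    )
    print(f"Nodes needed for RAM: {nodes}")
```

Note that metadata grows with item count regardless of the working set, so a large number of small items can dominate the result.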

Page 9: Sizing Your Couchbase Cluster: Couchbase Connect 2014

File system cache availability for the index has a big impact on performance:

Test runs based on 10 million items with a 16GB bucket quota and 4GB vs. 8GB of system RAM available for indexes

Performance results show that doubling system cache availability:

Halves query latency

Increases throughput by 50%

Leave RAM free with quotas

RAM Sizing – View/Index cache (disk I/O)

Page 10: Sizing Your Couchbase Cluster: Couchbase Connect 2014

2. Disk:

I/O:

Sustained write rate

Rebalance capacity

Backups

XDCR

Views/Indexing

Compaction

Total dataset:

Index caching (I/O buffer)

Disk Sizing: Space and I/O

Page 11: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Disk writes are buffered

Bursts of data expand the disk write queue

Sustained writes need corresponding throughput

Disk throughput affected by disk speed

SSD > 10K RPM > EBS

SSDs give a huge boost to write throughput and startup/warmup times

RAID can provide redundancy and increase throughput

Throughput = read/write+compaction+indexing+XDCR

Couchbase Server 2.1 introduces multiple disk threads

Best to configure different paths for data and indexes

Plan on about 3x space (append-only, compaction, backups, etc.)

Disk Sizing: Space and I/O
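
The "about 3x space" rule and the throughput sum above can be written out as a short worked example. This is a sketch under stated assumptions: the function names are hypothetical, and applying the 3x multiplier on top of active plus replica copies is one reading of the slide rather than a documented formula.

```python
def disk_space_bytes(item_count, avg_item_bytes, replicas=1, space_multiplier=3):
    """Disk space to plan for across the cluster.

    space_multiplier=3 follows the "plan on about 3x space" guidance
    (append-only files, compaction, backups, etc.).
    Assumption: the multiplier applies to active plus replica data.
    """
    copies = 1 + replicas
    return item_count * avg_item_bytes * copies * space_multiplier


def required_disk_throughput(client_io, compaction, indexing, xdcr):
    """Throughput = read/write + compaction + indexing + XDCR (same unit for all)."""
    return client_io + compaction + indexing + xdcr


if __name__ == "__main__":
    total = disk_space_bytes(item_count=50_000_000, avg_item_bytes=2048)
    print(f"Plan for about {total / 1024**3:.0f} GiB of disk across the cluster")
```

Dividing the total by the number of nodes gives the space to provision on each node, assuming data is evenly distributed.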

Page 12: Sizing Your Couchbase Cluster: Couchbase Connect 2014

3. CPU

Disk writing

Views/compaction/XDCR

RAM r/w performance not impacted

Minimum production requirement: 4 cores

+1 per bucket

+1 core per Design Doc

+1 core per XDCR stream

CPU Sizing
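
The core-count rule above is simple arithmetic. A minimal sketch, with a hypothetical function name and a literal reading of the slide (every bucket, design document and XDCR stream adds one core on top of the 4-core minimum):

```python
def min_cores_per_node(buckets, design_docs, xdcr_streams, base_cores=4):
    """4 cores minimum, +1 per bucket, +1 per design doc, +1 per XDCR stream."""
    return base_cores + buckets + design_docs + xdcr_streams


if __name__ == "__main__":
    # Example: 2 buckets, 3 design documents, 1 XDCR stream.
    print(min_cores_per_node(buckets=2, design_docs=3, xdcr_streams=1))
```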

Page 13: Sizing Your Couchbase Cluster: Couchbase Connect 2014

4. Network

Client traffic

Replication (writes)

Rebalancing

XDCR

Network sizing
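
A rough bandwidth estimate can be built from the traffic types listed above. This is an illustrative sketch only: the function name and parameters are hypothetical, it assumes every write is sent once per replica and once per XDCR stream, and it ignores protocol overhead and rebalance traffic, which comes in bursts and should be covered by headroom.

```python
def cluster_bandwidth_bytes_per_sec(
    reads_per_sec, writes_per_sec, avg_item_bytes, replicas=1, xdcr_streams=0
):
    """Approximate steady-state network traffic for the whole cluster."""
    client = (reads_per_sec + writes_per_sec) * avg_item_bytes
    replication = writes_per_sec * replicas * avg_item_bytes  # replication multiplies writes
    xdcr = writes_per_sec * xdcr_streams * avg_item_bytes
    return client + replication + xdcr


if __name__ == "__main__":
    total = cluster_bandwidth_bytes_per_sec(
        reads_per_sec=80_000, writes_per_sec=20_000, avg_item_bytes=2048
    )
    print(f"About {total * 8 / 1e9:.2f} Gbit/s across the cluster")
```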

[Diagram labels: Reads+Writes; Replication (multiply writes) and Rebalancing]

Page 14: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Low latency, high throughput (LAN) – within cluster

Eliminate router hops:

Within Cluster nodes

Between clients and cluster

Check who else is sharing the network

Increase bandwidth by:

Add more nodes (will scale linearly)

Upgrade routers/switches/NICs/etc.

Network Considerations

Page 15: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Servers fail; be prepared.

The more nodes, the less impact a failure will have.

5. Data Distribution/Safety (assuming one replica):

1 node = Single point of failure

2 nodes = +Replication

3+ nodes = Best for production

Autofailover

Upgrade-ability

Further scale-ability

Note: Many applications will need more than 3 nodes

Data Distribution
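
To see why more nodes means less impact per failure, consider how the data is spread. The sketch below assumes data is divided evenly across nodes and uses 1024 vBuckets per bucket as an assumed default; the function name is hypothetical.

```python
def failure_impact(node_count, vbuckets=1024):
    """Return (active vBuckets per node, fraction of active data on one node).

    Assumption: vBuckets are spread evenly, so losing one node affects
    1/node_count of the active data until replicas are promoted.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    vbuckets_per_node = vbuckets / node_count
    fraction_affected = 1 / node_count
    return vbuckets_per_node, fraction_affected


if __name__ == "__main__":
    for nodes in (1, 2, 3, 6, 12):
        per_node, fraction = failure_impact(nodes)
        print(f"{nodes:>2} nodes: ~{per_node:.0f} vBuckets each, "
              f"{fraction:.0%} of active data affected by one failure")
```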

Page 16: Sizing Your Couchbase Cluster: Couchbase Connect 2014

5 Key Factors determine number of nodes needed:

1. RAM

2. Disk

3. CPU

4. Network

5. Data Distribution/Safety

(per-bucket, multiple buckets aggregate)

How many nodes recap
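
One way to combine the five factors is to add up the per-bucket requirements for each factor and let the most demanding factor set the cluster size. This is a sketch of that idea, not a documented formula: the function name, the input shape and the 3-node floor (taken from the "3+ nodes = best for production" guidance) are assumptions.

```python
import math


def nodes_needed(buckets, minimum_nodes=3):
    """Aggregate per-bucket node requirements and take the largest factor.

    buckets: list of dicts mapping a factor name ("ram", "disk", "cpu",
    "network") to the (possibly fractional) nodes that bucket needs.
    """
    totals = {}
    for factors in buckets:
        for name, nodes in factors.items():
            totals[name] = totals.get(name, 0) + nodes
    candidates = [minimum_nodes] + [math.ceil(v) for v in totals.values()]
    return max(candidates)


if __name__ == "__main__":
    buckets = [
        {"ram": 2.5, "disk": 1.0, "cpu": 1.0, "network": 0.5},
        {"ram": 1.7, "disk": 0.5, "cpu": 1.0, "network": 0.2},
    ]
    print(nodes_needed(buckets))
```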

[Diagram: Application user, Web application server, Couchbase Servers]

Page 17: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Deployment Considerations

Page 18: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Hardware requirements/recommendations are the intersection of what’s needed versus what’s available

RAM: At least ~4GB (highly dependent on data set)

Disk: Fastest “local” storage available

SSD is better

RAID 0 or 10, not 5

CPU (minimums): 4 cores

+1 per bucket

+1 core per Design Doc

+1 core per XDCR stream

Hardware Minimums

Page 19: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Designed for commodity hardware

Scale out, not up… more, smaller nodes are better than fewer, larger ones (can scale up later)

Tested and deployed in EC2

Physical hardware offers best performance and efficiency

Certain considerations when using VMs:

RAM use is inefficient; disk IO is usually not as fast

Local storage better than shared SAN

1 Couchbase VM per physical host

You will generally need more nodes

Don’t overcommit

Hardware Considerations

Page 20: Sizing Your Couchbase Cluster: Couchbase Connect 2014

R3 instances best value for performance

Higher RAM-to-CPU ratios

Come with SSDs

Disk Choice: SSDs are best

Ephemeral is okay

Single EBS not great, use LVM/RAID

Views/indexes on ephemeral, main data on EBS or both on SSD

Backups: Use cbbackup locally on each node and migrate to EBS/S3

Can use EBS snapshots

Couchbase in AWS

Page 21: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Deploy across AZs with rack/zone awareness

Use an EIP/public hostname instead of a private IP:

Easier connectivity from outside AWS

Easier restoration/better availability

Couchbase XDCR across regions must use hostname

In AWS as with any cloud/virtual deployment, you will likely need more nodes than you would with a physical infrastructure

Couchbase in AWS

Page 22: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Effects of…

Page 23: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Effect on scale/sizing:

Increase the CPU and disk IO requirements

More complex views require more CPU

More view output requires more disk IO

More RAM should be left out of the quota for better IO caching

Indication

Indexes significantly behind data writes (or growing delays)

What to do:

Make sure you follow best practices in view writing

Add more nodes to distribute processing “work”

Look into SSDs

Views/Indexes

Page 24: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Effect on scale/sizing

XDCR is CPU Intensive

Disk IO will double

Memory needs to be sized accordingly (bi-directional may mean more data)

Indication

A rising XDCR queue on source

What to do:

More nodes on source and destination will drain queue faster (scales linearly)

Tune replication streams according to CPU availability

XDCR

Page 25: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Effect on scale/sizing

More reads:

Individual documents will not be impacted (static working set)

Views may require faster disks, more disk IO caching

More writes will increase disk IO needs

Indication

Cache miss ratio rising

Growing disk write queue / XDCR queue

Compaction not keeping up

What to do

Revise sizing calculations and add more nodes if needed

Most applications don’t need to scale the number of nodes based upon normal workload variation.

As your workload grows…

Page 26: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Effect on scale/sizing

Your RAM needs will grow:

Metadata needs increase with item count

Is your working set increasing?

Your disk space will likely grow (duh?)

Indications

Dropping resident ratio

Rising ejections/cache miss ratio

What to do

Revise sizing calculations and add more nodes if needed

Remove un-needed data

This is the most common need for scaling and will most likely result in needing more nodes

As your dataset grows…
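
The indications on the two growth slides are all trends (dropping, rising, growing), so they can be checked against a series of samples rather than a single reading. The sketch below is illustrative only: the metric keys are hypothetical names for values you would collect from your own monitoring, and comparing the earlier half of the samples with the later half is a deliberately simple stand-in for real trend analysis.

```python
from statistics import mean


def trend(samples):
    """Mean of the later half minus mean of the earlier half (0.0 if too short)."""
    half = len(samples) // 2
    if half == 0:
        return 0.0
    return mean(samples[-half:]) - mean(samples[:half])


def sizing_indications(history):
    """Return the sizing indications whose trend points the wrong way.

    history maps hypothetical metric names to lists of samples, oldest first.
    """
    checks = [
        ("resident_ratio", -1, "resident ratio is dropping"),
        ("cache_miss_ratio", +1, "cache miss ratio is rising"),
        ("ejections", +1, "ejections are rising"),
        ("disk_write_queue", +1, "disk write queue is growing"),
    ]
    found = []
    for key, direction, message in checks:
        if trend(history.get(key, [])) * direction > 0:
            found.append(message)
    return found


if __name__ == "__main__":
    history = {
        "resident_ratio": [100, 98, 90, 85],
        "cache_miss_ratio": [1, 1, 1, 1],
        "disk_write_queue": [10, 20, 40, 80],
    }
    for message in sizing_indications(history):
        print("Revisit sizing:", message)
```

Any message here is a prompt to revise the sizing calculations, not an automatic signal to add nodes.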

Page 27: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Yes, a rebalance uses resources, but on a “properly” sized cluster it should not have any effect on performance:

Distribution of data and work across all nodes

Managed caching layer separates RAM-based performance from IO utilization

Rebalance automatically manages working set in RAM

Rebalance automatically throttles itself if needed

Can be stopped midway without endangering data or progress

Proper sizing includes not maxing out all resources: leave some headroom in preparation

Rebalancing

Page 28: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Work with the Couchbase Team

Validate your “on-paper” numbers with testing

Constantly monitor production

Sizing is tricky business…

Page 29: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Gather your workload and dataset requirements

Item counts and sizes, read/write/delete ratios

Review our documentation and formulas

Test, Deploy, Monitor… rinse and repeat

Dive in…

Page 30: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Lots of details and best practices in our documentation:

http://www.couchbase.com/docs/

And my sizing blog:

http://blog.couchbase.com/how-many-nodes-part-1-introduction-sizing-couchbase-server-20-cluster

Want more?

Page 31: Sizing Your Couchbase Cluster: Couchbase Connect 2014

Thank you

[email protected]