Post on 21-Sep-2018

Large Scale Cassandra Made Better in Containers

Chris Duchesne

@cduchesne {code} by Dell EMC

Aaron Spiegel

@Spiegela {code} by Dell EMC

© Copyright 2017 Dell Inc.

Open source at Dell EMC

– Contribute to meaningful OSS projects
– Create new thought-leading OSS applications
– Drive awareness of OSS opportunities with Dell EMC product teams
– Participate in relevant community engagement projects
– Act in the interest of building a community

{code} by Dell EMC is a group of passionate open source engineers and advocates working to build a community around software-based infrastructure.

Platinum Sponsor

Large Scale Cassandra and Containers

What is Large Scale Cassandra?

Scaling Cassandra Challenges

How Can Containers Help?

What is Large Scale Cassandra?

You may be managing one or several Cassandra clusters today.

What is Large Scale Cassandra?

What do you do when you keep adding clusters, all while growing and maintaining your existing clusters? Is this still manageable?

Challenges with Scaling Cassandra

Operational Flexibility

Adding/Removing Nodes

Migrating Nodes

Adding Storage Capacity

Maintenance

Prepare for Performance Elasticity

Test/Dev/QA Environments

• Quick Spin-up / Spin-down

• High performance?

Why Containers for NoSQL Databases

Diagram labels: Elasticity, Efficiency, Performance, VM, Metal

High Performance
• High Throughput and Low Response Times
• Databases are Designed with Expectation of DAS (direct-attached storage)
• Similar to Bare Metal Performance

~10% overhead

Elasticity
• Platforms must easily Scale Out
• Looking to Decrease Management Complexity
• Externalization of Data & Configuration from Database Binaries

Efficiency
• Platforms must share resources across multiple database clusters
• Increase Server Utilization

Efficiency & High Performance Lead to Container Platforms as the Logical Choice

No 1-Size-Fits-All Solutions

Security & Multi-tenancy Optimized
• Leverage VM Isolation to Maintain Multi-Tenant Separation
• Separate Tenants Get Isolated VMs within a Single Cluster
• Leverage Container Platform within VMs
• Reduces OS Instances to Manage
• Improves Memory Efficiency
• Leverage Immutable Containers for Platform Management
• Sacrifices Performance for Security

Performance Optimized
• Leverage Bare Metal Systems with Container Platform
• Provides near Bare Metal Performance
• Multiple Tenants may require Physical Separation
• Maximizes Performance vs. Management & Security Isolation

Logical Architecture

Performance Optimized (Bare Metal Containers)
• Scheduler (K8S/Mesos/Swarm)
• Software Defined Volume Service
• Container Persistent Volumes

Security & Multi-tenancy Optimized (Hypervisor + Containers)
• Hypervisor & IaaS (vSphere/OpenStack)
• Scheduler (K8S/Mesos/Swarm)
• Software Defined Volume Service
• Guest Ephemeral Storage
• Container Persistent Volumes

Standard Deployment Architecture (Snowflakes)

Data Binaries Config

Data Binaries Config

Standard Deployment Architecture (Snowflakes)

Data Binaries Config

Upgrading the Cassandra software affects only this

Standard Deployment Architecture (Snowflakes)

Data Binaries Config

Expanding the cluster changes only these

Standard Deployment Architecture (Snowflakes)

Data Binaries Config

Replacing a node requires retaining these

Standard Deployment Architecture

Creating Snowflakes Faster, Magnifying Management

ScaleIO Deployment Architecture

No Snowflakes

Native Data Protection vs. Highly Available Database Shards

• Cassandra Native Data Protection
– Can take hours or days to re-protect data
– Increases load on remaining nodes
› Nodes must handle the primary workload while also re-distributing data in the cluster

• Not Limited to Node Failures
– Database Software Updates
– Security Patches
– Node Hardware Upgrades
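
The "hours or days" figure above can be sanity-checked with simple arithmetic. The sketch below is only a back-of-the-envelope estimate: the function name, the data sizes, and the 50 MB/s streaming rate are illustrative assumptions, not measurements from the talk.

```python
def reprotect_hours(data_tb: float, stream_mb_per_s: float) -> float:
    """Estimate hours needed to re-stream `data_tb` terabytes of data.

    Both inputs are hypothetical planning numbers, not Cassandra defaults.
    """
    if data_tb < 0 or stream_mb_per_s <= 0:
        raise ValueError("data_tb must be >= 0 and stream_mb_per_s must be > 0")
    total_mb = data_tb * 1_000_000  # 1 TB = 10^6 MB (decimal units)
    seconds = total_mb / stream_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Assumed per-node data sizes and an assumed sustained streaming rate.
    for data_tb in (1, 2, 4):
        hours = reprotect_hours(data_tb, stream_mb_per_s=50)
        print(f"{data_tb} TB at 50 MB/s: about {hours:.1f} hours")
```

Under these assumptions, 2 TB on a node takes roughly 11 hours to re-stream, during which the remaining nodes also carry the primary workload.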

Native Data Protection vs. Highly Available Database Shards

• ScaleIO External Storage & Container Benefits
– Containers can be rescheduled to new nodes on command or automatically after node/container failures
– No major data rebalancing performed
– Node operations (reboots, patches, upgrades) have much less operational cost
› Complete in seconds / minutes
– REX-Ray or native integration with container schedulers provides interoperability and storage orchestration

Scheduler (K8S/Mesos/Swarm)

Container Persistent Volumes

Software Defined Volume Service

Native Integration
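
The rescheduling behavior described on this slide can be illustrated with a toy model. This is a minimal sketch, not how any real scheduler (Kubernetes, Mesos, Swarm) is implemented: the `Cluster` and `Container` classes, node names, and volume names are all hypothetical. The point it shows is that when the data lives on an external volume, recovering from a node failure only changes where the container runs; the volume reference stays the same and no data is copied.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    volume: str  # name of the external persistent volume (hypothetical)
    node: str    # node currently running the container


@dataclass
class Cluster:
    nodes: set
    containers: dict = field(default_factory=dict)

    def _least_loaded(self) -> str:
        """Pick the healthy node running the fewest containers."""
        if not self.nodes:
            raise RuntimeError("no healthy nodes left")
        load = {n: 0 for n in self.nodes}
        for c in self.containers.values():
            if c.node in load:
                load[c.node] += 1
        # sorted() makes tie-breaking deterministic
        return min(sorted(load), key=load.get)

    def place(self, name: str, volume: str) -> Container:
        container = Container(name, volume, self._least_loaded())
        self.containers[name] = container
        return container

    def fail_node(self, failed: str) -> list:
        """Remove a node and reschedule its containers; volumes are untouched."""
        self.nodes.discard(failed)
        moved = []
        for c in self.containers.values():
            if c.node == failed:
                c.node = self._least_loaded()  # only the placement changes
                moved.append(c.name)
        return moved


if __name__ == "__main__":
    cluster = Cluster(nodes={"node1", "node2", "node3"})
    for i in (1, 2, 3):
        cluster.place(f"c{i}", volume=f"vol-c{i}")
    moved = cluster.fail_node("node2")
    for name in moved:
        c = cluster.containers[name]
        print(f"{name} rescheduled to {c.node}, still using {c.volume}")
```

In this model the failed node's container comes back with the same volume attached, which is why the operation can finish in seconds or minutes rather than waiting on a data rebuild.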

Introducing REX-Ray

REX-Ray

The leading container storage orchestration engine enabling persistence for cloud native workloads

rexray.codedellemc.com

• Cloud Native Interoperability

• Open Source

• Enterprise Ready
– High Availability
– CLI Intuitiveness
– Effortless Deployment
– Architectural Choices

• Multi-Platform Storage Management
– Storage agnostic (block/file/object)
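
As a rough illustration of what storage orchestration looks like from the Docker side, the sketch below builds (but does not run) the two Docker CLI commands typically involved: creating a volume through a volume driver, and starting a Cassandra container with that volume mounted. The driver name `rexray`, the volume and container names, the 16 GB size, and the `cassandra:3.11` image tag are assumptions for illustration; the driver name in particular depends on how REX-Ray is installed.

```python
import shlex


def volume_create_cmd(volume: str, size_gb: int, driver: str = "rexray") -> list:
    """Build a `docker volume create` command for a driver-backed volume."""
    return [
        "docker", "volume", "create",
        "--driver", driver,          # assumed driver name
        "--opt", f"size={size_gb}",  # size option as passed to the volume driver
        volume,
    ]


def cassandra_run_cmd(container: str, volume: str,
                      driver: str = "rexray",
                      image: str = "cassandra:3.11") -> list:
    """Build a `docker run` command that mounts the volume as Cassandra's data dir."""
    return [
        "docker", "run", "-d",
        "--name", container,
        "--volume-driver", driver,
        "-v", f"{volume}:/var/lib/cassandra",  # data directory of the official image
        image,
    ]


if __name__ == "__main__":
    # Hypothetical names; printing only, nothing is executed.
    print(shlex.join(volume_create_cmd("cass-data-1", size_gb=16)))
    print(shlex.join(cassandra_run_cmd("cassandra-1", "cass-data-1")))
```

Keeping the data directory on a named, driver-backed volume is what lets the same container be started on a different host with its data intact.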

Kubernetes Integration

• ScaleIO is part of the core Kubernetes code and a first-class native storage provider
– ScaleIO can take full advantage of the Kubernetes volume lifecycle features, including dynamic provisioning and storage classes
– ScaleIO driver is embedded in the standard distribution of Kubernetes
– Contributed code from the {code} by Dell EMC team passes “Google” standard of quality
– Opens a new opportunity for those running Kubernetes in on-premises data centers. It allows utilization of your commodity x86 server hardware for very high performance and highly available storage for running stateful apps in containers.
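
Dynamic provisioning with storage classes can be sketched as two Kubernetes objects: a StorageClass that points at the ScaleIO provisioner, and a PersistentVolumeClaim that requests a volume from it. The example below generates both as JSON, which `kubectl apply -f` accepts. It assumes a Kubernetes version that still ships the in-tree ScaleIO plugin (`kubernetes.io/scaleio`); the gateway URL, system name, protection domain, storage pool, secret name, and sizes are placeholders, not values from the talk.

```python
import json


def scaleio_storage_class(name, gateway, system, protection_domain,
                          storage_pool, secret_ref):
    """StorageClass for the in-tree ScaleIO provisioner (placeholder values)."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "kubernetes.io/scaleio",
        "parameters": {
            "gateway": gateway,
            "system": system,
            "protectionDomain": protection_domain,
            "storagePool": storage_pool,
            "secretRef": secret_ref,
            "fsType": "xfs",
        },
    }


def volume_claim(name, storage_class, size_gi):
    """PersistentVolumeClaim that triggers dynamic provisioning from the class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


if __name__ == "__main__":
    # All names and addresses below are hypothetical.
    sc = scaleio_storage_class(
        name="cassandra-sio",
        gateway="https://scaleio-gateway.example.com:443/api",
        system="scaleio",
        protection_domain="pd0",
        storage_pool="sp1",
        secret_ref="sio-secret",
    )
    pvc = volume_claim("cassandra-data-0", sc["metadata"]["name"], size_gi=16)
    print(json.dumps(sc, indent=2))
    print(json.dumps(pvc, indent=2))
```

A Cassandra pod would then mount the claim at its data directory, so the volume follows the pod when the scheduler moves it.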

Why ScaleIO for Cassandra

Near Bare Metal Performance

Scale Out Storage & Hyper-Converged

Flexible IP Connectivity to Container and/or Hypervisor

ScaleIO Near Bare Metal Performance

MAX SDS IOPS PER NODE

ScaleIO Near Bare Metal Performance

RESPONSE TIMES IN SCALEIO - SSD

• Response Time (RT) is a critical aspect of performance

• RT is shown for a single threaded workload

• Write RT is about 30% higher than Read RT

Example Architecture

Container Cluster Building Block

Multiple Availability Zones

Multiple Data Centers

Example Architecture – Container Cluster Mike Janikis

Diagram labels: C1, C2, C3; C2 🔥🔥

Example Architecture – Multiple Zones

Diagram labels: Container Cluster A, Container Cluster B, Container Cluster C; C1, C2, C3

Example Architecture – Multiple Data Centers

Diagram labels: DC1, DC2, DC3; C1, C2, C3
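
On the Cassandra side, spreading data across data centers is usually configured per keyspace with `NetworkTopologyStrategy`. The sketch below only builds the CQL statement as a string. The data center names follow the slide (DC1, DC2, DC3), while the keyspace name and the replication factor of 3 per data center are assumptions for illustration; the data center names must match what the cluster's snitch reports.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in sorted(replicas_per_dc.items()):
        if rf < 1:
            raise ValueError(f"replication factor for {dc} must be >= 1")
        options.append(f"'{dc}': {rf}")
    body = ", ".join(options)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{body}}};"


if __name__ == "__main__":
    # Hypothetical keyspace name and replication factors.
    print(create_keyspace_cql("demo", {"DC1": 3, "DC2": 3, "DC3": 3}))
```

With these inputs the generated statement asks for three replicas in each of the three data centers.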

Demo

codedellemc.com

community.codedellemc.com

@codeDellEMC

blog.codedellemc.com

{code} by Dell EMC is a group of passionate open source engineers and advocates working to build a community around software-based infrastructure.

rexray.codedellemc.com

github.com/codedellemc/labs

HOL01 Use REX-Ray & ScaleIO w/ Docker, Mesos and Kubernetes

Chris Duchesne @ChrisDuchesne github.com/cduchesne

Aaron Spiegel @spiegela github.com/spiegela