Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of Storage as-a-Service
DESCRIPTION
Presented by Kamesh Pemmaraju, Dell
TRANSCRIPT
Best Practices for Ceph-Powered Implementations of Storage as-a-Service
Kamesh Pemmaraju, Sr. Product Mgr, Dell
Ceph Developer Day, New York City, Oct 2014
Outline
• Planning your Ceph implementation
• Ceph use cases
• Choosing targets for Ceph deployments
• Reference architecture considerations
• Dell reference configurations
• Customer case study
• Business requirements
  – Budget considerations, organizational commitment
  – Avoiding lock-in – use open source and industry standards
  – Enterprise IT use cases
  – Cloud applications/XaaS use cases for massive-scale, cost-effective storage
• Sizing requirements
  – What is the initial storage capacity?
  – Is data usage steady-state, or does it spike?
  – What is the expected growth rate?
• Workload requirements
  – Does the workload need high performance, or is it more capacity-focused?
  – What are the IOPS/throughput requirements?
  – What type of data will be stored: ephemeral vs. persistent; object, block, or file?
• Ceph is like a Swiss Army knife – it can be tuned for a wide variety of use cases. Let us look at some of them.
Planning your Ceph Implementation
Ceph is like a Swiss Army Knife – it can fit in a wide variety of target use cases
[Diagram: performance vs. capacity quadrant mapping target use cases. Traditional IT: Virtualization and Private Cloud (traditional SAN/NAS), High Performance (traditional SAN), and NAS & Object Content Store (traditional NAS). Cloud Applications: XaaS Compute Cloud (open source block) and XaaS Content Store (open source NAS/object) – the primary Ceph targets]
USE CASE: OPENSTACK
[Diagram: OpenStack services – Keystone, Swift, Cinder, Glance, and Nova APIs – consuming the Ceph Storage Cluster (RADOS) through the Ceph Object Gateway (RGW) and the Ceph Block Device (RBD) via the Qemu/KVM hypervisor]
USE CASE: OPENSTACK
[Diagram: the same OpenStack integration, highlighting Cinder volumes and Nova ephemeral disks backed by RBD with copy-on-write snapshots]
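For reference, a minimal sketch of the Ceph-side setup behind this integration, assuming illustrative pool and user names (not taken from the slides):

    # Create dedicated pools for Cinder volumes and Glance images
    ceph osd pool create volumes 128
    ceph osd pool create images 128

    # Create a cephx user for Cinder with access to both pools
    ceph auth get-or-create client.cinder mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rx pool=images'

    # cinder.conf then points its RBD driver at the volumes pool, e.g.:
    #   volume_driver = cinder.volume.drivers.rbd.RBDDriver
    #   rbd_pool = volumes
    #   rbd_user = cinder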
USE CASE: OPENSTACK
[Diagram: the Ceph Storage Cluster (RADOS) with RGW and RBD, certified against Red Hat Enterprise Linux OpenStack Platform]
USE CASE: CLOUD STORAGE
[Diagram: a web application's app servers speaking S3/Swift to multiple Ceph Object Gateways (RGW) in front of the Ceph Storage Cluster (RADOS)]
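As a rough sketch, provisioning an object-storage user and exercising the S3 path might look like this (user ID, bucket, and file names are illustrative):

    # Create an S3-capable user on the RADOS Gateway
    radosgw-admin user create --uid=webapp --display-name="Web App"

    # Exercise the S3 API with s3cmd, configured against the RGW endpoint
    s3cmd mb s3://assets
    s3cmd put logo.png s3://assets/logo.png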
USE CASE: WEBSCALE APPLICATIONS
[Diagram: app servers using the native protocol (librados) directly against the Ceph Storage Cluster (RADOS)]
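A minimal sketch of that native path using the rados CLI (pool and object names are illustrative); applications would normally use the librados bindings, but the data path is the same:

    # Create a pool, then store and retrieve an object over the native RADOS protocol
    ceph osd pool create app-data 128
    rados -p app-data put user-profile-42 ./profile.json
    rados -p app-data get user-profile-42 ./profile-copy.json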
USE CASE: PERFORMANCE BLOCK
[Diagram: KVM/RHEV clients in front of a replicated cache pool layered over a replicated backing pool in the Ceph Storage Cluster]
USE CASE: PERFORMANCE BLOCK
[Diagram: cache tier in writeback mode – clients read and write through the cache pool, which flushes to the replicated backing pool]
USE CASE: PERFORMANCE BLOCK
[Diagram: cache tier in read-only mode – writes go directly to the backing pool while reads are served from the cache pool]
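A minimal sketch of wiring up such a cache tier with the Firefly-era CLI (pool names are illustrative, and both pools must already exist):

    # Attach a replicated cache pool in front of the backing pool
    ceph osd tier add backing-pool cache-pool
    ceph osd tier cache-mode cache-pool writeback
    ceph osd tier set-overlay backing-pool cache-pool

    # For the read-only pattern, set the cache mode accordingly:
    #   ceph osd tier cache-mode cache-pool readonly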
USE CASE: ARCHIVE / COLD STORAGE
[Diagram: an application using the Ceph Block Device (RBD) over a replicated cache pool that fronts an erasure-coded backing pool]
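A hedged sketch of building this layout; the profile name, pool names, PG counts, and k/m values are all illustrative:

    # Define an erasure-code profile (4 data chunks + 2 coding chunks)
    ceph osd erasure-code-profile set archive-profile k=4 m=2

    # Create the erasure-coded backing pool and a replicated cache pool in front of it
    ceph osd pool create cold-archive 128 128 erasure archive-profile
    ceph osd pool create hot-cache 128
    ceph osd tier add cold-archive hot-cache
    ceph osd tier cache-mode hot-cache writeback
    ceph osd tier set-overlay cold-archive hot-cache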
USE CASE: DATABASES
[Diagram: MySQL/MariaDB on RHEL 7 using the RBD kernel module to speak the native protocol to the Ceph Storage Cluster (RADOS)]
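A minimal sketch of presenting an RBD image to the database host through the RHEL 7 kernel module (image name, size, and mount point are illustrative):

    # Create a 100 GB image and map it via the kernel client
    rbd create volumes/db-volume --size 102400
    rbd map volumes/db-volume        # appears as e.g. /dev/rbd0

    # Format and mount it for the database
    mkfs.xfs /dev/rbd0
    mount /dev/rbd0 /var/lib/mysql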
USE CASE: HADOOP
[Diagram: POSIX clients on RHEL 7 mounting the Ceph File System (CephFS) via the kernel module, backed by the Ceph Storage Cluster (RADOS)]
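A hedged sketch of the kernel-client mount (monitor address and secret-file path are illustrative):

    # Mount CephFS with the RHEL 7 kernel client
    mkdir -p /mnt/cephfs
    mount -t ceph 10.10.1.11:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret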
Architectural considerations – Redundancy and replication
• Tradeoff between cost and reliability (use-case dependent)
• Use CRUSH configs to map out your failure domains and performance pools (see the sketch after this list)
• Failure domains:
  – Disk (OSD and OS)
  – SSD journals
  – Node
  – Rack
  – Site (replication at the RADOS level, block replication; consider latencies)
• Storage pools:
  – SSD pool for higher performance
  – Capacity pool
• Plan for failure domains of the monitor nodes
• Consider failure replacement scenarios, lowered redundancies, and performance impacts
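As a sketch of the CRUSH side, assuming a default root and rack buckets are already defined in the map:

    # Create a rule that places replicas in distinct racks
    ceph osd crush rule create-simple rack-replicated default rack

    # Point a pool at the new rule (confirm its id with: ceph osd crush rule dump)
    ceph osd pool set volumes crush_ruleset 1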
Server Considerations
• Storage node:
  – One OSD per HDD; plan 1–2 GB RAM and 1 GHz of a core per OSD
  – SSDs for journaling and for SSD pooling (tiering) in Firefly
  – Erasure coding will increase usable capacity at the expense of additional compute load
  – SAS JBOD expanders for extra capacity (beware of extra latency, oversubscribed SAS lanes, and a large footprint for a failure zone)
• Monitor nodes (MON): use an odd number for quorum; the services can be hosted on storage nodes for smaller deployments, but larger installations will need dedicated nodes (a configuration sketch follows below)
• Dedicated RADOS Gateway nodes for large object-store deployments and for federated multi-site gateways
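For illustration, a three-monitor quorum declared in ceph.conf (hostnames and addresses are assumptions):

    [global]
    mon initial members = mon1, mon2, mon3
    mon host = 10.10.1.11, 10.10.1.12, 10.10.1.13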
Networking Considerations
• Dedicated or shared network:
  – Be sure to involve the networking and security teams early when designing your networking options
  – Network redundancy considerations
  – Dedicated client and OSD networks (see the ceph.conf sketch after this list)
  – VLANs vs. dedicated switches
  – 1 Gb/s vs. 10 Gb/s vs. 40 Gb/s
• Networking design:
  – Spine and leaf
  – Multi-rack
  – Core fabric connectivity
  – WAN connectivity and latency issues for multi-site deployments
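A minimal ceph.conf sketch of the dedicated client and OSD networks mentioned above (subnets are illustrative):

    [global]
    # client-facing traffic
    public network = 10.10.1.0/24
    # OSD replication and backfill traffic
    cluster network = 10.10.2.0/24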
Ceph additions coming to the Dell Red Hat OpenStack solution

Pilot configuration components
• Dell PowerEdge R620/R720/R720XD servers
• Dell Networking S4810/S55 switches, 10 Gb
• Red Hat Enterprise Linux OpenStack Platform
• Dell ProSupport
• Dell Professional Services
• Available with or without High Availability

Specs at a glance
• Node 1: Red Hat OpenStack Manager
• Node 2: OpenStack Controller (2 additional controllers for HA)
• Nodes 3-8: OpenStack Nova Compute
• Nodes 9-11: Ceph, 12 x 3 TB raw storage
• Network switches: Dell Networking S4810/S55
• Supports ~170-228 virtual machines

Benefits
• Rapid on-ramp to OpenStack cloud
• Scale up with modular compute and storage blocks
• Single point of contact for solution support
• Enterprise-grade OpenStack software package
Storage bundles
Example Ceph Dell Server Configurations

Performance – 20 TB
• R720XD, 24 GB DRAM
• 10 x 4 TB HDD (data drives)
• 2 x 300 GB SSD (journal)

Capacity – 44 TB / 105 TB*
• R720XD, 64 GB DRAM
• 10 x 4 TB HDD (data drives)
• 2 x 300 GB SSD (journal)
• MD1200: 12 x 4 TB HDD (data drives)

Extra Capacity – 144 TB / 240 TB*
• R720XD, 128 GB DRAM
• 12 x 4 TB HDD (data drives)
• MD3060e (JBOD): 60 x 4 TB HDD (data drives)
• Dell & Red Hat & Inktank have partnered to bring a complete Enterprise-grade storage solution for RHEL-OSP + Ceph
• The joint solution provides:– Co-engineered and validated Reference Architecture – Pre-configured storage bundles optimized for
performance or storage– Storage enhancements to existing OpenStack Bundles– Certification against RHEL-OSP – Professional Services, Support, and Training
› Collaborative Support for Dell hardware customers› Deployment services & tools
What Are We Doing To Enable?
UAB Case Study
Overcoming a data deluge
UAB is a US university that specializes in cancer and genomic research.
• 900 researchers
• Data sets challenging available resources
• Research data scattered everywhere
• Transferring datasets took forever and clogged shared networks
• Distributed data management reduced productivity and put data at risk
• Needed a centralized repository for compliance
Research Computing System (Originally)
A collection of grids, proto-clouds, tons of virtualization, and DevOps
[Diagram: HPC clusters and HPC storage on DDR and QDR InfiniBand with 1 Gb Ethernet to the University Research Network; interactive services plus research data scattered across laptops, thumb drives, and local servers]
Solution: a scale-out storage cloud
Based on OpenStack and Ceph
• Housed and managed centrally, accessible across the campus network
  – File system + cluster can grow as big as you want
  – Provisions from a massive common pool
  – 400+ TB at less than 41¢/GB; scalable to 5 PB
• Researchers gain:
  – Work with larger, more diverse data sets
  – Save workflows for new devices & analysis
  – Qualify for grants due to new levels of protection
• Demonstrating utility with applications:
  – Research storage
  – CrashPlan (cloud backup) on POC
  – GitLab hosting on POC

"We've made it possible for users to satisfy their own storage needs with the Dell private cloud, so that their research is not hampered by IT."
David L. Shealy, PhD
Faculty Director, Research Computing; Chairman, Dept. of Physics
Research Computing System (Today)
Centralized storage cloud based on OpenStack and Ceph
[Diagram: a cloud services layer – a virtualized server and storage computing cloud based on OpenStack, Crowbar, and Ceph – with five Ceph nodes, an OpenStack node, and a POC environment on 10 Gb Ethernet, alongside the HPC clusters and HPC storage on DDR/QDR InfiniBand, all connected to the University Research Network]
Building a research cloud
Project goals extend well beyond data management
• Designed to support the emerging data-intensive scientific computing paradigm:
  – 12 x 16-core compute nodes
  – 1 TB RAM, 420 TB storage
  – 36 TB storage attached to each compute node
• Individually customized test/development/production environments:
  – Direct user control over all aspects of the application environment
  – Rapid setup and teardown
• Growing set of cloud-based tools & services:
  – Easily integrate shareware, open source, and commercial software

"We envision the OpenStack-based cloud to act as the gateway to our HPC resources, not only as the purveyor of services we provide, but also enabling users to build their own cloud-based services."
John-Paul Robinson, System Architect
Research Computing System (Next Gen)
A cloud-based computing environment with high-speed access to dedicated and dynamic compute resources
[Diagram: an expanded cloud services layer – a virtualized server and storage computing cloud based on OpenStack, Crowbar, and Ceph – with multiple OpenStack and Ceph nodes on 10 Gb Ethernet, alongside the HPC clusters and HPC storage on DDR/QDR InfiniBand, connected to the University Research Network]
THANK YOU!
Contact Information
Reach Kamesh for additional information:
@kpemmaraju
http://www.cloudel.com