Ceph Day London 2014 - Ceph Ecosystem Overview
DESCRIPTION
Ross Turk, Red Hat
TRANSCRIPT
Q2 2014
INTRODUCTION TO CEPH
A STORAGE REVOLUTION
PROPRIETARY HARDWARE
PROPRIETARY SOFTWARE
SUPPORT & MAINTENANCE
[Diagram: computers, each with its own disk]
STANDARD HARDWARE
OPEN SOURCE SOFTWARE
ENTERPRISE PRODUCTS & SERVICES
[Diagram: computers, each with its own disk]
Copyright © 2014 by Inktank | Private and Confidential
ARCHITECTURAL COMPONENTS
RGW: A web services gateway for object storage, compatible with S3 and Swift
LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
RBD: A reliable, fully-distributed block device with cloud platform integration
CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
Accessed by: APP, HOST/VM, CLIENT
OBJECT STORAGE DAEMONS
[Diagram: each OSD sits on a filesystem (FS: btrfs, xfs, or ext4) on its own DISK; monitors (M) run alongside]
RADOS CLUSTER
[Diagram: an APPLICATION talking to a RADOS CLUSTER of storage nodes and monitors (M)]
RADOS COMPONENTS
OSDs:
  10s to 10000s in a cluster
  One per disk (or one per SSD, RAID group…)
  Serve stored objects to clients
  Intelligently peer for replication & recovery
Monitors:
  Maintain cluster membership and state
  Provide consensus for distributed decision-making
  Small, odd number
  These do not serve stored objects to clients
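
The "small, odd number" guidance for monitors follows from majority voting. Below is a minimal sketch of that arithmetic, assuming a plain strict-majority quorum. The function names are illustrative and are not Ceph APIs.

```python
def quorum_size(monitors: int) -> int:
    """Smallest number of monitors that forms a strict majority."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """Monitors that can fail while a majority is still reachable."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    # Columns: monitor count, quorum size, failures tolerated.
    for n in range(1, 8):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under this assumption an even count buys nothing: four monitors tolerate the same single failure as three, which is why odd counts are the usual choice.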
WHERE DO OBJECTS LIVE?
[Diagram: an APPLICATION holding an OBJECT, with no obvious way to know which node in the cluster should store it]
A METADATA SERVER?
[Diagram: the APPLICATION (1) asks a central server where the object lives, then (2) goes to the node it was told about]
CALCULATED PLACEMENT
[Diagram: the APPLICATION places object "F" by name; the nodes are labeled A-G, H-N, O-T, U-Z]
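
The name-range idea on this slide can be sketched in a few lines. This is only an illustration of calculated placement by first letter; the node names are hypothetical and the scheme is not what Ceph uses.

```python
# Hypothetical node names keyed by the letter ranges shown on the slide.
RANGES = [
    ("A", "G", "node-1"),
    ("H", "N", "node-2"),
    ("O", "T", "node-3"),
    ("U", "Z", "node-4"),
]


def place_by_name(object_name: str) -> str:
    """Pick a node from the first letter of the object name."""
    first = object_name[:1].upper()
    for low, high, node in RANGES:
        if low <= first <= high:
            return node
    raise ValueError(f"no range covers {object_name!r}")


if __name__ == "__main__":
    print(place_by_name("F"))
```

Any client can compute the location without asking a server, but the ranges are fixed: adding a node means redrawing them, and popular letters create hot spots.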
EVEN BETTER: CRUSH!
[Diagram: OBJECT data, shown as bit patterns, distributed across the RADOS CLUSTER]
CRUSH IS A QUICK CALCULATION
[Diagram: OBJECT data, shown as bit patterns, distributed across the RADOS CLUSTER]
CRUSH: DYNAMIC DATA PLACEMENT
CRUSH:
  Pseudo-random placement algorithm
    Fast calculation, no lookup
    Repeatable, deterministic
  Statistically uniform distribution
  Stable mapping
    Limited data migration on change
  Rule-based configuration
    Infrastructure topology aware
    Adjustable replication
    Weighting
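
The properties listed above (calculated rather than looked up, repeatable, limited migration on change) can be illustrated with a much simpler technique. The sketch below uses rendezvous (highest-random-weight) hashing as a stand-in. It is not CRUSH: it has no rules, topology awareness, or weights, and the node names are hypothetical.

```python
import hashlib


def _score(object_name: str, node: str) -> int:
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{node}/{object_name}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Return the nodes with the highest scores for this object."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replicas")
    ranked = sorted(nodes, key=lambda node: _score(object_name, node), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    nodes = [f"osd.{i}" for i in range(5)]
    names = [f"object-{i}" for i in range(1000)]
    before = {name: place(name, nodes) for name in names}
    after = {name: place(name, nodes + ["osd.5"]) for name in names}
    moved = sum(1 for name in names if before[name] != after[name])
    print(f"{moved} of {len(names)} placements changed after adding a node")
```

Every client that knows the node list computes the same answer, and adding a node only changes placements that now include the new node.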
ACCESSING A RADOS CLUSTER
[Diagram: an APPLICATION linked with LIBRADOS (L) reads and writes an OBJECT in the RADOS CLUSTER directly over a socket]
LIBRADOS: RADOS ACCESS FOR APPS
LIBRADOS:
  Direct access to RADOS for applications
  C, C++, Python, PHP, Java, Erlang
  Direct access to storage nodes
  No HTTP overhead
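
As a rough illustration of direct access from an application, here is a sketch using the Python `rados` bindings. The config path, the pool name `data`, and the object name are assumptions for illustration, and the `__main__` part needs a reachable cluster to do anything.

```python
def store_and_fetch(ioctx, name: str, data: bytes) -> bytes:
    """Write an object and read it back through an open I/O context."""
    ioctx.write_full(name, data)        # replace the whole object
    return ioctx.read(name, len(data))  # read(key, length) from offset 0


if __name__ == "__main__":
    import rados  # Python bindings that ship with Ceph

    # Assumed config path and pool name; adjust for a real cluster.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("data")
        try:
            print(store_and_fetch(ioctx, "greeting", b"hello rados"))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()
```

The helper takes the I/O context as a parameter so it can be exercised against a stand-in object without a cluster.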
THE RADOS GATEWAY
[Diagram: APPLICATIONs speak REST to RADOSGW instances; each RADOSGW uses LIBRADOS to reach the RADOS CLUSTER over a socket]
RADOSGW MAKES RADOS WEBBY
RADOSGW:
  REST-based object storage proxy
  Uses RADOS to store objects
  API supports buckets, accounts
  Usage accounting for billing
  Compatible with S3 and Swift applications
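
Because the gateway speaks the S3 API, an ordinary S3 client should be able to talk to it. The sketch below uses the third-party `boto3` library as one such client. The endpoint URL, credentials, bucket, and key are placeholders, not values from the talk.

```python
def round_trip(s3, bucket: str, key: str, body: bytes) -> bytes:
    """Create a bucket, store one object, and read it back."""
    s3.create_bucket(Bucket=bucket)
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    return s3.get_object(Bucket=bucket, Key=key)["Body"].read()


if __name__ == "__main__":
    import boto3  # third-party S3 client library

    # Placeholder endpoint and credentials for a hypothetical gateway.
    s3 = boto3.client(
        "s3",
        endpoint_url="http://rgw.example.com",
        region_name="us-east-1",
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )
    print(round_trip(s3, "demo-bucket", "hello.txt", b"hello radosgw"))
```

The application code is the same as it would be against any S3-compatible service; only the endpoint changes.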
STORING VIRTUAL DISKS
[Diagram: a VM runs on a HYPERVISOR that uses LIBRBD to keep the VM's disk in the RADOS CLUSTER]
SEPARATE COMPUTE FROM STORAGE
[Diagram: a VM and two HYPERVISORs, each using LIBRBD against the same RADOS CLUSTER]
KERNEL MODULE FOR MAX FLEXIBLE!
[Diagram: a LINUX HOST uses KRBD to reach the RADOS CLUSTER]
RBD STORES VIRTUAL DISKS
RADOS BLOCK DEVICE:
  Storage of disk images in RADOS
  Decouples VMs from host
  Images are striped across the cluster (pool)
  Snapshots
  Copy-on-write clones
  Support in:
    Mainline Linux Kernel (2.6.39+)
    Qemu/KVM, native Xen coming soon
    OpenStack, CloudStack, Nebula, Proxmox
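
To make "images are striped across the cluster" concrete, here is a sketch of how a byte offset in a disk image could map onto fixed-size objects. It assumes the simplest layout, where the image is cut into equal chunks, and an object size of 4 MiB chosen for illustration. The function names are hypothetical and not part of librbd.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size: 4 MiB


def locate(offset: int, object_size: int = OBJECT_SIZE) -> tuple[int, int]:
    """Map an image byte offset to (object index, offset inside that object)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, object_size)


def objects_for_range(offset: int, length: int,
                      object_size: int = OBJECT_SIZE) -> list[int]:
    """Indexes of every object touched by a read or write of `length` bytes."""
    if length <= 0:
        return []
    first, _ = locate(offset, object_size)
    last, _ = locate(offset + length - 1, object_size)
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(locate(10 * 1024 * 1024))
    print(objects_for_range(OBJECT_SIZE - 1, 2))
```

Since each object is placed independently, I/O against one large image ends up spread over many storage nodes.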
SEPARATE METADATA SERVER
[Diagram: a LINUX HOST with the KERNEL MODULE sends metadata operations to the metadata server (M) and moves file data (0110) to and from the RADOS CLUSTER]
SCALABLE METADATA SERVERS
METADATA SERVER
  Manages metadata for a POSIX-compliant shared filesystem
    Directory hierarchy
    File metadata (owner, timestamps, mode, etc.)
  Stores metadata in RADOS
  Does not serve file data to clients
  Only required for shared filesystem
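
For a sense of what "file metadata" covers, the sketch below prints the owner, mode, size, and modification time of a local temporary file using an ordinary POSIX `stat` call. Nothing here is CephFS-specific; on a shared filesystem these are the kinds of fields the metadata server is described as managing.

```python
import os
import stat
import tempfile
import time


def describe(path: str) -> dict:
    """Collect a few POSIX metadata fields for one file."""
    st = os.stat(path)
    return {
        "owner_uid": st.st_uid,
        "mode": stat.filemode(st.st_mode),  # e.g. a string like '-rw-------'
        "size": st.st_size,
        "modified": time.ctime(st.st_mtime),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "example.txt")
        with open(path, "wb") as handle:
            handle.write(b"hello")
        print(describe(path))
```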
CEPH AND OPENSTACK
[Diagram: OPENSTACK components (KEYSTONE, SWIFT, CINDER, GLANCE, NOVA) connect to the RADOS CLUSTER through RADOSGW (built on LIBRADOS), through LIBRBD, and through a HYPERVISOR using LIBRBD]
GETTING STARTED WITH CEPH
Read about the latest version of Ceph. The latest stuff is always at http://ceph.com/get
Deploy a test cluster using ceph-deploy. Read the quick-start guide at http://ceph.com/qsg
Read the rest of the docs! Find docs for the latest release at http://ceph.com/docs
Ask for help when you get stuck! Community volunteers are waiting for you at http://ceph.com/help