CEPHALOPODS AND SAMBA
IRA COOPER – SNIA SDC 2016.09.18
DISCLAIMER
● These opinions are my opinions.
● They do not represent promises from:
– Red Hat Inc.
– Samba Team
– Me
– My Mom
AGENDA
● CEPH Architecture.
– Why CEPH?
– RADOS
– RGW
– CEPHFS
● Current Samba integration with CEPH.
● Future directions.
● Maybe a demo?
CEPH MOTIVATING PRINCIPLES
● All components must scale horizontally.
● There can be no single point of failure.
● The solution must be hardware agnostic.
● Should use commodity hardware.
● Self-manage whenever possible.
● Open source.
ARCHITECTURAL COMPONENTS
● RGW: A web services gateway for object storage, compatible with S3 and Swift
● LIBRADOS: A library allowing apps to directly access RADOS (C, C++, Java, Python, Ruby, PHP)
● RADOS: A software-based, reliable, autonomous, distributed object store comprised of self-healing, self-managing, intelligent storage nodes and lightweight monitors
● RBD: A reliable, fully-distributed block device with cloud platform integration
● CEPHFS: A distributed file system with POSIX semantics and scale-out metadata management
● Consumers shown in the diagram: APP, HOST/VM, CLIENT
RADOS
● Flat object namespace within each pool
● Rich object API (librados)
– Bytes, attributes, key/value data
– Partial overwrite of existing data
– Single-object compound operations
– RADOS classes (stored procedures)
● Strong consistency (CP system)
● Infrastructure aware, dynamic topology
● Hash-based placement (CRUSH)
● Direct client to server data path
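The point of hash-based placement is that any client can compute where an object lives from its name and the current list of storage nodes, with no lookup table in the data path. The sketch below illustrates that idea with rendezvous (highest-random-weight) hashing. It is a stand-in chosen for brevity, not the CRUSH algorithm itself, and the OSD names and the `place` function are hypothetical.

```python
import hashlib


def place(object_name, osds, replicas=3):
    """Return the `replicas` OSD ids that score highest for this object.

    Rendezvous hashing: every (object, OSD) pair gets a deterministic
    pseudo-random score, and the top scorers hold the object.
    This is an illustration only; real CRUSH also models failure
    domains and device weights.
    """
    def score(osd):
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster of five OSDs.
osds = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]
before = place("pool/object-a", osds)

# Removing an OSD that does not hold the object leaves its placement unchanged.
spare = [o for o in osds if o not in before][0]
after = place("pool/object-a", [o for o in osds if o != spare])
```

Because the scores depend only on the object name and the OSD id, every client arrives at the same answer independently, which is what allows a direct client to server data path.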
RADOS CLUSTER
[Diagram: an APPLICATION talking directly to a RADOS CLUSTER that contains monitor nodes (M)]
OBJECT STORAGE DAEMONS
[Diagram: each OSD sits on a local file system (FS: xfs, btrfs, or ext4) on its own DISK; monitor nodes (M) are shown alongside]
RADOSGW MAKES RADOS WEBBY
RADOSGW:
● REST-based object storage proxy
● Uses RADOS to store objects
– Stripes large RESTful objects across many RADOS objects
– Space efficient for small objects
● API supports buckets, accounts
● Usage accounting for billing
● Compatible with S3 and Swift applications
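Striping a large REST object across many RADOS objects can be pictured as cutting the payload into fixed-size chunks, each stored under its own object name. The sketch below is a minimal illustration under stated assumptions: the `<name>.<index>` naming scheme, the 4 MiB default chunk size, and the `stripe` and `reassemble` helpers are all hypothetical and do not reflect RGW's actual on-disk layout.

```python
def stripe(name, data, chunk_size=4 * 1024 * 1024):
    """Split `data` into chunks, returning {chunk_object_name: bytes}.

    The naming scheme here is an assumption made for illustration.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Zero-padded index so that names sort in chunk order.
        chunks[f"{name}.{index:08d}"] = data[offset:offset + chunk_size]
    return chunks


def reassemble(chunks):
    """Concatenate chunks in name order to recover the original bytes."""
    return b"".join(chunks[key] for key in sorted(chunks))


# A 10-byte payload with a 4-byte chunk size yields three chunks.
parts = stripe("bucket/video.mp4", b"0123456789", chunk_size=4)
```

Each chunk can then be placed on a different set of storage nodes, so a large upload or download is spread across the cluster instead of landing on one node.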
THE RADOS GATEWAY
[Diagram: APPLICATIONs speak REST to RADOSGW instances; each RADOSGW uses LIBRADOS over a socket to reach the RADOS CLUSTER and its monitor nodes (M)]
MULTI-SITE OBJECT STORAGE
[Diagram: two sites, US-EAST and EU-WEST, each with a WEB APPLICATION and APP SERVER on top of a CEPH OBJECT GATEWAY (RGW) backed by a CEPH STORAGE CLUSTER]
FEDERATED RGW
● Zones and regions
– Topologies similar to S3 and others
– Global bucket and user/account namespace
● Cross data center synchronization
– Asynchronously replicate buckets between regions
● Read affinity
– Serve local data from local DC
– Dynamic DNS to send clients to closest DC
SEPARATE METADATA SERVER
[Diagram: a LINUX HOST running the KERNEL MODULE sends data and metadata along separate paths into the RADOS CLUSTER, which contains monitor nodes (M)]
SCALABLE METADATA SERVERS
METADATA SERVER
● Manages metadata for a POSIX-compliant shared filesystem
– Directory hierarchy
– File metadata (owner, timestamps, mode, etc.)
● Clients stripe file data in RADOS
– MDS not in data path
● MDS stores metadata in RADOS
– Key/value objects
● Dynamic cluster scales to 10s or 100s
● Only required for shared filesystem
METADATA SERVERS – FUTURE
METADATA SERVER
● Sharding of the MDS (MetaData Server)
– More scalable performance.
● Active – Passive Failover
– Allowing for better availability
● Both features are in the codebase
– In active development
– Not production ready
SAMBA - TODAY
ARCHITECTURAL COMPONENTS
[Diagram: the same component stack (RGW, LIBRADOS, RADOS, RBD, CEPHFS), now with SAMBA shown alongside APP, HOST/VM, and CLIENT]
SAMBA INTEGRATION
● vfs_ceph
– Since 2013.
– Used as the outline for vfs_glusterfs
– Been in testing in teuthology for a while now.
– Patches are up to use it as a testbed for statx.
● ACL Integration?
– Patchset for POSIX ACLs committed for Samba 4.5
● Thank you to Zheng Yan
– Work on RichACLs is ongoing.
CTDB INTEGRATION
● fcntl locks
– Does any filesystem get this right at the start?
– 0/2 so far.
– Ceph's have been fixed; they work for CTDB.
● If you tweak the timeouts.
– But these tweaks aren't production ready!
● Both kernel and FUSE clients have been tested
– CephFS team recommends ceph-fuse.
– That's what our initial integration used.
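The lock in question is an exclusive fcntl lock on a file that every node can see through the shared filesystem: whichever process holds it wins, and everyone else must be refused. The sketch below shows that behaviour in Python, as an illustration only and not CTDB's own C implementation. A local temporary file stands in for a file on a CephFS mount, and the `reclock` file name and helper functions are hypothetical.

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def try_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on `path`.

    Returns the open file object on success (keep it open to hold the
    lock) or None if another process already holds the lock.
    """
    handle = open(path, "a+b")
    try:
        fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


# Source for a second process that attempts the same lock.
CHILD = (
    "import fcntl, sys\n"
    "f = open(sys.argv[1], 'a+b')\n"
    "try:\n"
    "    fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
    "sys.exit(0)\n"
)


def other_process_can_lock(path):
    """Run a second Python process and report whether it got the lock."""
    return subprocess.run([sys.executable, "-c", CHILD, path]).returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    lock_path = os.path.join(tmp, "reclock")   # hypothetical lock file name
    holder = try_lock(lock_path)
    contended = other_process_can_lock(lock_path)   # expected False
    holder.close()                                  # closing releases the lock
    released = other_process_can_lock(lock_path)    # expected True
```

On a clustered filesystem the hard part is making the same refusal hold when the two processes run on different nodes, and making the lock go away promptly when its holder dies. That is where the timeouts come in.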
CTDB INTEGRATION
● CTDB “fcntl lock” dependency removal.
– etcd
● Battle tested.
● Push other config info into etcd?
– nodes
– public_addresses
● The demo will show basic etcd integration.
– Thank you to Jose Rivera for his work here.
– Zookeeper
● Much the same as etcd for this use.
● Not working on it now.
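Replacing the fcntl lock with a coordination service comes down to one primitive: take a key only if nobody else holds it, and have it expire if the holder stops renewing. The sketch below models that primitive with an in-memory class so it runs anywhere. It is not the etcd or Zookeeper API, and `LeaseStore`, `FakeClock`, the key name, and the node names are all hypothetical.

```python
import time


class LeaseStore:
    """In-memory stand-in for a coordination service (NOT the etcd API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._keys = {}  # key -> (owner, expiry time)

    def acquire(self, key, owner, ttl):
        """Take `key` if it is free, expired, or already ours.

        Calling this again as the current owner renews the lease.
        A real service would perform this check-and-set as one
        atomic transaction on the server.
        """
        now = self._clock()
        current = self._keys.get(key)
        if current is not None and current[0] != owner and current[1] > now:
            return False
        self._keys[key] = (owner, now + ttl)
        return True

    def release(self, key, owner):
        """Drop `key`, but only if `owner` still holds it."""
        current = self._keys.get(key)
        if current is not None and current[0] == owner:
            del self._keys[key]
            return True
        return False


class FakeClock:
    """Manually advanced clock so the example needs no sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
store = LeaseStore(clock)
first = store.acquire("ctdb/reclock", "node-a", ttl=10)    # node-a wins
second = store.acquire("ctdb/reclock", "node-b", ttl=10)   # held, so refused
clock.now = 11.0                                           # node-a stops renewing
third = store.acquire("ctdb/reclock", "node-b", ttl=10)    # lease expired
```

The expiry is what replaces the filesystem's job of noticing a dead lock holder: a node that crashes stops renewing, and the lock becomes available after the TTL with no filesystem involved.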
DEMO
FUTURE DIRECTIONS – OBJECT
● RGW
– Export object data as files.
– Export files as object data?
● Not today in ceph.
– Integrate where?
● S3
● RADOS
● Librgw
● CephFS / vfs_ceph
● S3
– Not being worked on at this time.
● Non-file-system-based locking makes all this possible.
QUESTIONS?