Ceph Day Berlin: Ceph and iSCSI in a High Availability Setup
Linux höchstpersönlich.
Highly Available iSCSI Storage with Ceph
Stephan Seitz <[email protected]>
Robert Sander <[email protected]>
Ceph and iSCSI in a High Availability Setup
Who?
➞ 20 years of experience with Linux servers and mail
➞ IT consulting and 24/7 Linux support with more than 20 employees
➞ Incorporated ISP since 1992
  ➞ jpberlin.de
  ➞ mailbox.org
➞ Daily insights into the IT of small, medium and large businesses
Goal: Connecting Classic and Modern Storage Concepts
Intro:
A quick peek at the technologies involved
Ceph
iSCSI
A peek at iSCSI
➞ Easy to implement
➞ Software-defined block storage
➞ Established failover mechanisms
  ➞ ALUA
  ➞ and proprietary path prioritization (EMC, NetApp, …)
Why Ceph + iSCSI?
➞ Modern and classic storage concepts combined
➞ Not every platform supports RADOS Block Devices (RBD)
➞ iSCSI implementations are widely available
➞ iSCSI targets are usually not implemented over portals of several hosts
➞ High availability is only available in hardware
  ➞ using a shared backplane
➞ Linux TGT has native RBD support in userland
  ➞ needs bs_rbd
  ➞ in mainline since 02/2013
Implementation
Test
Prerequisites
➞ Running Ceph cluster
  ➞ Tested with Ceph Firefly 0.80.5 and Ceph Giant 0.87
➞ Two or more hosts as Ceph ↔ iSCSI "bridges"
  ➞ Two NICs for backing store (Ceph) and iSCSI portal
  ➞ Tested on Ubuntu 14.04 LTS / Ceph 0.80.5 as well as Ceph 0.87 and tgt 1.0.43
➞ Hosts / clusters / applications
  ➞ with iSCSI support
Concept
➞ Ceph accesses objects in parallel on OSDs
➞ Example with multiple RadosGWs and S3/HTTP load balancer
Concept
➞ Parallel access also works with TGT iSCSI target
Preparation of Ceph ↔ iSCSI bridge
➞ Installation on Ubuntu 14.04 LTS
tgt1 # aptitude install ceph tgt
➞ Connection to the Ceph cluster (configuration)
tgt1 # cat /etc/ceph/ceph.conf
[global]
mon_host = 192.168.1.247,192.168.1.248,192.168.1.249
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

[client]
rbd_cache = false   # needed on the iSCSI "bridge" hosts
Preparation of Ceph ↔ iSCSI bridge
➞ Connection to the Ceph cluster (authentication)
tgt1 # cat /etc/ceph/ceph.client.admin.keyring
[client.admin]
    key = HeinleinSupportHeinleinSupportHeinlein==
➞ Better: dedicated account for the target pool via "ceph auth"
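The dedicated account recommended on this slide could be created roughly as follows. This is a minimal sketch, not the speakers' procedure: the client name "tgt", the pool name "rbd", the capability strings, and the function name are assumptions for illustration. The CEPH variable exists only so the command can be replaced for a dry run.

```shell
# Sketch: create a cephx account restricted to the pool backing the iSCSI LUNs.
# Assumptions (not from the slides): client name "tgt", pool "rbd", caps as shown.
CEPH="${CEPH:-ceph}"   # set CEPH=echo to print the command instead of running it

create_tgt_account() {
    # $1 = client name (without the "client." prefix), $2 = pool name
    "$CEPH" auth get-or-create "client.$1" \
        mon 'allow r' \
        osd "allow rwx pool=$2" \
        -o "/etc/ceph/ceph.client.$1.keyring"
}

# Example (run on a host that holds admin credentials):
#   create_tgt_account tgt rbd
```

With such an account in place, the id=admin part of the tgt bsopts string would presumably become id=tgt, and the admin keyring would no longer need to be present on the bridge hosts.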
Preparation of Ceph ↔ iSCSI bridge
➞ Create a Ceph RBD as the base for the iSCSI LUN
➞ Example: a 3 GB RBD "ceph-tgt-shared"
  ➞ As usual, the storage space is not allocated but thin-provisioned
  ➞ The RBD will not be used with the kernel driver and does not get mapped
  ➞ Reminder: RBD caching has to be deactivated
tgt1 # rbd create --size 3072 ceph-tgt-shared
tgt1 # rbd info ceph-tgt-shared
rbd image 'ceph-tgt-shared':
        size 3072 MB in 768 objects
        order 22 (4096 kB objects)
        block_name_prefix: rb.0.56c7.238e1f29
        format: 1
Preparation of Ceph ↔ iSCSI bridge
➞ Create new iSCSI portal
  ➞ IQN structure: iqn.YYYY-MM.NAMING_AUTHORITY:UNIQUE_NAME
tgt1 # tgtadm --lld iscsi --op new --mode target \
         --tid 1 \
         -T iqn.2005-10.de.heinlein-support:storage.s0.ceph.shared.tgt
➞ Create LUN #1 with the RBD as storage backend
tgt1 # tgtadm --lld iscsi --mode logicalunit --op new \
         --tid 1 --lun 1 --bstype rbd --backing-store ceph-tgt-shared \
         --bsopts "conf=/etc/ceph/ceph.conf;id=admin"
➞ Bind portal and target
tgt1 # tgtadm --lld iscsi --op bind --mode target --tid 1 \
         -I ALL
Preparation of Ceph ↔ iSCSI bridge
➞ Repeat the Ceph and TGT setup on the additional "bridge" hosts
➞ Caution: the tgt configuration cannot (as of version 1.1x) be created completely by tgt-admin
  ➞ The usual way, tgt-admin --dump > /etc/tgt/conf.d/targets.conf, does not contain the parameters bstype and bsopts.
  ➞ Write the target configuration manually
Preparation of Ceph ↔ iSCSI bridge
➞ Write target configuration
tgt1 # cat /etc/tgt/conf.d/ceph-tgt-shared.conf
default-driver iscsi   # better located in /etc/tgt/targets.conf
<target iqn.2005-10.de.heinlein-support:storage.s0.ceph.shared.tgt>
    driver iscsi
    bs-type rbd
    backing-store rbd/ceph-tgt-shared
</target>
➞ Create an identical configuration on all "bridge" hosts
➞ Right now we have 2+ iSCSI targets with one Ceph RBD as LUN
➞ Storage part completed :)
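Because the configuration has to be identical on every bridge host, it can help to generate the stanza rather than type it on each machine. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the directive spellings bs-type and backing-store are the hyphenated forms used by tgt-admin configuration files.

```shell
# Sketch: print the tgt target stanza from this slide so that every
# "bridge" host can receive byte-identical configuration.
# The function name is hypothetical.
write_target_conf() {
    # $1 = target IQN, $2 = pool/image of the backing RBD
    cat <<EOF
<target $1>
    driver iscsi
    bs-type rbd
    backing-store $2
</target>
EOF
}

# Example:
#   write_target_conf iqn.2005-10.de.heinlein-support:storage.s0.ceph.shared.tgt \
#       rbd/ceph-tgt-shared > /etc/tgt/conf.d/ceph-tgt-shared.conf
```

Comparing a checksum of the generated file across the bridge hosts is one simple way to confirm that they really match.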
Using the iSCSI targets
➞ Installation on Ubuntu 14.04 LTS
client # apt-get install open-iscsi multipath-tools
client # service open-iscsi start
client # service multipath-tools start
➞ Discover iSCSI targets
client # iscsiadm -m discoverydb -t sendtargets \
           --portal 192.168.100.88 --discover
  ➞ Repeat for every iSCSI "bridge" host
➞ Quick check: every iSCSI host should be listed with its IP and an identical IQN
client # iscsiadm -m node
192.168.100.88:3260,1 iqn.2005-10.de.heinlein-support:st..
192.168.100.91:3260,1 iqn.2005-10.de.heinlein-support:st..
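Since discovery has to be repeated for every bridge host, a small loop saves typing. This is a sketch only: the function name and the ISCSIADM override variable are assumptions added for illustration, while the iscsiadm options are the ones used in the discovery step on this slide.

```shell
# Sketch: run sendtargets discovery against every "bridge" portal.
ISCSIADM="${ISCSIADM:-iscsiadm}"   # set ISCSIADM=echo for a dry run

discover_all() {
    # arguments: one portal address per bridge host
    for portal in "$@"; do
        "$ISCSIADM" -m discoverydb -t sendtargets \
            --portal "$portal" --discover || return 1
    done
}

# Example with the two portals from the node listing:
#   discover_all 192.168.100.88 192.168.100.91
```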
Multipath Configuration
➞ Option 1: focus on performance
client # cat /etc/multipath.conf
defaults {
    user_friendly_names   yes
    polling_interval      2
    path_grouping_policy  group_by_serial
    features              "1 queue_if_no_path"
    path_checker          directio
    rr_min_io             100
    failback              immediate
    no_path_retry         5   # 5 retries, possibly "queue"
}
[...]
➞ Alternatively, configure every WWID separately in multipaths { multipath { ...
Multipath Configuration
➞ Option 2: focus on availability
client # cat /etc/multipath.conf
defaults {
    user_friendly_names   yes
    polling_interval      2
    path_grouping_policy  failover
    features              "1 queue_if_no_path"
    path_checker          directio
    rr_min_io             100
    failback              immediate
    no_path_retry         fail   # possibly 5 (retries)
}
[...]
➞ Alternatively, configure every WWID separately in multipaths { multipath { ...
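The per-WWID alternative mentioned on both option slides could look roughly like the stanza printed by the sketch below. The function name is hypothetical, and which settings to override per LUN is an assumption. The WWID in the example is the one that appears in the concluding multipath test of this deck.

```shell
# Sketch: print a per-LUN "multipaths" section for /etc/multipath.conf.
# The function name is hypothetical.
multipath_stanza() {
    # $1 = WWID, $2 = alias, $3 = path_grouping_policy (group_by_serial | failover)
    cat <<EOF
multipaths {
    multipath {
        wwid                  $1
        alias                 $2
        path_grouping_policy  $3
    }
}
EOF
}

# Example for the test LUN:
#   multipath_stanza 33000000100000001 ceph-tgt-shared failover
```

Setting the policy per WWID allows one LUN to favor throughput (group_by_serial) while another favors availability (failover) on the same client.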
Multipath Configuration and iSCSI Login
➞ Activate the multipath configuration…
client # echo reconfigure | multipathd -k
  ➞ Or: restart multipath-tools
➞ … and log in to the iSCSI portals
client # iscsiadm -m node --login
Logging in to [iface: default, target: iqn.2005-10.de.he..
Logging in to [iface: default, target: iqn.2005-10.de.he..
Login to [iface: default, target: iqn.2005-10.de.heinlei..
Login to [iface: default, target: iqn.2005-10.de.heinlei..
➞ Login should be successful for every portal, i.e. every iSCSI "bridge" host
➞ Done!
Concluding Test
➞ Check the multipath configuration of the iSCSI LUN
client # multipath -ll
ceph-tgt-shared (33000000100000001) dm-3 IET,VIRTUAL-DISK
size=3.0G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 28:0:0:1 sdf 8:80 active ready running
  `- 29:0:0:1 sdg 8:96 active ready running
➞ Depending on /etc/multipath.conf, either a device file /dev/mapper/mpath0 or the alias /dev/mapper/ceph-tgt-shared exists
➞ Finito!
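The check on this slide can be automated by counting the paths that multipath reports as "active ready" for one map. The sketch below reads the multipath output from standard input. The function name is hypothetical, and the idea that a healthy setup shows one path per bridge host is an assumption drawn from the two-bridge example.

```shell
# Sketch: count the "active ready" paths of one multipath map.
# Reads `multipath -ll` output from stdin; the function name is hypothetical.
count_active_paths() {
    # $1 = map name or alias, e.g. ceph-tgt-shared
    awk -v map="$1" '
        # A map header line contains the device-mapper name, e.g. " dm-3 ".
        / dm-[0-9]+ /            { inmap = (($1 "") == (map "")); next }
        # Path lines of the selected map that are up and usable.
        inmap && /active ready/  { n++ }
        END                      { print n + 0 }
    '
}

# Example:
#   multipath -ll | count_active_paths ceph-tgt-shared
```

A result lower than the number of bridge hosts suggests that a login to one of the portals is missing or that a path has failed.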
Another class of initiators: hypervisors...
➞ Successful test with the iSCSI initiator on XenServer 6.5
  ➞ [x] activate multipath (via XenCenter)
  ➞ Proceed with the Ubuntu 14.04 LTS steps:
    ➞ Discover all portals via iscsiadm
    ➞ Extend /etc/multipath.conf with device vendor "IET*" and product "VIRTUAL*". Add path_grouping_policy group_by_serial or failover.
    ➞ Restart or reconfigure multipathd
  ➞ Add a new Storage Repository (SR) via one portal
    ➞ The generated description of the SR shows only one portal
    ➞ A look at the CLI reveals that multipath is used with all portals
    ➞ It can be necessary to log in to all portals when starting with the first SR:
      iscsiadm -m node -T iqn.2005-10.de.heinlein-support:st.. \
          --login
Final Considerations
➞ Linux TGT is not limited to iSCSI (SCSI over IP)
  ➞ RBD should be exportable via FCoE or iSER (via RDMA)
  ➞ Export via FC-HBA needs to be tested
➞ Resizing?!?
  ➞ Not trivial
    ➞ Resize the RBD
    ➞ Stop all portals
    ➞ Start only one portal, then the rest
    ➞ Reread the portal LUN with iscsiadm, or use "discover" as the forced method
    ➞ iscsiadm "logout" and "login"
    ➞ Restart multipathd or "reconfigure"
    ➞ Resize the content of the LUN (pv, partition, filesystem, etc...)
  ➞ Compared to a hardware SAN: similar effort
Final Considerations
➞ Kernel RBD is more performant
  ➞ But causes split brain and inconsistencies with multipathing
  ➞ tgt shows adequate performance in userland
  ➞ I/O caches have to be deactivated
    ➞ Kernel RBD loses its performance gain
    ➞ cannot be deactivated completely
➞ LIO is the more current target framework
  ➞ stgt runs 100% in userland, LIO is based in the kernel
  ➞ Mike Christie / Red Hat develop on the LIO framework
  ➞ Data synchronisation is currently unsolved for PREEMPT/ABORT, ALUA status, etc.
➞ FC target
  ➞ qla2xxx in stgt is in the proof-of-concept phase
  ➞ qla4xxx not included
Performance!
24 OSD – 3 Nodes – 3 Journal SSD
2 iSCSI Bridge Nodes
10 GBit/s Ceph – 10 GBit/s iSCSI
XenServer 6.5 – Ubuntu 14.04 VM
iozone3 in one VM
References
➞ ceph – Storage Cluster Quick Start (ceph-deploy)
  ➞ http://ceph.com/docs/master/start/quick-ceph-deploy/
➞ Linux SCSI target framework
  ➞ http://stgt.sourceforge.net/
➞ Generic SCSI Target Subsystem for Linux
  ➞ http://scst.sourceforge.net/
➞ Multipath I/O
  ➞ http://en.wikipedia.org/wiki/Multipath_I/O
  ➞ http://en.wikipedia.org/wiki/Linux_DM_Multipath
  ➞ /usr/share/doc/multipath-tools/examples/multipath.conf.annotated.gz
➞ We are here to help you:
  ➞ Robert Sander
  ➞ Mail: [email protected]
  ➞ Phone: 030/40 50 51 – 43
➞ In case of emergency:
  ➞ Heinlein Support 24/7 emergency hotline: 030/40 505 - 110
OpenStack DACH Day 2015
June 11, 2015 Urania Berlin
www.openstack-dach.org
Heinlein Support helps with all questions about Linux servers
HEINLEIN AKADEMIE: By professionals for professionals. We teach the top 10% of knowledge: concentrated expertise and extensive practical experience.
HEINLEIN CONSULTING: The backup for your Linux administration. LPIC-2 professionals resolve emergencies in the CompetenceCall, also under SLAs with 24/7 availability.
HEINLEIN HOSTING: Individual business hosting with perfect maintenance by our professionals. Security and availability come first.
HEINLEIN ELEMENTS: Hardware and software appliances, and software designed specifically for server operation, all centered on email.