Ceph Object Store
TRANSCRIPT
Ceph Object Store — or: How to Store Terabytes of Documents
Daniel Schneller [email protected] @dschneller
Who are we?
@dschneller @drivebytesting
What do we do?
Where did we come from?
Why did we want to leave?
Where did we want to go?
And how to get there?
Ceph Basics
“Unified, distributed storage system designed for excellent performance, reliability and scalability”
Highly scalable
Commodity Hardware
No Single Point of Failure
Ceph Components
OSD Daemons — Object Storage Device daemons
CRUSH Algorithm — intelligent object distribution without central metadata
RADOS — Reliable Autonomic Distributed Object Store
Objects
Data Pools — containers for objects with the same requirements
Placement Groups
Monitors — first point of contact for clients
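How these pieces fit together, as a minimal sketch against a running cluster (the pool name "documents", the PG count, and the object are illustrative, not from the talk):

> # Create a replicated pool with 128 placement groups
> ceph osd pool create documents 128 128
> # Store an object via RADOS and read it back
> rados -p documents put report.pdf ./report.pdf
> rados -p documents get report.pdf /tmp/report.pdf
> # Show where CRUSH places the object: the PG and the acting
> # OSD set are computed, no central metadata server is asked
> ceph osd map documents report.pdf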
Hardware Setup
[Diagram: layered setup — Bare Metal Hardware; Storage, Compute, and Network Virtualization; Virtual Infrastructure; Application]
Baseline Benchmarks — defining expectations
Storage — disk I/O per node
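One way to establish such a per-node baseline, as a simple sketch (path and sizes are placeholders; a tool like fio gives more detailed numbers):

> # Sequential write throughput, bypassing the page cache
> dd if=/dev/zero of=/var/lib/ceph/bench.img bs=4M count=1024 oflag=direct
> # Sequential read throughput of the same file
> dd if=/var/lib/ceph/bench.img of=/dev/null bs=4M iflag=direct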
Network
IEEE 802.3ad != IEEE 802.3ad
> cat /etc/network/interfaces
...
auto bond2
iface bond2 inet manual
    bond-slaves p2p3 p2p4                  # interfaces to bond
    bond-mode 802.3ad                      # activate LACP
    bond-miimon 100                        # monitor link health
    bond-xmit_hash_policy layer3+4         # use Layer 3+4 for link selection
    pre-up ip link set dev bond2 mtu 9000  # set Jumbo Frames
auto vlan-ceph-clust
iface vlan-ceph-clust inet static
    pre-up ip link add link bond2 name vlan-ceph-clust type vlan id 105
    pre-up ip link set dev vlan-ceph-clust mtu 9000  # Jumbo Frames
    post-down ip link delete vlan-ceph-clust
    address ...
    netmask ...
    network ...
    broadcast ...
...
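To check that the bond actually negotiated LACP and that jumbo frames are active, the bonding driver state can be inspected (a sketch, interface names as above):

> cat /proc/net/bonding/bond2   # bond mode, LACP partner details, slave link status
> ip link show bond2            # should report mtu 9000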
IEEE 802.3ad != IEEE 802.3ad
[node01] > iperf -s -B node01.ceph-cluster
[node02] > iperf -c node01.ceph-cluster -P 2
[node03] > iperf -c node01.ceph-cluster -P 2
------------------------------------------------------------
Server listening on TCP port 5001
Binding to local address node01.ceph-cluster
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 10.102.5.11 port 5001 connected with 10.102.5.12 port 49412
[  5] local 10.102.5.11 port 5001 connected with 10.102.5.12 port 49413
[  6] local 10.102.5.11 port 5001 connected with 10.102.5.13 port 59947
[  7] local 10.102.5.11 port 5001 connected with 10.102.5.13 port 59946
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec   342 MBytes   286 Mbits/sec
[  5]  0.0-10.0 sec   271 MBytes   227 Mbits/sec
[SUM]  0.0-10.0 sec   613 MBytes   513 Mbits/sec
[  6]  0.0-10.0 sec   293 MBytes   246 Mbits/sec
[  7]  0.0-10.0 sec   338 MBytes   283 Mbits/sec
[SUM]  0.0-10.0 sec   631 MBytes   529 Mbits/sec
IEEE 802.3ad != IEEE 802.3ad
??? — only about 500 Mbit/s in total per client, far below what the bonded links should deliver.
Measure! …and understand the results. With 802.3ad bonding, any single TCP flow is pinned to one physical link, and how flows are distributed across the links depends on the transmit hash policy on each end; server and switch implementations differ, hence "IEEE 802.3ad != IEEE 802.3ad".
CenterDevice
Overall Architecture
[Diagram: four bare-metal nodes (Node 1–4), each running a set of OSD daemons (OSD 1 … OSD 48 across the cluster), together forming the Ceph layer]
Overall Architecture
[Diagram: the same four bare-metal nodes, each now also running a Rados GW instance alongside its OSDs]
Overall Architecture
[Diagram: HAProxy instances in VMs (VM 1 … VM n) placed in front of the Rados GW instances on the four bare-metal Ceph nodes]
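A minimal sketch of such an HAProxy configuration, assuming the Rados GW instances listen on the node addresses seen in the benchmark above (port, health check, and the node04 address are illustrative):

frontend radosgw_frontend
    bind *:80
    mode http
    default_backend radosgw_backend

backend radosgw_backend
    mode http
    balance roundrobin
    option httpchk GET /
    server node01 10.102.5.11:7480 check
    server node02 10.102.5.12:7480 check
    server node03 10.102.5.13:7480 check
    server node04 10.102.5.14:7480 check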
Overall Architecture
[Diagram: complete stack — each VM (VM 1 … VM n) runs HAProxy, the CenterDevice application, and the Swift interface, on top of the bare-metal Ceph cluster with its OSDs and Rados GWs]
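Since the application talks to the Rados GW via the Swift API, an upload looks roughly like this (v1-style auth; endpoint, credentials, and container are placeholders):

> # Obtain a token; the response carries X-Storage-Url and X-Auth-Token
> curl -i -H "X-Auth-User: centerdevice:swift" -H "X-Auth-Key: SECRET" https://storage.example.com/auth/v1.0
> # Upload a document into a container using the returned values
> curl -i -X PUT -H "X-Auth-Token: TOKEN" "$STORAGE_URL/documents/report.pdf" --data-binary @report.pdf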
Advantages
Disadvantages
Caveats
CephFS — not recommended for production data.
Scrubbing — integrity comes at a price. But you can take action!
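Scrub pressure can be shifted to off-peak hours and throttled via OSD options, for example (values are illustrative; exact option availability depends on the Ceph release):

[osd]
osd scrub load threshold = 0.5    # skip new scrubs while system load is high
osd scrub begin hour = 1          # restrict scrubbing to a nightly window
osd scrub end hour = 6
osd deep scrub interval = 604800  # deep-scrub each PG once per week (seconds)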
Future
Rados Gateway
Ceph Caching Tier (see the sketch after this list)
SSD based Journaling
10GBit/s Networking
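A sketch of how such a caching tier could be wired up, assuming an SSD-backed pool named "ssd-cache" in front of the existing "documents" pool (both names hypothetical):

> # Attach the cache pool and serve reads/writes from it
> ceph osd tier add documents ssd-cache
> ceph osd tier cache-mode ssd-cache writeback
> ceph osd tier set-overlay documents ssd-cache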
In Closing
Slides on Slideshare — http://www.slideshare.net/dschneller
Handout on CenterDevice — https://public.centerdevice.de/399612bf-ce31-489f-bd58-04e8d030be52
@drivebytesting @dschneller
The End — Daniel Schneller [email protected] @dschneller