Mellanox for OpenStack - OpenStack Latest Information Seminar, October 2014
DESCRIPTION
Speaker: Tomonaga, Mellanox Technologies Japan. Date: 2014/10/08. Title: Mellanox for OpenStack. Outline: 1. Mellanox Overview; 2. Mellanox CloudX; 3. CloudX Deep Dive (Network Virtualization Acceleration: VXLAN NIC hardware offload; High Speed Storage Interconnect: iSER (iSCSI over RDMA); SR-IOV / NFV); 7. Cloud Application Performance; 8. Demo; 9. Summary
TRANSCRIPT
Kazusa Tomonaga (友永 和総), Senior System Engineer | Mellanox Technologies Japan KK
October 8, 2014 – OpenStack Latest Information Seminar
Mellanox for OpenStack
Note: This content is subject to change without notice.
© 2014 Mellanox Technologies 2
Agenda
1. Mellanox Overview
2. Mellanox CloudX
3. CloudX Deep Dive
• Network Virtualization Acceleration – VXLAN NIC hardware offload
• High Speed Storage Interconnect – iSER (iSCSI over RDMA)
• SR-IOV / NFV
7. Cloud Application Performance
8. Demo
9. Summary
Mellanox Overview
Mellanox Company Overview
A leading company in high-bandwidth, low-latency server and storage interconnect
• FDR 56Gb/s InfiniBand and 10/40/56 Gigabit Ethernet
• Greatly reduces application data processing time
• Dramatically improves the ROI of data center and service infrastructure
Company profile
• Headquarters: Yokneam, Israel and Sunnyvale, California, USA
• Employees: 1,428 worldwide (as of the end of June 2014)
Solid financial position
• FY2013 revenue: $390.9M
• FY2014 Q2 revenue: $102.6M
• FY2014 Q3 revenue: $114M to $118M
• Cash + investments as of 6/30/14: $343.7M
Ticker: MLNX
Exponential Data Growth – The Best Platforms Are Needed
We Live in a World of Data
More Data More Applications More Devices
Data Needs to be Accessible Always and in Real-Time
Leading Supplier of End-to-End Interconnect Solutions
Software and Services ICs Switches/Gateways Adapter Cards Cables/Modules Metro / WAN
Store Analyze Enabling the Use of Data
At the Speeds of 10, 40 and 100 Gigabit per Second
Comprehensive End-to-End InfiniBand and Ethernet Portfolio
Mellanox Core Technology: High-Performance, Highly Integrated ASICs
VPI (Virtual Protocol Interconnect) technology
• IB and EN on a single chip (ConnectX-3, SwitchX-2)
• IB and EN selectable port by port (ConnectX-3, SwitchX-2)
• IB/EN bridging (SwitchX-2)
High throughput, low latency, ultra low power
RDMA (Remote Direct Memory Access) support for high-speed data transfer
VXLAN/NVGRE offload (ConnectX-3 Pro)
[Figure: ASIC photos. Labels shown on the slide:]
• Adapter: InfiniBand/Ethernet, 2 x 56Gbps; Ethernet mode: 1/10/40/56GbE; typical power 7.9W (2 ports of 40GbE)
• Switch: InfiniBand/Ethernet, 144 network SerDes; 36 x 40/56GbE, 64 x 10GbE, or 48 x 10GbE + 12 x 40/56GbE; Ethernet mode: 1/10/40/56GbE; InfiniBand or Ethernet, InfiniBand + Ethernet, or InfiniBand / Ethernet bridging; power at 100% load: 83W (36 x 40GbE), 63W (64 x 10GbE)
• Other labels: 3.0 x8, 3.0 x16, 17mm, 45mm, 2 x IB FDR (56Gbps)
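Three Key Features of Mellanox Products
1) High throughput
• Adapters and switches: 10/40/56Gbps
2) Low latency
• Switch: 220ns (L2 40GbE), 330ns (L3 40GbE)
• Adapter: ~1us
3) Ultra low power
• Switch: 83W (36 x 40GbE, 100% load)
• Adapter: 7.9W (2 x 40GbE, typical power)
End-to-end solution:
• Adapter ASICs and switch ASICs developed in-house (full-custom ASICs)
• Adapters, switches, and cables developed in-house
• Optical module and cable technology for reaching 100Gbps held in-house (Kotura and IPtronics acquired in 2013)
• Drivers, acceleration software, and management software developed in-house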
Mellanox CloudX
Making the Cloud Easy
The World is Moving to the Cloud
But Building a Cloud is a Challenge
Mellanox Makes it Easy: Deploy, Use and Maintain
Private Cloud Public Cloud
CloudX is the Most Efficient Cloud Platform
The Platform for Creating the Applications of Tomorrow
Lower Your IT Cost by 50%!
CloudX: Optimized Cloud Platform
CloudX is a group of reference architectures for building the most efficient, high-performance, and scalable Infrastructure as a Service (IaaS) clouds, based on Mellanox's superior interconnect and off-the-shelf building blocks
Supports the most popular cloud software
• OpenStack
• Windows Azure Pack (WAP)
• VMware
Mellanox OpenStack Reference Documents
http://www.mellanox.com/openstack/
Designing CloudX Solution using Mirantis Fuel OpenStack Software
http://community.mellanox.com/docs/DOC-1464
HowTo Configure iSER Block Storage for OpenStack Cloud with Mellanox ConnectX-3 Adapters
http://community.mellanox.com/docs/DOC-1462
More documents will be added over time.
Mellanox CloudX
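• Hardware for an end-to-end high-performance network
  • Mellanox network adapters
  • Mellanox network switches
  • Mellanox network cables
  • Hardware support services
• CloudX system deployment professional services
• CloudX system operation and maintenance technical support
• Packages the technology Mellanox has developed worldwide in leading-edge, large-scale cloud systems
• Reference architecture documents are provided free of charge, and a professional services menu is also available
• Enables building world-class cloud systems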
Mellanox CloudX Hardware: Ethernet Switches
• SX1036: The ideal 40GbE ToR/aggregation switch
• SX1024: Non-blocking 10GbE/40GbE ToR
• SX1016: Highest density 10GbE ToR
• SX1012: Ideal storage/database 10/40GbE switch
Highest capacity in 1RU
• 12 to 36 QSFP
• 64 x 10GbE
Value
• VPI 56Gb/s InfiniBand & Ethernet
• End-to-end solution
Latency
• 220ns L2 latency
• 330ns L3 latency
Power (SX1036)
• Under 1W per 10GbE interface
• 0.6W per 10GbE of throughput
Power at 100% load
• SX1036: 83W
• SX1016: 62W
• SX1024: 75W
• SX1012: 50W
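The "0.6W per 10GbE of throughput" figure can be cross-checked against the SX1036 power number with simple arithmetic. The sketch below assumes that one 40GbE port counts as four 10GbE units of throughput; the function name is illustrative and does not come from the slide.

```python
def watts_per_10gbe(total_watts: float, ports: int, port_speed_gbps: float) -> float:
    """Return power per 10 Gb/s of port capacity (assumption: capacity scales linearly)."""
    units_of_10gbe = ports * port_speed_gbps / 10.0
    return total_watts / units_of_10gbe


# SX1036 at 100% load: 83 W across 36 x 40GbE ports (figures from the slide).
sx1036 = watts_per_10gbe(83.0, ports=36, port_speed_gbps=40.0)
print(f"SX1036: {sx1036:.2f} W per 10GbE of throughput")
```

Under that assumption the result stays below the 0.6W figure quoted on the slide.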
Leading throughput: 2.5X better
• 2.88Tb/s throughput on a single chip, running Full Wire Speed at any packet size
Leading L2 unicast/multicast latency for L2/L3 switches: 2X better
• 198-223ns for any packet size
Leading L3 latency: 2X better
• 321-337ns for any packet size
Industry record power efficiency: 6X better than competition
• Sub 0.6Watt per 10GbE throughput with 100% load at Full Wire Speed
Shattering Ethernet Switching Performance Records
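[Chart: L2 latency (ns) versus packet size (bytes), with series L2 Min, L2 Average, and L2 Max on a 0 to 1,000 ns axis. Labeled values by packet size: 64 B: 206; 128 B: 209; 256 B: 220; 512 B: 216; 1,024 B: 219; 1,280 B: 223; 1,518 B: 222; 2,176 B: 222; 9,216 B: 223.]
‘Zero’ Jitter!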
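The labeled latency values on this slide can be summarized numerically to see how flat the curve is across packet sizes. This is only a sketch: the transcript does not say which of the three series (min, average, max) the nine labels belong to, so the variable names are assumptions.

```python
# Packet sizes (bytes) and the labeled L2 latency values (ns) from the slide.
packet_sizes = [64, 128, 256, 512, 1024, 1280, 1518, 2176, 9216]
latency_ns = [206, 209, 220, 216, 219, 223, 222, 222, 223]

# Spread between the smallest and largest labeled value.
spread_ns = max(latency_ns) - min(latency_ns)
mean_ns = sum(latency_ns) / len(latency_ns)

print(f"spread across packet sizes: {spread_ns} ns")
print(f"mean labeled latency: {mean_ns:.1f} ns")
```

The spread is small relative to the mean, and every labeled value falls inside the 198-223ns range quoted for L2 latency.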
Best ROI – Switch Silicon Example
What matters in switch silicon?
• Highest switching capacity
• Lowest power
• Lowest latency
[Chart: Switching Capacity (Bpps): Trident+ 0.95, Trident2 1.44, SwitchX 2.5. x1.7 better]
[Chart: Latency (µsec): Trident+ 1, Trident2 0.5, SwitchX 0.2. x2.5 better]
[Chart: Power (Watt/Gb): Trident+ 3, Trident2 2, SwitchX 0.4. x5 better]
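The three "better" factors can be reproduced from the chart values. The sketch below assumes the baseline is Trident2, since that is the comparison whose ratios match the slide; the dictionary keys and the function name are illustrative.

```python
# Chart values from the slide.
chips = {
    "Trident+": {"capacity_bpps": 0.95, "latency_us": 1.0, "power_w_per_gb": 3.0},
    "Trident2": {"capacity_bpps": 1.44, "latency_us": 0.5, "power_w_per_gb": 2.0},
    "SwitchX": {"capacity_bpps": 2.5, "latency_us": 0.2, "power_w_per_gb": 0.4},
}


def advantage(metric: str, higher_is_better: bool, baseline: str = "Trident2") -> float:
    """Return how many times better SwitchX is than the baseline on one metric."""
    ours = chips["SwitchX"][metric]
    theirs = chips[baseline][metric]
    return ours / theirs if higher_is_better else theirs / ours


capacity_gain = advantage("capacity_bpps", higher_is_better=True)
latency_gain = advantage("latency_us", higher_is_better=False)
power_gain = advantage("power_w_per_gb", higher_is_better=False)
print(f"capacity x{capacity_gain:.1f}, latency x{latency_gain:.1f}, power x{power_gain:.1f}")
```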
Mellanox CloudX Hardware: Ethernet Network Adapters
The Foundation of Cloud 2.0: The World’s First NVGRE / VXLAN Offloaded NIC
• 10/40/56 Gigabit Ethernet support
• World-class performance, also widely used in HPC systems
• Low latency, high throughput, ultra low power
• RDMA (Remote Direct Memory Access) support
  • RoCE (L2 Ethernet RDMA)
  • RoCEv2 (L2/L3 Ethernet RDMA)
• Overlay network offload
  • VXLAN (Linux, VMware*)
  • NVGRE (Windows)
*Available soon
Mellanox CloudX Professional Services
Technical services for building cloud systems of up to 200 nodes with Mellanox CloudX technology
• Mellanox CloudX technical implementation package for up to 200 nodes
• CloudX planning and design, installation, configuration, performance tuning, testing, and knowledge transfer
GPS-0200-ONST-CLOUDX: on-site support
GPS-0200-REMT-CLOUDX: remote support
Note: Contact us for pricing and other details.
Mellanox CloudX Technical Support
Ordering P/N     Description
SUP-CLOUDX-1S    CloudX™ System Support, 1 year Silver support
SUP-CLOUDX-3S    CloudX™ System Support, 3 year Silver support
SUP-CLOUDX-1G    CloudX™ System Support, 1 year Gold support
SUP-CLOUDX-3S    CloudX™ System Support, 3 year Gold support
Comments: Includes operational assistance for CloudX™ and for the components needed to use it:
• Mellanox Neutron Plug-in
• Mellanox eswitchd (Embedded Switch)
• Mellanox iSER for Cinder (Cinder over iSER)
• Plug-ins for other CloudX supported architectures
• Mellanox OFED
Note: Contact us for pricing and other details.
CloudX Deep Dive
EVN: More than SDN - Efficient Clouds Need an Efficient Virtualized Network
Next Generation Software Defined Networks
Efficient Virtualized Network (EVN) Fully Integrated, World Class SDN Solution
EVN: Efficient Virtualized Network
Fully Integrated Solution Combines
RDMA, Convergence, & SDN/Virtualization
CONVERGENCE ACCELERATION VIRTUALIZATION
Accelerate All Three Elements Required for SDN Networks
Software Defined Networks = Virtual Network Management + Overlay Network Tunnels + OpenFlow
1. Centralized software-based control plane
• Enables network virtualization
2. Overlay networks – NVGRE/VXLAN
• Isolation, scalability, simplicity
• Mellanox accelerates overlay networks to offer bare metal speed
3. Industry standard API – OpenFlow
• Enables an industry ecosystem and innovation
Comprehensive OpenStack Integration for Switch and Adapter
• Integrated with major OpenStack distributions
• In-box with Havana & Icehouse
• Neutron-ML2 support for mixed VM environment (VXLAN, PV, SR-IOV)
• Ethernet Neutron: hardware support for security and isolation
• Accelerating storage access by up to 5x
OpenStack Plugins Create Seamless Integration, Control, & Management
Network Virtualization Acceleration
ConnectX-3 Pro Accelerates Overlay Networks
Overlay Network Virtualization: Isolation, Simplicity, Scalability
[Diagram: a physical view showing two servers (VM1 to VM4 and VM5 to VM8) and Mellanox SDN switches & routers, and a virtual view showing Virtual Domains 1 to 3.]
• VXLAN overlay networks: virtual overlay networks simplify management and VM migration
• ConnectX-3 Pro: overlay accelerators enable bare metal performance
• OpenFlow: virtual network management API
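To make the overlay encapsulation concrete, the sketch below builds and parses the 8-byte VXLAN header defined in RFC 7348, which is the framing an offloading NIC has to recognize. It is a generic illustration of the standard header layout, not Mellanox driver or firmware code, and the function names are hypothetical.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348


def build_vxlan_header(vni: int) -> bytes:
    """Pack an 8-byte VXLAN header carrying a 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking that the I flag is set."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


header = build_vxlan_header(5001)
print(header.hex(), parse_vni(header))
```

The 24-bit identifier is what allows many isolated virtual domains to share one physical network.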
VXLAN Performance
CPU Usage Per Gbit/sec with VxLAN (CPU% per Gbit/sec of bandwidth; lower is better)
                      1 VM    2 VMs   3 VMs
VxLAN in software     3.50    3.33    4.29
VxLAN HW offload      0.90    0.89    1.19
Total VM Bandwidth when using VxLAN (Gbit/sec; higher is better)
                      1 VM    2 VMs   3 VMs
VxLAN in software     2       3       3.5
VxLAN HW offload      10      19      21
VXLAN Offload Engine – 5X higher throughput, 75% lower CPU utilization
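The headline numbers can be derived from the chart values on this slide. The sketch below is plain arithmetic on those values; the variable names are illustrative.

```python
vm_counts = [1, 2, 3]

# Total VM bandwidth in Gbit/sec (from the slide).
bandwidth_sw = [2.0, 3.0, 3.5]
bandwidth_hw = [10.0, 19.0, 21.0]

# CPU% consumed per Gbit/sec of bandwidth (from the slide).
cpu_per_gbit_sw = [3.50, 3.33, 4.29]
cpu_per_gbit_hw = [0.90, 0.89, 1.19]

# Throughput gain and relative CPU saving of hardware offload over software.
speedups = [hw / sw for hw, sw in zip(bandwidth_hw, bandwidth_sw)]
cpu_savings = [1.0 - hw / sw for hw, sw in zip(cpu_per_gbit_hw, cpu_per_gbit_sw)]

for vms, speedup, saving in zip(vm_counts, speedups, cpu_savings):
    print(f"{vms} VM(s): {speedup:.1f}x bandwidth, {saving:.0%} less CPU per Gbit/sec")
```

With these inputs the bandwidth gain is at least 5x in every case, and the CPU saving per Gbit/sec falls between 72% and 75%.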
Turbocharge Your OVS with Mellanox ConnectX-3 Pro
“Mellanox ConnectX-3 Pro card is the only way to scale-out PLUMgrid’s Virtual Network Infrastructure (VNI) overlay-based Architecture”
Source: PLUMgrid white paper
High Performance Storage Interconnect
iSER – iSCSI Extension for RDMA
Zero copy using RDMA
IB and Ethernet (RoCE)
Transport protocol implemented in hardware (minimal CPU cycles per IO)
OpenStack integration
Support for T10/DIF
[Chart: IO latency at 4K IO (microseconds), iSCSI/TCP versus iSCSI/RDMA, on a 0 to 10,000 axis]
[Chart: K IOPS at 4K IO size]
• iSCSI (TCP/IP): 130
• 1 x FC 8 Gb port: 200
• 4 x FC 8 Gb port: 800
• iSER 1 x 40GbE/IB port: 1100
• iSER 2 x 40GbE/IB port (+Acceleration): 2300
5-10% the latency under 20x the workload
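The IOPS figures translate into data rates once an I/O size is fixed. The sketch below assumes that "4K" means 4,096 bytes and that one K IOPS is 1,000 operations per second; both are assumptions, as are the names used.

```python
IO_SIZE_BYTES = 4 * 1024  # assumption: "4K IO" means 4,096 bytes

# K IOPS at 4K IO size (from the slide).
kiops = {
    "iSCSI (TCP/IP)": 130,
    "1 x FC 8 Gb port": 200,
    "4 x FC 8 Gb port": 800,
    "iSER 1 x 40GbE/IB port": 1100,
    "iSER 2 x 40GbE/IB port (+Acceleration)": 2300,
}


def throughput_gbytes_per_s(kiops_value: float) -> float:
    """Convert thousands of IOPS into GB/s (10^9 bytes per second)."""
    return kiops_value * 1000 * IO_SIZE_BYTES / 1e9


throughput = {name: throughput_gbytes_per_s(value) for name, value in kiops.items()}
for name, gbytes in throughput.items():
    print(f"{name}: {gbytes:.2f} GB/s")
```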
Mellanox Accelerates Storage: More than 4X Greater Throughput
[Chart: OpenStack Storage Performance* (GBytes/s): iSCSI over TCP 1.3, iSER 5.5]
* iSER patches are available on OpenStack branch: https://github.com/mellanox/openstack
Built-in OpenStack Components/Management & Cinder/iSER to Accelerate Storage Access
SR-IOV / NFV
Mellanox Single Root I/O Virtualization (SR-IOV)
PCIe device presents multiple instances to the OS/Hypervisor
Enables Application Direct Access (ADA)
• Reduces CPU overhead and improves application performance
Eliminates virtualization penalty with RDMA & ADA
• Low latency applications benefit from the Virtual infrastructure
[Diagram labels: VMs (up to VMn) each with a VF device driver; a VM with a virtual NIC; the Physical Function device driver; an adapter exposing one PF and multiple VFs.]
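On a Linux host, the "multiple instances" of an SR-IOV device can be inspected through the generic PCI sysfs attributes `sriov_numvfs` and `sriov_totalvfs`. The sketch below only reads those files; it assumes the kernel exposes them for the adapter, the interface name `eth2` is hypothetical, and how VFs get enabled for a particular Mellanox driver is outside what this example covers.

```python
from pathlib import Path


def read_sriov_counts(pci_device_dir) -> tuple:
    """Return (currently enabled VFs, maximum VFs) for one PCI device directory."""
    device = Path(pci_device_dir)
    num_vfs = int((device / "sriov_numvfs").read_text().strip())
    total_vfs = int((device / "sriov_totalvfs").read_text().strip())
    return num_vfs, total_vfs


def list_virtual_functions(pci_device_dir) -> list:
    """Return the virtfnN entries of the physical function, ordered by VF index."""
    device = Path(pci_device_dir)
    names = [entry.name for entry in device.glob("virtfn*")]
    return sorted(names, key=lambda name: int(name[len("virtfn"):]))


if __name__ == "__main__":
    # Hypothetical interface name; the "device" link points at the PCI device.
    device_dir = Path("/sys/class/net/eth2/device")
    if (device_dir / "sriov_totalvfs").exists():
        print(read_sriov_counts(device_dir), list_virtual_functions(device_dir))
```

Each `virtfnN` entry corresponds to one VF that can be passed through to a VM.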
No Performance Compromise in Virtualized Environment
SR-IOV Accelerates RoCE • Enables native RoCE performance in virtualized environments
SR-IOV Boosts Ethernet Performance
[Chart: RoCE – SR-IOV Throughput. Throughput (Gb/s) on a 10 to 40 axis for 1, 2, 4, 8, and 16 VMs]
[Chart: RoCE – SR-IOV Latency. Latency (us) on a 0 to 3 axis for 1, 2, 4, and 8 VMs, with series for message sizes 2B, 16B, and 32B]
SR-IOV and eSwitch
Provision VM & fabric policy in hardware, through standard APIs
Benefits: isolation, functionality, performance & offload, simpler SDN
[Diagram labels: OS VMs attached para-virtually through tap devices or through SR-IOV to the VM; Embedded Switch; Mellanox Neutron Agent (create/delete, configure policy per VM vNIC); Neutron Plug-Ins; Servers; OpenStack Manager]
[Chart: eSwitch versus OVS comparison, qperf (TCP) latency]
ConnectX-3 Family QoS
Port based ETS
• Max bandwidth per TClass
• Bandwidth reservation per TClass
Per Function rate limiter
• Average bandwidth
• Peak bandwidth
• Maximum burst at peak bandwidth
Packet pacing
• Low jitter packet pacing for work queues
[Diagram: three panels titled Enhanced ETS, Function Rate Limiter, and Packet Pacing. Labels include Priority 0 to Priority 7, TC 0 to TC 7 with flow control, TC Group 0 to TC Group 7 (DWRR), strict priority, Priority 0 and Priority 1 arbiters, RR arbiters, QoS queues, work queues, rate limiters, and rate shapers.]
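The DWRR label in the ETS panel refers to deficit weighted round robin scheduling. The sketch below is a generic textbook version of that algorithm, shown only to illustrate how per-class quanta turn into bandwidth shares; it does not describe the ConnectX-3 hardware implementation, and the queue names and quanta are made up.

```python
from collections import deque


def dwrr_schedule(queues: dict, quanta: dict, rounds: int) -> list:
    """Serve packet queues with deficit weighted round robin.

    queues maps a class name to a deque of packet sizes in bytes.
    quanta maps a class name to the byte credit added on each round.
    Returns the transmit order as (class name, packet size) tuples.
    """
    deficits = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficits[name] = 0  # an idle class does not bank credit
                continue
            deficits[name] += quanta[name]
            # Send head-of-line packets while the credit covers them.
            while queue and queue[0] <= deficits[name]:
                size = queue.popleft()
                deficits[name] -= size
                sent.append((name, size))
            if not queue:
                deficits[name] = 0
    return sent


queues = {"tc0": deque([1500] * 6), "tc1": deque([1500] * 6)}
order = dwrr_schedule(queues, quanta={"tc0": 3000, "tc1": 1500}, rounds=2)
bytes_sent = {name: sum(size for cls, size in order if cls == name) for name in queues}
print(bytes_sent)
```

With a quantum ratio of 2:1, the two classes send bytes in a 2:1 ratio while both stay backlogged.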
Neutron Plug-in
OpenStack integration
High performance 10/40/56Gbps
SR-IOV enabled
OpenFlow enabled eSwitch
OpenStack Neutron Plug-in
PMD for DPDK: VM OS bypass
Multi cores and RSS support
Delivering bare-metal performance
Record-breaking 195Gb/s for guest VM over DPDK
[Diagram labels: OS VMs over a hypervisor; legacy software vSwitches; SR-IOV eSwitch (hardware offload, OpenFlow enabled); a VM running the client's application software on 6WIND or Intel® DPDK (data plane libraries, optimized NIC drivers) on a multicore processor; librte_pmd_mlx4, librte_crypto_nitrox, 6WIND addons, VMware; 10/40/56Gbps]
High-performance packet processing solutions for
• Gateways
• Security appliances
• UTMs
• Virtual appliances
• etc.
Remote HWA as a Service in NFV Cloud Model
[Diagram labels: Fat-tree SDN switch network (40GbE, 56Gbps IB FDR); SX1024 Ethernet switches for Platform 1, Platform 2, through Platform X, each with Nx40Gbps and 40Gbps links; an HWA / signal processing fabric per platform; Ethernet and RDMA / RoCE links. Network functions shown: DPI, BRAS, SGSN, GGSN, PE Router, Firewall, CG-NAT, SBC, STB.]
iSCSI SAN/NAS Storage Architecture in an NFV Cloud Model
iSCSI SAN/NAS Storage over Standard Ethernet Network: Shared Resource
[Diagram labels: Fat-tree SDN switch network (aggregation); ToR Ethernet switches for Rack 1, Rack 2, through Rack n; compute, storage, and SAN/NAS storage in each rack; 10/40/100Gbps and 12x10/40/100Gbps links; RDMA / RoCE links.]
Cloud Application Performance
CloudX Delivers Unbounded Cloud Performance
4X Faster Runtime! Benchmark: TestDFSIO (1TeraByte, 100 files)
2X Higher Performance! Benchmark: 1M Records Workload (4M Operations)
2X faster runtime and 2X higher throughput
2X Faster Runtime! Benchmark: MemCacheD Operations
3X Faster Runtime! Benchmark: Redis Operations
Accelerating Cloud Performance
[Chart: SCSI write example, Linux KVM. Bandwidth (MB/s) at 64 KB I/O size; values 6200, 1200, and 800 for the legend entries iSER 16 VMs Write, 10GbE, and Fibre Channel 8Gb, listed in slide order.]
[Chart: Migration of active VM. Time (s); values 38 and 10 for the legend entries 10GE-A and 40GE-A, listed in slide order.]
[Chart: VM-to-VM latency performance. Latency (us) at 256-byte message size; values 40 and 2 for the legend entries TCP ParaVirtualization and RDMA Direct Access, listed in slide order.]
Panel labels: Storage, Migration, Virtualization. Callouts: 3.5X, 20X, 6X.
Additional legend entries: 10 GbE, Fibre Channel 8Gb, 40 GbE, iSER 40GbE VMs Write.
Demo
OpenStack VXLAN offload Demo – Mellanox ConnectX-3 Pro
OpenStack Cinder iSER (iSCSI over RDMA) Demo
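Summary – Mellanox CloudX makes it possible to build efficient, advanced clouds
• Hardware for an end-to-end high-performance network
  • Mellanox network adapters
  • Mellanox network switches
  • Mellanox network cables
  • Hardware support services
• CloudX system deployment professional services
• CloudX system operation and maintenance technical support
• Packages the technology Mellanox has developed worldwide in leading-edge, large-scale cloud systems
• Reference architecture documents are provided free of charge, and a professional services menu is also available
• Enables building world-class cloud systems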
Thank You