
Page 1

Michael Kagan, CTO

HPC Advisory Council Stanford, 2014

The Future of Interconnect Technology

Page 3

The Power of Data

Data-Intensive Simulations, Internet of Things, National Security, Healthcare, Smart Cars, Congestion-Free Traffic, Business Intelligence

Page 4

Data Must Always Be Accessible, in Real Time

Compute, Storage, Archive, Sensor Data

Smart Interconnect Required to Unleash the Power of Data

Lower Latency, Higher Bandwidth, RDMA, Offloads, NIC/Switch Routing, Overlay Networks

Page 5

InfiniBand’s Unsurpassed System Efficiency

TOP500 systems listed according to their efficiency

InfiniBand is the key element responsible for the highest system efficiency

Mellanox delivers efficiencies of up to 96% with InfiniBand

Page 6

FDR InfiniBand Delivers Highest Return on Investment

[Application performance charts; higher is better]

Source: HPC Advisory Council

Page 7

Real-time fraud detection: 13 million financial transactions per day, 4 billion database inserts

235 supermarkets, 8 states, USA: reacting to customers' needs in real time; data queries reduced from 20 minutes to 20 seconds; accuracy, detail, fast response

Microsoft Bing Maps: 10X higher performance, 50% CAPEX reduction

Tier-1 Fortune 100 company, Web 2.0 application: 97% reduction in database recovery time, from 7 days to 4 hours!

Business success depends on fast interconnect

Page 8

Helping to Make the World a Better Place

SANGER

• Sequence Analysis and Genomics Research

• Genomic Analysis for pediatric cancer patients

Challenge: an individual patient's RNA analysis took 7 days
Goal: reduce it to 5 days
InfiniBand reduced the RNA-sequence data analysis time per patient to only 1 hour!

Fast interconnect for fighting pediatric cancer

Page 9

Mellanox InfiniBand Paves the Road to Exascale Computing

Accelerating half of the world's Petascale systems (Mellanox Connected Petascale system examples)

Page 10

20K InfiniBand nodes

Mellanox end-to-end FDR and QDR InfiniBand

Supports a variety of scientific and engineering projects:
• Coupled atmosphere-ocean models

• Future space vehicle design

• Large-scale dark matter halos and galaxy evolution

NASA Ames Research Center Pleiades

Asian Monsoon Water Cycle

High-Resolution Climate Simulations

Page 11

InfiniBand Enables Lowest Application Cost in the Cloud (Examples)

Microsoft Windows Azure: 90.2% cloud efficiency, 33% lower cost per application

Cloud: application performance improved up to 10X; 3x increase in VMs per physical server; consolidation of network and storage I/O; 32% lower cost per application; 694% higher network performance

Page 13

Technology Roadmap

[Roadmap chart spanning 2000-2020: link speeds of 10Gb/s, 20Gb/s, 40Gb/s, 56Gb/s, 100Gb/s, and 200Gb/s; system scale moving from Terascale through Petascale toward Exascale "mega supercomputers"; milestones include "Roadrunner" (Mellanox Connected, TOP500 #1) and Virginia Tech (Apple), TOP500 #3 in 2003]

Page 14

Architectural Foundation for Exascale Computing

Connect-IB

Page 15

Mellanox Connect-IB: The World's Fastest Adapter

The 7th generation of Mellanox interconnect adapters

World’s first 100Gb/s interconnect adapter (dual-port FDR 56Gb/s InfiniBand)

Delivers 137 million messages per second – 4X higher than the competition

Supports the new innovative InfiniBand scalable transport – Dynamically Connected
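
Message rates like these come from posting work requests to the adapter directly from user space and letting the hardware do the transfer. As a minimal, hedged sketch (not taken from the slides), the function below posts a one-sided RDMA write with the standard libibverbs API; it assumes a reliable-connected queue pair, a completion queue, and an out-of-band exchange of the peer's buffer address and rkey already exist, and the function name and parameters are illustrative only.

```c
/* rdma_write_sketch.c - post a one-sided RDMA WRITE with libibverbs.
 * Sketch only: assumes qp/cq are already created and connected, and that
 * remote_addr/rkey were exchanged out of band. Error handling trimmed.
 */
#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

int rdma_write(struct ibv_pd *pd, struct ibv_qp *qp, struct ibv_cq *cq,
               void *buf, size_t len, uint64_t remote_addr, uint32_t rkey)
{
    /* Register the local buffer so the HCA can DMA from it. */
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len, IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uint64_t)(uintptr_t)buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };

    struct ibv_send_wr wr, *bad_wr = NULL;
    memset(&wr, 0, sizeof(wr));
    wr.wr_id               = 1;
    wr.sg_list             = &sge;
    wr.num_sge             = 1;
    wr.opcode              = IBV_WR_RDMA_WRITE;  /* one-sided: no remote CPU involvement */
    wr.send_flags          = IBV_SEND_SIGNALED;
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    if (ibv_post_send(qp, &wr, &bad_wr)) {
        ibv_dereg_mr(mr);
        return -1;
    }

    /* Busy-poll the completion queue for the work completion. */
    struct ibv_wc wc;
    int n;
    do {
        n = ibv_poll_cq(cq, 1, &wc);
    } while (n == 0);

    int ok = (n > 0 && wc.status == IBV_WC_SUCCESS) ? 0 : -1;
    ibv_dereg_mr(mr);
    return ok;
}
```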

Page 16

Connect-IB Provides Highest Interconnect Throughput

Source: Prof. DK Panda

[Charts comparing Connect-IB FDR (dual port), ConnectX-3 FDR, ConnectX-2 QDR, and a competing InfiniBand adapter; higher is better]

Gain Your Performance Leadership With Connect-IB Adapters

Page 17

Connect-IB Delivers Highest Application Performance

200% Higher Performance Versus Competition, with Only 32 Nodes

Performance Gap Increases with Cluster Size

Page 18

Solutions for MPI/SHMEM/PGAS

Fabric Collective Accelerations

Page 19

• Collective algorithms are not topology aware and can be inefficient
• Congestion due to many-to-many communications
• Slow nodes and OS jitter affect scalability and increase variability

Collective Operation Challenges at Large Scale

[Chart: ideal vs. actual collective scaling]

Page 20

CORE-Direct

• US Department of Energy (DOE) funded project – ORNL and Mellanox

• Adapter-based hardware offloading for collective operations

• Includes floating-point capability on the adapter for data reductions

• CORE-Direct API is exposed through the Mellanox drivers

FCA

• FCA is a software plug-in package that integrates into available MPIs

• Provides scalable topology aware collective operations

• Utilizes powerful InfiniBand multicast and QoS capabilities

• Integrates CORE-Direct collective hardware offloads

Mellanox Collectives Acceleration Components
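
For orientation, the collectives these components accelerate are the standard MPI ones; the application does not change. Below is a minimal sketch of a plain MPI_Allreduce, the kind of call that the FCA plug-in or CORE-Direct offload can accelerate transparently when enabled in the MPI library (the build command in the comment is illustrative, not from the slides).

```c
/* allreduce_example.c - a standard MPI collective; topology-aware and
 * hardware-offloaded collective layers accelerate this call transparently.
 * Illustrative build: mpicc allreduce_example.c -o allreduce_example
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank contributes its rank number; the reduction sums them all. */
    double local = (double)rank;
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks 0..%d = %.0f\n", size - 1, global);

    MPI_Finalize();
    return 0;
}
```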

Page 21

Minimizing the impact of system noise on applications – critical for scalability

The Effects of System Noise on Applications Performance

[Chart: ideal vs. system noise vs. CORE-Direct (offload)]

Page 22

Provides support for overlapping computation and communication

CORE-Direct Enables Computation and Communication Overlap

[Chart: synchronous execution vs. CORE-Direct asynchronous execution]

Page 23

Nonblocking Alltoall (Overlap-Wait) Benchmark

CORE-Direct offload allows the Alltoall benchmark to run with almost 100% compute, i.e. the communication is almost fully overlapped.
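
A minimal sketch of the overlap pattern this benchmark measures, using the standard MPI-3 nonblocking collective. The buffer size and the busy-work loop standing in for application compute are illustrative assumptions, not details from the slides.

```c
/* ialltoall_overlap.c - overlap an MPI_Ialltoall with local computation.
 * With adapter offloads such as CORE-Direct, the collective can progress
 * in hardware while the CPU stays in the compute loop.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int count = 1024;   /* elements sent to each destination rank (illustrative) */
    double *sendbuf = malloc((size_t)size * count * sizeof(double));
    double *recvbuf = malloc((size_t)size * count * sizeof(double));
    for (int i = 0; i < size * count; i++)
        sendbuf[i] = rank + 0.001 * i;

    /* Start the collective, then compute while it progresses. */
    MPI_Request req;
    MPI_Ialltoall(sendbuf, count, MPI_DOUBLE,
                  recvbuf, count, MPI_DOUBLE, MPI_COMM_WORLD, &req);

    double acc = 0.0;
    for (long i = 0; i < 50L * 1000 * 1000; i++)   /* stand-in compute kernel */
        acc += 1.0 / (double)(i + 1);

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* collective is complete after this */

    if (rank == 0)
        printf("overlap section done, acc = %g\n", acc);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```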

Page 24

Accelerator and GPU Offloads

Page 25

GPUDirect 1.0

[Diagrams, transmit and receive paths: without GPUDirect, data moves between GPU memory and the InfiniBand adapter through system memory in two steps (1, 2); with GPUDirect 1.0, a single shared pinned buffer in system memory is used (1)]

Page 26

GPUDirect RDMA

[Diagrams, transmit and receive paths: with GPUDirect 1.0 the transfer still passes through system memory; with GPUDirect RDMA the InfiniBand adapter reads and writes GPU memory directly, bypassing system memory and the CPU]
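
As an illustration of how applications typically consume GPUDirect RDMA, here is a hedged sketch of a GPU-to-GPU exchange through a CUDA-aware MPI (for example MVAPICH2-GDR): device pointers are handed straight to MPI, and when GPUDirect RDMA is enabled the adapter can move the data without staging it in host memory. The library choice and buffer sizes are assumptions, not taken from the slides.

```c
/* gpu_pingpong.c - pass CUDA device pointers directly to MPI.
 * Assumes a CUDA-aware MPI library; with GPUDirect RDMA enabled, the HCA
 * can read/write GPU memory without host staging. Run with two ranks.
 */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const size_t n = 1 << 20;   /* 1M floats, illustrative size */
    float *dbuf = NULL;
    cudaMalloc((void **)&dbuf, n * sizeof(float));
    cudaMemset(dbuf, 0, n * sizeof(float));

    if (rank == 0) {
        MPI_Send(dbuf, (int)n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(dbuf, (int)n, MPI_FLOAT, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("GPU-to-GPU ping-pong complete\n");
    } else if (rank == 1) {
        MPI_Recv(dbuf, (int)n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Send(dbuf, (int)n, MPI_FLOAT, 0, 1, MPI_COMM_WORLD);
    }

    cudaFree(dbuf);
    MPI_Finalize();
    return 0;
}
```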

Page 27

Performance of MVAPICH2 with GPUDirect RDMA (Source: Prof. DK Panda)

[Charts, message sizes 1 byte to 4K:]
• GPU-GPU internode MPI latency: 67% lower latency, down to 5.49 usec (lower is better)
• GPU-GPU internode MPI bandwidth: 5X increase in throughput (higher is better)

Page 28

Performance of MVAPICH2 with GPU-Direct-RDMA (Source: Prof. DK Panda)

[Chart: execution time of the HSG (Heisenberg Spin Glass) application on 2 GPU nodes versus problem size]

Page 29

Remote GPU Access through rCUDA

GPU servers provide GPU as a Service.

[Diagram: on the client side, the application calls the rCUDA library, which forwards CUDA calls over the network interface; on the server side, the rCUDA daemon receives them and drives the local CUDA driver and runtime in front of the physical GPUs]

rCUDA provides remote access from every node to any GPU in the system.

[Diagram: CPUs with virtual GPUs mapped onto a shared pool of physical GPUs]
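
To make the model concrete, the sketch below is ordinary CUDA runtime host code; under rCUDA the application is linked against the rCUDA client library instead of the stock CUDA runtime, so the same calls are serviced by GPUs on remote servers. The program itself is illustrative and not taken from the slides.

```c
/* remote_gpu_probe.c - ordinary CUDA runtime calls; with rCUDA, the client
 * library forwards these calls over the interconnect to remote GPU servers,
 * so the application sees remote GPUs as if they were local.
 */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int ndev = 0;
    if (cudaGetDeviceCount(&ndev) != cudaSuccess || ndev == 0) {
        fprintf(stderr, "no CUDA devices visible\n");
        return 1;
    }
    printf("%d GPU(s) visible (local or remote, depending on the runtime)\n", ndev);

    /* Allocate and clear memory on device 0; under rCUDA this memory
     * lives on the remote GPU server. */
    cudaSetDevice(0);
    float *dbuf = NULL;
    cudaMalloc((void **)&dbuf, 1 << 20);
    cudaMemset(dbuf, 0, 1 << 20);
    cudaFree(dbuf);

    return 0;
}
```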

Page 30

rCUDA Performance Comparison

Page 31

Other Developments

Page 32

RDMA Accelerates OpenStack Storage

RDMA Accelerates iSCSI Storage

[Diagram: compute servers run VMs on a KVM hypervisor, with Open-iSCSI and iSER on the adapter; a switching fabric connects them to storage servers running an iSCSI/iSER target (tgt) over local disks with an RDMA cache, managed through OpenStack (Cinder)]

Utilizes OpenStack built-in components and management – Open-iSCSI, tgt target, Cinder – to accelerate storage access.

Cinder / volume storage performance*: 1.3 GBytes/s with iSCSI over TCP vs. 5.5 GBytes/s with iSER

* iSER patches are available on the OpenStack branch: https://github.com/mellanox/openstack

Page 33

Next Generation Enterprises: The Generation of Open Ethernet

[Diagram: a CLOSED ETHERNET switch is a locked-down vertical solution with proprietary software and proprietary management; an OPEN ETHERNET switch is an open platform running software of choice and management of choice]

Open Ethernet: freedom to choose and create any software, any management. Enables vendor / user differentiation, no limitations.

Page 34

Open Ethernet Solutions – The Freedom to Choose

Software options: open source software, 3rd-party software, switch vendor software, or home-grown software

Page 35

Futures

Page 36

Thank You