
2016 Storage Developer Conference. © Insert Your Company Name. All Rights Reserved.

Performance Implications of Libiscsi RDMA Support

Roy Shterman, Software Engineer, Mellanox

Sagi Grimberg, Principal Architect, Lightbits Labs

Shlomo Greenberg, PhD, Electrical and Computer Engineering Department, Ben-Gurion University, Israel


Agenda

Introduction to Libiscsi
Introduction to iSER
Libiscsi/iSER implementation
The memory challenge in user-space RDMA
Performance results
Future work


What is Libiscsi?

iSCSI initiator implemented in user space.
High-performance, non-blocking async API.
Mature.
Open-source license (LGPL for the library).
Portable, OS independent.
Fully integrated in QEMU.
Written and maintained by Ronnie Sahlberg.

[https://github.com/sahlberg/Libiscsi]

Why Libiscsi?

Originally developed to provide built-in iSCSI client-side support for KVM/QEMU.

Gives a process private Logical Units (LUNs) without requiring root permissions.

It has since grown iSCSI/SCSI compliance test suites.


iSCSI Extensions for RDMA (iSER)

Specified in IETF RFC 7145.

The transport layer, iSER or iSCSI/TCP, is transparent to the user.


iSER benefits

Zero-copy
CPU offload
Fabric reliability
High IOPS, low latency
Inherits iSCSI management
Fabric/hardware consolidation
InfiniBand and/or Ethernet (RoCE/iWARP)


iSER Read command flow

SCSI Reads
The initiator sends a Protocol Data Unit (PDU) with an encapsulated SCSI read to the target.
The target writes the data into the initiator's buffers with RDMA_WRITE operations.
The target sends a response to the initiator, which completes the SCSI command.


iSER Write command flow

SCSI Writes
The initiator sends a Protocol Data Unit (PDU) with an encapsulated SCSI write to the target (the PDU can also carry inline data to improve latency).
The target reads the data from the initiator's buffers with RDMA_READ operations.
The target sends a response to the initiator, which completes the SCSI command.
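The read and write flows differ mainly in which RDMA operation the target issues. The toy model below is only a sketch of that direction of data movement: memcpy() stands in for the RDMA transfer, the return code stands in for the response PDU, and every name in it (toy_pdu, toy_target, toy_target_handle) is hypothetical rather than part of Libiscsi or of any real target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: memcpy() replaces the RDMA operations that a real
 * adapter would perform. All names here are hypothetical. */

enum toy_op { TOY_READ, TOY_WRITE };

struct toy_pdu {             /* stand-in for the iSCSI command PDU */
    enum toy_op op;
    void *buf;               /* initiator buffer advertised to the target */
    size_t len;
};

struct toy_target {
    uint8_t lun[64];         /* RAM-backed "disk" */
};

/* Target side: move the data, then return a status that stands in for
 * the SCSI response PDU that completes the command. */
static int toy_target_handle(struct toy_target *t, const struct toy_pdu *pdu)
{
    if (pdu->len > sizeof(t->lun))
        return -1;

    if (pdu->op == TOY_READ)
        memcpy(pdu->buf, t->lun, pdu->len);   /* SCSI read:  "RDMA_WRITE" into the initiator */
    else
        memcpy(t->lun, pdu->buf, pdu->len);   /* SCSI write: "RDMA_READ" from the initiator */

    return 0;
}
```

In both directions the initiator only sends the command and waits for the response; the target drives the data transfer, which is what lets the initiator avoid copying.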


Libiscsi iSER implementation

Transparent integration.
User-space networking (kernel bypass).
High performance.
Separation of data and control plane.
Reduce latency by using non-blocking fd polling.
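As a rough illustration of non-blocking fd polling, the sketch below checks a descriptor with a zero timeout so the caller never sleeps. It uses only POSIX calls; the helper names (set_nonblocking, poll_once) are hypothetical, and the descriptor is a placeholder for whatever fd the transport exposes. It is not taken from the Libiscsi sources.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd into non-blocking mode so read()/write() never stall the caller. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Check fd for the requested events without sleeping (timeout 0).
 * Returns the ready events, 0 if nothing is pending, -1 on error. */
static int poll_once(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
    int rc = poll(&pfd, 1, 0);

    if (rc < 0)
        return -1;
    return rc == 0 ? 0 : pfd.revents;
}
```

A caller can spin on poll_once() between other work instead of blocking in the kernel, trading CPU time for lower completion latency.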


Libiscsi stack modification

Layered the stack
Centralized transport-specific code
Added a nice transport abstraction API
Plugged in iSER

typedef struct iscsi_transport {
    int (*connect)(struct iscsi_context *iscsi, union socket_address *sa, int ai_family);
    int (*queue_pdu)(struct iscsi_context *iscsi, struct iscsi_pdu *pdu);
    struct iscsi_pdu* (*new_pdu)(struct iscsi_context *iscsi, size_t size);
    int (*disconnect)(struct iscsi_context *iscsi);
    void (*free_pdu)(struct iscsi_context *iscsi, struct iscsi_pdu *pdu);
    int (*service)(struct iscsi_context *iscsi, int revents);
    int (*get_fd)(struct iscsi_context *iscsi);
    int (*which_events)(struct iscsi_context *iscsi);
} iscsi_transport;
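To show how a table of function pointers like this lets generic code stay transport-agnostic, here is a minimal self-contained sketch. It reuses three member names and signatures from the slide, but the context layout (the drv, fd and queued fields), the mock transport and the send_pdu helper are all assumptions made for illustration, not the actual Libiscsi definitions.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

struct iscsi_context;                 /* defined below, simplified */
struct iscsi_pdu { int itt; };        /* heavily simplified stand-in */

/* Subset of the transport table shown on the slide. */
typedef struct iscsi_transport {
    int (*queue_pdu)(struct iscsi_context *iscsi, struct iscsi_pdu *pdu);
    int (*get_fd)(struct iscsi_context *iscsi);
    int (*which_events)(struct iscsi_context *iscsi);
} iscsi_transport;

struct iscsi_context {
    const iscsi_transport *drv;       /* hypothetical field: active transport */
    int fd;                           /* hypothetical field */
    int queued;                       /* hypothetical field: PDUs waiting to go out */
};

/* Hypothetical mock transport; a TCP or iSER transport would fill the
 * same table with its own functions. */
static int mock_queue_pdu(struct iscsi_context *iscsi, struct iscsi_pdu *pdu)
{
    (void)pdu;
    iscsi->queued++;
    return 0;
}

static int mock_get_fd(struct iscsi_context *iscsi)
{
    return iscsi->fd;
}

static int mock_which_events(struct iscsi_context *iscsi)
{
    /* Ask for writability only while something is queued. */
    return iscsi->queued ? POLLOUT : POLLIN;
}

static const iscsi_transport mock_transport = {
    .queue_pdu    = mock_queue_pdu,
    .get_fd       = mock_get_fd,
    .which_events = mock_which_events,
};

/* Generic layer: never names a concrete transport. */
static int send_pdu(struct iscsi_context *iscsi, struct iscsi_pdu *pdu)
{
    return iscsi->drv->queue_pdu(iscsi, pdu);
}
```

With this shape, plugging in another transport amounts to supplying a different table; callers such as send_pdu() do not change.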


QEMU iSER support

The QEMU iSCSI block driver needed some modifications to support iSER:
Move polling logic to the transport layer.
Pass I/O vectors to the transport stack.

Work in progress; should be available in the next few weeks.

Also through libvirt!


Experiments and results

Performance was measured with Mellanox ConnectX-4 adapters on both initiator and target.

The target side was the TGT user-space iSCSI target with RAM storage devices.

The I/O generator was FIO (Flexible I/O tester).
Each guest ran with a single CPU core and a single FIO process.
Comparison against iSCSI/TCP and against block device pass-through of iSER devices.

[Figure: IOPS vs I/O depth. X axis: I/O depth (0 to 140). Y axis: IOPS (0 to 200,000). Series: iSER Libiscsi, TCP Libiscsi, iSER kernel PT.]

[Figure: Bandwidth vs Block size. X axis: block size in K (0 to 140). Y axis: bandwidth in KB/s (0 to 6,000,000). Series: iSER Libiscsi, TCP Libiscsi, iSER kernel PT.]

[Figure: Latency vs I/O depth. X axis: I/O depth (0 to 140). Y axis: latency in us (0 to 8,000). Series: iSER Libiscsi, TCP Libiscsi, iSER PT.]

[Figure: Latency vs Block Size. X axis: block size (1k to 128k). Y axis: latency in us (0 to 500). Series: iSER Libiscsi, TCP Libiscsi, iSER kernel PT.]

[Figure: Bandwidth across multiple VMs. X axis: number of VMs (1 to 4). Y axis: bandwidth in KB/s (0 to 3,000,000). Series: iSER Libiscsi, TCP Libiscsi, iSER kernel PT.]

[Figure: IOPS across multiple VMs. X axis: number of VMs (1 to 4). Y axis: IOPS (0 to 800,000). Series: iSER Libiscsi, TCP Libiscsi, iSER kernel PT.]


RDMA Memory registration

To allow remote access, the application needs to map (register) the buffer with remote access permissions.

The mapping operation is slow and not suitable for the data plane.

Applications usually preregister all the buffers intended for networking and RDMA.


Memory registration in Mid-layers

Mid-layers often don't own the buffers but rather receive them from the application.
Examples: OpenMPI, SHMEM and Libiscsi/iSER.
Memory registration for each data transfer is not acceptable.


Possible solutions

1) Pre-register the entire application space.
2) Modify applications to use Mid-layer buffers.
3) “Pin-down” Cache: Register and cache mappings on the fly.
4) Page-able RDMA (ODP): Let the device and the kernel handle IO page-faults.
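Option 3 can be sketched as a small lookup table in front of the slow registration call. The code below is an illustrative assumption, not the Libiscsi implementation: slow_register() is a hypothetical stand-in for the real registration call, the key is just a counter-based number, and the fixed-size table with round-robin eviction is chosen only for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 8

struct reg_entry {
    uintptr_t start;
    size_t len;
    uint32_t key;            /* stand-in for an RDMA memory key */
    int valid;
};

struct reg_cache {
    struct reg_entry slot[CACHE_SLOTS];
    unsigned next;           /* round-robin eviction cursor */
    unsigned slow_calls;     /* how many slow-path registrations happened */
};

/* Hypothetical stand-in for the expensive registration call. */
static uint32_t slow_register(struct reg_cache *c, uintptr_t start, size_t len)
{
    (void)start;
    (void)len;
    return 0x1000u + c->slow_calls++;
}

/* Return a key covering [buf, buf + len), registering only on a miss. */
static uint32_t cache_get_key(struct reg_cache *c, const void *buf, size_t len)
{
    uintptr_t start = (uintptr_t)buf;

    for (unsigned i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (e->valid && start >= e->start && start + len <= e->start + e->len)
            return e->key;   /* hit: fast path, no registration */
    }

    /* Miss: overwrite the next slot. Real code would deregister the
     * evicted mapping before reusing the slot. */
    struct reg_entry *victim = &c->slot[c->next];
    c->next = (c->next + 1) % CACHE_SLOTS;
    victim->start = start;
    victim->len = len;
    victim->key = slow_register(c, start, len);
    victim->valid = 1;
    return victim->key;
}
```

The sketch omits the hard part of a pin-down cache: entries must be invalidated when the application frees or remaps the underlying memory, otherwise a cached mapping can point at stale pages.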


RDMA paging - ODP

RDMA devices can support I/O page faults.
An application can register a “huge” virtual memory region (even the entire memory space).
Hardware and kernel handle page faults and page invalidations.
If locality is good enough, the performance penalty is amortized.
Not bounded by physical memory.


iSER with ODP and memory windows

iSER can leverage ODP for a more efficient data path.

But it cannot map non-I/O-related memory for remote access.
Solution: open a memory window on a pageable memory region (a fast operation that can be used in the data path).

ODP support for memory windows is in the works.
Initial experiments with ODP look promising.


Future Work

Leveraging RDMA paging support to reduce the memory footprint.
Plenty of room for performance optimizations.
Stability improvements.
Libiscsi iSER unit tests.


Acknowledgments

This project was conducted under the supervision and guidance of Dr. Shlomo Greenberg, Ben-Gurion University.

Special thanks to Ronnie Sahlberg, creator and maintainer of the Libiscsi library, for his support.
