
Secure Data Reservoir: 70 Gbps fully encrypted data transfer facility

Junichiro Shitami

Goki Honjo

Kei Hiraki

Mary Inaba

The University of Tokyo


Agenda

• Background

• Our Goal

• Preliminary Evaluations

• Design of Secure Data Reservoir

• Evaluation on Real Network

• Conclusion

• Future Work


Background

• High-performance data transfer is essential
  • We have been working to efficiently utilize high-speed networks

• Data Reservoir Project
  • Started on a 1 Gbps Tokyo – US network
  • Achieved the Internet Land Speed Record at that time

• We can get sufficient performance
  • for memory transfer, even on a 100G network
  • also for unencrypted storage transfer

[Chart: Bandwidth Distance Product (Tbit meter / second), 2000–2018]

Change of target application

• The target application has changed
  • Existing: Physics, Astronomy, Graphics, Videos
  • Recent: Bio-science, Medical science

• We need fully encrypted transfer
  • for a very high-speed cell sorter called “Serendipiter”
  • Bio-science is regulated by law
  • The traditionally used “scp” is very slow

Serendipiter

Our Goal

• Design a data transfer facility optimal for bio- and medical science
  • to enable fully encrypted storage transfer
  • using an ordinary pair of Linux servers
  • utilizing all available bandwidth on an intercontinental network
  • named “Secure Data Reservoir”

• Verify the performance
  • on a high-speed, long-distance network
  • on a real network
  • with the pacing technique


Preliminary Evaluations

• We made several preliminary evaluations
  • to better inform the design of Secure Data Reservoir

• Evaluations
  • Network performance
  • RAID performance
  • Storage performance
  • Encryption performance


Network Performance

          Throughput   CPU load
Send      99 Gbps      75 %
Receive   99 Gbps      90 %

• We plan to use a Chelsio 100 Gbps NIC (T62100-LP-CR)

• Confirm basic TCP send/receive performance
  • on a back-to-back network
  • using iperf3

• Result
  • Wire-rate performance
  • Receiver CPU load is higher than the sender's
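The iperf3 measurement above can be mimicked in miniature as a memory-to-memory TCP throughput test over the loopback interface: a receiver thread drains what a sender pushes, and the achieved rate is computed from bytes over wall time. A minimal sketch with illustrative sizes, not the talk's actual iperf3 settings:

```python
# Minimal memory-to-memory TCP throughput test over loopback, in the spirit
# of iperf3. Transfer size and buffer size are illustrative choices.
import socket
import threading
import time

CHUNK = 1 << 20          # 1 MiB send/recv buffer

def measure_loopback_throughput(total: int = 64 << 20) -> float:
    """Send `total` bytes over a loopback TCP connection; return Gbps."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))        # let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}

    def receiver() -> None:
        conn, _ = server.accept()
        received = 0
        while True:
            data = conn.recv(CHUNK)
            if not data:                 # sender closed: EOF
                break
            received += len(data)
        conn.close()
        result["bytes"] = received

    t = threading.Thread(target=receiver)
    t.start()
    sender = socket.create_connection(("127.0.0.1", port))
    payload = b"\x00" * CHUNK
    start = time.perf_counter()
    sent = 0
    while sent < total:
        sender.sendall(payload)
        sent += CHUNK
    sender.close()                       # signals EOF to the receiver
    t.join()                             # wait until every byte is drained
    elapsed = time.perf_counter() - start
    server.close()
    return result["bytes"] * 8 / elapsed / 1e9
```

Loopback numbers say nothing about the WAN, but the same bytes-over-time accounting underlies the 99 Gbps figures in the table.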


RAID Performance

                      Read        CPU load   Write       CPU load
Linux software RAID   6729 MB/s   28.2 %     2059 MB/s   15.9 %
VROC RAID             6751 MB/s   28.3 %     2087 MB/s   16.2 %

• We plan to use Intel VROC RAID

• Compare Intel VROC RAID and Linux software RAID
  • on the same SSDs
  • using the fio benchmark

• Result
  • Almost no difference
  • We can use any SSDs regardless of VROC support

[Diagram: Intel CPU VROC RAID vs. Linux software RAID]

Storage Performance

         Throughput   CPU load
Read     180 Gbps     100 %
Write    95 Gbps      100 %

• We plan to use the Samsung SSD 960 PRO (NVMe)

• Confirm basic read/write performance
  • using 8 SSDs per server (Linux software RAID)
  • using the fio benchmark

• Result
  • Sufficient performance for data transfer on a 100 Gbps network
  • Write CPU load is much higher than read

[Diagram: Linux software RAID over 8 SSDs (4 × 2) on ASUS HYPER M.2 X16 cards]
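A toy version of fio's sequential-throughput measurement: time one sequential write pass and one read pass over a temporary file. Sizes are illustrative; unlike fio there is no direct I/O or queue depth here, so the read pass will mostly be served from the page cache.

```python
# Time a sequential write and read of one file, reporting MB/s for each,
# in the spirit of fio's sequential jobs. Sizes are illustrative.
import os
import tempfile
import time

BLOCK = 1 << 20          # 1 MiB blocks
FILE_SIZE = 64 << 20     # 64 MiB test file

def sequential_throughput() -> tuple:
    """Return (write_MBps, read_MBps) for one sequential pass."""
    data = b"\x00" * BLOCK
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
    try:
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(FILE_SIZE // BLOCK):
                f.write(data)
            f.flush()
            os.fsync(f.fileno())         # count time to reach the device
        write_mbps = FILE_SIZE / (time.perf_counter() - start) / 1e6

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(BLOCK):
                pass
        read_mbps = FILE_SIZE / (time.perf_counter() - start) / 1e6
        return write_mbps, read_mbps
    finally:
        os.unlink(path)
```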


Encryption Performance

                         64 bytes    1024 bytes   8192 bytes   CPU load
Intel AES-NI             29.0 GB/s   80.4 GB/s    98.8 GB/s    100 %
Chelsio Crypto Offload   28.6 GB/s   81.2 GB/s    98.9 GB/s    100 %

• We plan to use the Crypto Offload function of the Chelsio NIC

• Compare Chelsio Crypto Offload and Intel AES-NI (CPU)
  • using the openssl benchmark

• Result
  • Almost no difference
  • Chelsio Crypto Offload does not reduce CPU load

[Diagram: Intel CPU AES-NI encryption vs. Chelsio Crypto Offload]
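"openssl speed" reports throughput at several block sizes, as in the table above; the measurement pattern itself can be sketched briefly. SHA-256 from hashlib is a stand-in transform (the standard library has no AES), so this illustrates the benchmark shape, not AES-NI performance.

```python
# Measure throughput of a byte transform at several block sizes, the way
# "openssl speed" does. hashlib's SHA-256 stands in for the cipher.
import hashlib
import time

def throughput_by_block_size(block_sizes=(64, 1024, 8192),
                             duration: float = 0.2) -> dict:
    """Return {block_size: MB/s} for repeatedly transforming one block."""
    results = {}
    for size in block_sizes:
        block = b"\x00" * size
        processed = 0
        start = time.perf_counter()
        while time.perf_counter() - start < duration:
            hashlib.sha256(block).digest()
            processed += size
        results[size] = processed / (time.perf_counter() - start) / 1e6
    return results
```

Larger blocks amortize per-call overhead, which is why the 8192-byte column in the table is the fastest for both AES-NI and the NIC offload.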


Design of Secure Data Reservoir

• Design components
  • Chelsio 100G NIC
  • Samsung SSD 960 PRO (8 SSDs per server)
  • Linux software RAID
  • Intel AES-NI encryption

• Features
  • AES-256 encryption
  • Dedicated storage threads and network threads
  • Pacing technique to stabilize TCP streams
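The dedicated storage-thread/network-thread design can be sketched as a two-stage pipeline: a storage thread produces blocks into a bounded queue, and a network thread encrypts and sends them. This is a minimal illustration, not the talk's implementation: the XOR "cipher" and the in-memory sink stand in for AES-256 and the 100G NIC, and all names here are ours.

```python
# Two-stage pipeline with dedicated threads: storage thread -> bounded
# queue -> network thread (encrypt, then "send"). XOR and a list stand in
# for AES-256 and the NIC.
import queue
import threading

SENTINEL = None          # marks end of data

def transfer(blocks, key: int = 0x5A):
    """Run storage and network threads; return the 'sent' ciphertext."""
    q = queue.Queue(maxsize=8)           # bounded queue gives backpressure
    sink = []

    def storage_thread() -> None:        # producer: read from storage
        for block in blocks:
            q.put(block)
        q.put(SENTINEL)

    def network_thread() -> None:        # consumer: encrypt, then send
        while True:
            block = q.get()
            if block is SENTINEL:
                break
            sink.append(bytes(b ^ key for b in block))

    t1 = threading.Thread(target=storage_thread)
    t2 = threading.Thread(target=network_thread)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return sink
```

The bounded queue decouples disk speed from network speed while capping the memory held in flight, which is the point of separating the two thread roles.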


System Configuration

[Diagram: two Linux servers exchanging storage data over the network; each has a CPU, 8 × M.2 NVMe SSDs, a storage thread, a network thread, encryption, and a 100G NIC]

CPU       Intel Xeon Scalable Gold 6144 (8 cores, 3.5 GHz) × 2
Memory    DDR4-2666, 192 GB
Network   Chelsio T62100-LP-CR (100 Gbps NIC)
Storage   Samsung SSD 960 PRO (512 GB) × 8, ASUS HYPER M.2 X16
OS        Linux 4.9.88


Evaluation on Real Network

Route                  RTT      Folded RTT
Dallas – Seattle       41 ms    82 ms
Dallas – Tokyo         137 ms   274 ms
Dallas – Singapore     222 ms   444 ms
Dallas – Los Angeles   403 ms   806 ms

[Map: sender and receiver at the SC18 venue (Dallas), with routes to Seattle, Los Angeles, Tokyo, and Singapore via TransPAC, JGN, and SingAREN]


Experimental Setup

• 2 servers (sender and receiver) placed at the SC18 venue
  • using a folded network

• Experiments on 3 routes
  • Dallas – Tokyo
  • Dallas – Singapore
  • Dallas – Los Angeles

• Encrypted memory and storage transfer

• Sender-side traffic pacing technique

[Diagram: folded network. Sender and receiver are co-located at the venue; data travels out on VLAN A and the ACKs return on VLAN B, so the folded RTT is twice the route RTT]
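The sender-side pacing mentioned above caps each stream's transmission rate so that parallel streams share the path evenly instead of bursting. On Linux a per-socket cap can be set with the SO_MAX_PACING_RATE option (enforced by the fq qdisc, or by TCP-internal pacing on kernels >= 4.13); whether the talk's software uses this exact mechanism is an assumption on our part, and the numbers are illustrative.

```python
# Evenly split a link's capacity across N streams and cap each sending
# socket's rate with SO_MAX_PACING_RATE (Linux only). Whether the talk's
# implementation uses this mechanism is an assumption.
import socket

SO_MAX_PACING_RATE = 47          # Linux socket-option number

def per_stream_rate(link_gbps: float, n_streams: int) -> int:
    """Split link capacity evenly; return bytes per second per stream."""
    return int(link_gbps * 1e9 / 8 / n_streams)

def apply_pacing(sock: socket.socket, rate_bytes_per_sec: int) -> None:
    """Cap the socket's transmit rate (Linux only)."""
    sock.setsockopt(socket.SOL_SOCKET, SO_MAX_PACING_RATE,
                    rate_bytes_per_sec)
```

For example, per_stream_rate(100, 8) is 1,562,500,000 bytes/s, i.e. 12.5 Gbps per stream on a 100 Gbps link shared by 8 streams.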

Result: Dallas – Tokyo

[Map: route from the SC18 venue (Dallas) to Tokyo; folded RTT is 274 ms]


Result: Dallas – Tokyo (RTT 274 ms)
Encrypted Memory to Memory Transfer

• ~5 seconds to peak throughput

• ~15 Gbps

• Throughput oscillates due to the TCP window size limitation

[Chart: throughput (Gbps) vs. time (sec)]

Result: Dallas – Tokyo (RTT 274 ms)
Encrypted Storage to Storage Transfer

• ~11 Gbps

• Throughput is limited by storage performance

[Chart: throughput (Gbps) vs. time (sec)]

Result: Dallas – Tokyo (RTT 274 ms)
Encrypted Storage to Storage Transfer ×8

• ~74 Gbps

• Throughput is not stable

• Difference between streams is large

[Chart: per-stream (1–8) and total throughput (Gbps) vs. time (sec), without pacing]

Result: Dallas – Tokyo (RTT 274 ms)
Encrypted Storage to Storage Transfer ×8

• ~76 Gbps

• The pacing technique improves stability

• Difference between streams decreases

[Chart: per-stream (1–8) and total throughput (Gbps) vs. time (sec), with pacing]

Result: Dallas – Singapore

[Map: route from the SC18 venue (Dallas) to Singapore; folded RTT is 444 ms]

Result: Dallas – Singapore (RTT 444 ms)
Encrypted Storage to Storage Transfer ×8

• ~70 Gbps

• Throughput is not stable

• Throughput degradation due to retransmission

[Chart: per-stream (1–8) and total throughput (Gbps) vs. time (sec), without pacing]

Result: Dallas – Singapore (RTT 444 ms)
Encrypted Storage to Storage Transfer ×8

• ~74 Gbps

• The pacing technique improves stability

• Difference between streams decreases

[Chart: per-stream (1–8) and total throughput (Gbps) vs. time (sec), with pacing]

Result: Dallas – Los Angeles

[Map: route from the SC18 venue (Dallas) to Los Angeles; folded RTT is 806 ms]

Result: Dallas – Los Angeles (RTT 806 ms)
Encrypted Storage to Storage Transfer ×8

• ~40 Gbps

• Throughput oscillates due to the TCP window size limitation

[Chart: per-stream (1–8) and total throughput (Gbps) vs. time (sec)]

TCP window problem

• The standard TCP window size is limited to 1 GB
  • too small for a very long-distance, high-speed network
  • like the Dallas – Los Angeles route

• We developed the LFTCP protocol
  • it overcomes the TCP window size limitation
  • out of the scope of this presentation

[Chart: throughput (Gbps) vs. RTT (ms), 0–1000 ms]
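The window limit can be checked against the basic ceiling: per-stream TCP throughput is bounded by window / RTT. A quick calculation with the largest window standard TCP can signal (1 GB, i.e. 2^30 bytes with window scaling) on the folded RTTs used in the experiments; this bounds one stream only, and loss, storage, and CPU impose further limits on top of it.

```python
# Per-stream TCP throughput ceiling window / RTT, evaluated with the
# largest standard window (2**30 bytes) on the experiment's folded RTTs.
MAX_WINDOW = 2**30               # bytes

def window_limited_gbps(rtt_ms: float, window: int = MAX_WINDOW) -> float:
    """Throughput ceiling in Gbps for one stream at the given RTT."""
    return window * 8 / (rtt_ms / 1000) / 1e9

for route, rtt in [("Dallas - Tokyo", 274),
                   ("Dallas - Singapore", 444),
                   ("Dallas - Los Angeles", 806)]:
    print(f"{route}: {window_limited_gbps(rtt):.1f} Gbps per stream")
```

At 806 ms the ceiling is roughly 10.7 Gbps per stream, about a third of what the same stream could reach at 274 ms (about 31 Gbps), which is consistent with the Dallas – Los Angeles run falling behind the other routes.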


Result Summary
Total of 8 streams, encrypted storage transfer

Route                                 RTT      Peak Throughput   Average Throughput
Dallas – Tokyo (without pacing)       274 ms   90.5 Gbps         74.3 Gbps
Dallas – Tokyo (with pacing)          274 ms   83.6 Gbps         76.3 Gbps
Dallas – Singapore (without pacing)   444 ms   80.6 Gbps         70.2 Gbps
Dallas – Singapore (with pacing)      444 ms   79.6 Gbps         74.1 Gbps
Dallas – Los Angeles                  806 ms   50.6 Gbps         39.7 Gbps

• The pacing technique improves stability
  • Difference between peak and average throughput decreases

• Dallas – Los Angeles is too long
  • Total throughput is limited by the TCP window size


Comparison with Emulated Delay

• Compare the results
  • on the SC18 network (real network)
  • on an emulated-delay network

• Memory and storage transfer (1 stream)
  • due to delay-emulation performance limits

• Network configuration is not exactly the same


Emulated Delay Setup

• Delay is emulated by a Linux server

• Maximum throughput is ~70 Gbps
  • depends on the traffic pattern

[Diagram: the two Secure Data Reservoir servers connected through a third Linux server (100G NIC) that performs delay emulation]


Result: RTT 275 ms (Dallas – Tokyo)
Encrypted Memory to Memory Transfer

• Same behavior

[Chart: throughput (Gbps) vs. time (sec), SC18 vs. emulated delay]

Result: RTT 275 ms (Dallas – Tokyo)
Encrypted Storage to Storage Transfer

• Average throughput is nearly the same

• Start-up behavior is different

[Chart: throughput (Gbps) vs. time (sec), SC18 vs. emulated delay]

Discussion about Emulated Delay

• Emulated delay works for 1-stream transfer
  • Heavy traffic causes significant performance degradation
  • The pacing technique may be useful here

• Average throughput is the same

• Start-up behavior is different
  • The impact is large on storage transfer


Conclusion

• We designed “Secure Data Reservoir”

• We achieved 70 Gbps fully encrypted data transfer
  • on a real intercontinental network

• The pacing technique improves stability
  • CPU load on the receiver side is higher than on the sender side

• Throughput may be limited by the TCP window size
  • The LFTCP protocol overcomes this limitation


Current Work

• We participated in SCA19

• Hardware is critical to performance

[Chart: throughput (Gbps) by route. SC18 (dedicated hardware): Dallas – Tokyo (RTT 274 ms), Dallas – Singapore (RTT 444 ms), Dallas – Los Angeles (RTT 806 ms). Data Mover Challenge (provided hardware): Singapore – Chicago (RTT 249 ms), Singapore – South Korea (RTT 406 ms), Singapore – Australia (RTT 334 ms), Singapore – Japan (RTT 84 ms)]


Future Work

• Apply Secure Data Reservoir to other large-scale medical applications
  • currently used for “Serendipiter” (a very high-speed cell sorter)

• Improve the user interface
  • the current software is very experimental

• Further performance improvements for 400 Gbps networks


Any questions?

Thanks to


This work was supported by JSPS KAKENHI Grant Number JP16H01714.