NATIONAL ENERGY RESEARCH SCIENTIFIC COMPUTING CENTER

Visapult and the SCXX Bandwidth Challenge Hat Trick

Wes Bethel

LBNL

The Punch Line

• SC00 – 1.5 Gbps peak, 660 Mbps sustained.

• SC01 – 3.3 Gbps sustained.

• SC02 – 16.8 Gbps sustained.

• SC03 – we’re taking the year off.
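
As a quick arithmetic check on the sustained figures above, each year's result is roughly five times the previous one. A minimal sketch; the dictionary and variable names are illustrative and not from the talk.

```python
# Sustained Bandwidth Challenge rates quoted on this slide, in Mbps.
sustained_mbps = {"SC00": 660.0, "SC01": 3300.0, "SC02": 16800.0}

years = list(sustained_mbps)

# Ratio of each year's sustained rate to the previous year's.
growth = {
    later: sustained_mbps[later] / sustained_mbps[earlier]
    for earlier, later in zip(years, years[1:])
}

for year, factor in growth.items():
    print(f"{year}: {factor:.2f}x the previous year's sustained rate")
```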

How did we do this?

Visapult Architecture

SC00 Bandwidth Challenge – the year of TCP

[Network diagram; the labels that survive are listed below.]

• Network throughput: 5-second peak 1.48 Gbits/sec (72 streams: 20.5 Mbits/stream); 60-minute sustained average: 582 Mbits/sec.

• Berkeley Lab: 0.75 TB, 4-server DPSS.

• 8-node Storage Cluster (DPSS).

• Compute Cluster (8 nodes).

• ANL Booth Linux Cluster.

• ASCI Booth: SGI Origin (8 CPU), 4 x 1000BT.

• SGI Origin (8 CPU), 4 x 1000BT.

• Networks and links: NTON, HSCC, Qwest, OC-48 (x2), 2 x 1000BT, 1.5 Gb/s.

• Applications: Visapult Visualization Application, File Transfer Application.
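
The peak figure on this slide follows directly from the per-stream rate: 72 streams at 20.5 Mbits/sec each. The sketch below reproduces that arithmetic and compares the 60-minute sustained average against the peak. Variable names are illustrative, and decimal units (1 Gbit = 1000 Mbits) are assumed.

```python
# Figures quoted on the SC00 slide.
streams = 72
mbits_per_stream = 20.5
sustained_mbits = 582.0  # 60-minute sustained average, Mbits/sec

# Aggregate peak rate across all parallel TCP streams.
peak_mbits = streams * mbits_per_stream
peak_gbits = peak_mbits / 1000  # assumes 1 Gbit = 1000 Mbits

# Fraction of the peak rate that was held over the full hour.
sustained_fraction = sustained_mbits / peak_mbits

print(f"peak: {peak_gbits:.2f} Gbits/sec")
print(f"sustained / peak: {sustained_fraction:.0%}")
```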

SC01 Bandwidth Challenge – the year of UDP

SC01 BWC – What’s this UDP stuff?

• Do away with “frame boundaries.”

• Each packet is independent.

• Flow-rate regulation.

• Decouple simulation PE from Visapult PE (domain decomposition independence).
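
One way to read "each packet is independent" is that every datagram carries enough header information to be placed in the destination buffer on its own, so arrival order and loss of neighbouring packets do not matter. The sketch below illustrates that idea only. The header layout (`offset`, `length`) and the helper names `make_packets` and `place_packet` are assumptions for illustration, not Visapult's actual wire format.

```python
import struct

# Hypothetical header: byte offset into the destination buffer (8 bytes)
# and payload length (4 bytes), both in network byte order.
HEADER = struct.Struct("!QI")


def make_packets(data, payload_size):
    """Split data into self-describing packets of at most payload_size bytes."""
    packets = []
    for offset in range(0, len(data), payload_size):
        payload = data[offset:offset + payload_size]
        packets.append(HEADER.pack(offset, len(payload)) + payload)
    return packets


def place_packet(buffer, packet):
    """Copy one packet's payload into buffer using only the packet's own header."""
    offset, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    buffer[offset:offset + length] = payload


# Simulate out-of-order arrival with one lost datagram.
data = bytes(range(200))
packets = make_packets(data, payload_size=64)
packets.reverse()
lost = packets.pop(1)

volume = bytearray(len(data))
for packet in packets:
    place_packet(volume, packet)
```

With this layout the receiver never waits for a complete frame: whatever arrives is placed immediately, and a lost packet simply leaves its region of the buffer unchanged.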

SC02 Bandwidth Challenge Resource Map

SC02 SCInet Weathermap

SC02 SCInet 10Gb Interface Traffic

[Chart: Visapult/Cactus SC02 Bandwidth Challenge Results. X axis: Time (seconds), 0 to 800. Y axis: Megabits, 0 to 20000. Series: 10G-Link-1, 10G-Link-2, 10G-Link-3, Cumulative.]

SC02 BWC Visapult Improvements

• Omniview capabilities

• Overhauled custom UDP encoding & connection handshake – more flexibility and better performance.

Lessons Learned

• Rate-regulated flows seriously kick booty. Want a general-purpose implementation (Tsunami).

• Corollary: TCP sucks.

• Visapult is completely custom. Want better support to more generally deploy component-based RDV tools (Diva).

• Application drivers particularly useful to push technology in many areas, not just networking.
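
To make "rate-regulated flow" concrete: the sender emits packets on a fixed schedule set by a target rate, rather than letting loss or acknowledgements drive the pace as TCP does. The sketch below is one minimal way to do that and is not the Visapult or Tsunami implementation. The names `packet_interval` and `send_paced` and the injected `send`, `clock`, and `sleep` callables are assumptions for illustration.

```python
import time


def packet_interval(rate_bps, packet_bytes):
    """Seconds between packet starts needed to hold rate_bps."""
    return packet_bytes * 8 / rate_bps


def send_paced(send, packets, rate_bps, clock=time.monotonic, sleep=time.sleep):
    """Call send(packet) for each packet, paced to the target bit rate."""
    next_time = clock()
    for packet in packets:
        delay = next_time - clock()
        if delay > 0:
            sleep(delay)
        send(packet)
        # Schedule against the ideal timeline so timing errors do not accumulate.
        next_time += packet_interval(rate_bps, len(packet))


# Example: 1250-byte packets at 1 Mbit/sec leave 10 ms between packet starts.
print(packet_interval(1_000_000, 1250))
```

With a real UDP socket, `send` could be something like `lambda p: sock.sendto(p, addr)`, where `sock` and `addr` are the caller's own socket and destination address.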
