How to Tune Your Wide Area File System for a 100 Gbps Network
Experiences from the SCinet Research Sandbox. Scott Michael, LUG 2012, April 24, 2012.
![Page 1: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/1.jpg)
Experiences from the SCinet Research Sandbox
How to Tune Your Wide Area File System for a 100 Gbps Network
Scott Michael
LUG 2012
April 24, 2012
![Page 2: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/2.jpg)
100 Gbps Wide Area Lustre April 24, 2012
Talk Roadmap
• Background: IU’s Lustre-WAN efforts to date
• Lustre-WAN at 100 Gbps: SC11 SCinet Research Sandbox entry
• LNET measurements: Important tunables
![Page 3: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/3.jpg)
Wide Area Lustre in Production at IU
![Page 4: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/4.jpg)
Lustre-WAN at IU
• We have had and currently have several remote client production mounts with a range of bandwidths and latencies
• Clients connected at 1 Gbit and 10 Gbit
• Clients connected across various regional, national, and international networks
• Latencies ranging from a few milliseconds to 120 milliseconds
![Page 5: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/5.jpg)
100 Gbits Over Low Latency
Dresden to Freiberg − 60 km − 0.72 ms
Throughput 10.8 GB/s − 86% efficiency
![Page 6: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/6.jpg)
100 Gbits Over a Bit More Latency
• Indiana University submitted an entry to the SC11 SCinet Research Sandbox program to demonstrate cross-country 100 Gbit/s Lustre performance
• The demonstration included network benchmarks, LNET testing, file system benchmarks, and a suite of real-world scientific workflows
![Page 7: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/7.jpg)
SCinet Research Sandbox Setup
Seattle to Indianapolis − 3,500 km − 50.5 ms
![Page 8: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/8.jpg)
SCinet Research Sandbox Outcome
| | Measurement | Efficiency |
|---|---|---|
| Latency | 50.5 ms | − |
| TCP iperf | 96 Gbit/s | 96% |
| IOR | 6.5 GB/s | 52% |
| Applications | 6.2 GB/s | 50% |

• Relatively small cluster
• 20 hours of test, troubleshoot, and demo time
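As a rough check on the efficiency column, the short Python sketch below converts each measured rate to Gbit/s and divides by the link rate. It assumes efficiency means measured throughput over the nominal 100 Gbit/s link and that 1 GB/s equals 8 Gbit/s; the function names are illustrative and not from the talk.

```python
# Sketch: efficiency taken as measured throughput / nominal link rate (assumption).
LINK_GBIT_PER_S = 100.0  # nominal rate of the SCinet Research Sandbox link


def gbytes_to_gbits(gbytes_per_s):
    """Convert GB/s to Gbit/s, assuming 8 bits per byte and decimal units."""
    return gbytes_per_s * 8.0


def efficiency(gbit_per_s, link_gbit_per_s=LINK_GBIT_PER_S):
    """Fraction of the link rate that the measurement achieved."""
    return gbit_per_s / link_gbit_per_s


# Rates as reported on the slide.
measurements = {
    "TCP iperf": 96.0,                      # already in Gbit/s
    "IOR": gbytes_to_gbits(6.5),            # reported in GB/s
    "Applications": gbytes_to_gbits(6.2),   # reported in GB/s
}

for name, rate in measurements.items():
    print(f"{name}: {rate:.1f} Gbit/s, {efficiency(rate):.0%} of link rate")
```

Under these assumptions the computed fractions agree with the slide's figures after rounding to whole percent.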
![Page 9: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/9.jpg)
Workflow Suite
• Enzo – astronomical adaptive mesh code
• Vampir – parallel tracing code and debugger
• Heat3d – heat diffusion code
• ODI – astronomical image reduction pipeline
• NCGAS – genomics codes
• OLAM – climate code
• CMES – Computational Model for Electroencephalography responses in Schizophrenia – computational neuroscience
• Gromacs – molecular dynamics code
![Page 10: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/10.jpg)
More RPCs Are Needed
• For high latency links max_rpcs_in_flight has to be increased from the default of 8
• One can show the max throughput for a given connection is:
or to maximize a given link…
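A minimal sketch of the relationship this slide refers to, assuming the usual bandwidth-delay argument: a client can have at most max_rpcs_in_flight RPCs outstanding per round trip, so throughput is bounded by max_rpcs_in_flight × RPC size / round-trip time, and filling a link needs roughly link rate × round-trip time / RPC size RPCs in flight. The 1 MiB RPC size, the 10 Gbit/s client link, and the reading of the 50.5 ms latency as a round-trip time are assumptions made for illustration, not values confirmed by the talk.

```python
import math

RPC_SIZE_BYTES = 1 << 20   # assumed 1 MiB bulk RPC (not stated on the slide)
RTT_SECONDS = 0.0505       # assumes the 50.5 ms figure is a round-trip time


def max_throughput(rpcs_in_flight, rpc_size=RPC_SIZE_BYTES, rtt=RTT_SECONDS):
    """Upper bound in bytes/s when at most rpcs_in_flight RPCs are outstanding."""
    return rpcs_in_flight * rpc_size / rtt


def rpcs_needed(link_bytes_per_s, rpc_size=RPC_SIZE_BYTES, rtt=RTT_SECONDS):
    """Smallest RPC count whose bound reaches the given link rate."""
    return math.ceil(link_bytes_per_s * rtt / rpc_size)


# Bound with the default of 8 RPCs in flight.
default_bound = max_throughput(8)
print(f"8 RPCs in flight: at most {default_bound / 1e6:.0f} MB/s")

# RPCs needed to fill a hypothetical 10 Gbit/s (1.25e9 bytes/s) client link.
print(f"RPCs to fill 10 Gbit/s: {rpcs_needed(1.25e9)}")
```

Under these assumptions the default of 8 caps a single client well below a 10 Gbit/s link, which is the motivation for raising max_rpcs_in_flight on high latency paths.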
![Page 11: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/11.jpg)
What We Learned About credits
• Initial LNET testing for a single client/server showed we were unable to achieve theoretical throughput
• Throughput leveled off once max_rpcs_in_flight was raised past 8
• This was due to the default settings of credits and peer_credits
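The plateau described above can be illustrated with a deliberately simplified model in which the number of RPCs actually outstanding to one peer is the smaller of max_rpcs_in_flight and peer_credits. This is a sketch under stated assumptions, not the LNET implementation: it ignores the global credits pool, uses a peer_credits value of 8 only because the slide reports the plateau at 8, and reuses an assumed 1 MiB RPC size and a 50.5 ms round-trip time.

```python
RPC_SIZE_BYTES = 1 << 20   # assumed 1 MiB RPC size
RTT_SECONDS = 0.0505       # assumes 50.5 ms is a round-trip time


def effective_rpcs(max_rpcs_in_flight, peer_credits):
    """Simplified model: per-peer credits cap the RPCs actually in flight."""
    return min(max_rpcs_in_flight, peer_credits)


def throughput_mb_per_s(max_rpcs_in_flight, peer_credits):
    """Throughput bound in MB/s for one client/server pair under the model."""
    in_flight = effective_rpcs(max_rpcs_in_flight, peer_credits)
    return in_flight * RPC_SIZE_BYTES / RTT_SECONDS / 1e6


# Compare a low credit setting (8, illustrative) with a raised one (64, hypothetical).
for rpcs in (4, 8, 16, 32, 64):
    low = throughput_mb_per_s(rpcs, peer_credits=8)
    high = throughput_mb_per_s(rpcs, peer_credits=64)
    print(f"max_rpcs_in_flight={rpcs:2d}: {low:7.1f} MB/s vs {high:7.1f} MB/s")
```

In this model, raising max_rpcs_in_flight alone stops helping once it exceeds peer_credits, so both tunables have to be increased together.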
![Page 12: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/12.jpg)
What We Learned About credits
• Single client/server LNET performance was 1092 MB/s − 89% efficiency
• We saw somewhat improved performance with the entire system and increased credits, but less than expected
![Page 13: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/13.jpg)
Summary and Implications
• Cross-country 100 Gbit networks are here or coming soon
• Lustre-WAN is a useful tool for empowering geographically distributed scientific workflows
• Centers that deploy Lustre-WAN systems should consider the impact of RPCs and credits
• Multiple wide area/local client endpoints require some planning when setting tunables
![Page 14: How to Tune Your Wide Area File System for a 100 Gbps Network](https://reader036.vdocuments.mx/reader036/viewer/2022062323/56816385550346895dd46f15/html5/thumbnails/14.jpg)
Thank You for Your Attention
Questions?
Scott Michael
Indiana University
[email protected]
Look for the LNET paper at DIDC 2012, held in conjunction with HPDC:
A Study of Lustre Networking Over a 100 Gigabit Wide Area Network with 50 milliseconds of Latency, DIDC '12