![Page 1: GridFTP Challenges In Data Transport John Bresnahan bresnaha@mcs.anl.gov Argonne National Laboratory The University of Chicago](https://reader036.vdocuments.mx/reader036/viewer/2022062511/5514d86f550346b0338b5458/html5/thumbnails/1.jpg)
GridFTP Challenges In Data Transport
John Bresnahan
Argonne National Laboratory
The University of Chicago
Challenges Past and Future
Standards
Throughput
Robustness
Extensibility
Security
Scalability
Standards
Interoperability
Big selling point for adoption
GridFTP 1
  1) Designed
  2) Implemented
  3) Released/Deployed/Used
  4) Standardized
GridFTP 2
  1) Standardized
  2) Imple....
Throughput
It had to be fast
GridFTP was sold on speed
  Other features eliminate excuses not to use it
"Fast" varies with the environment
  LANs, WANs, long fat pipes
Must be able to configure and exchange protocols
  TCP window sizes, UDP-based protocols
  See Extensibility
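Tuning TCP window sizes usually starts from the bandwidth-delay product of the path. A minimal Python sketch of that arithmetic follows. The helper names `bdp_bytes` and `tuned_socket` are hypothetical and not part of GridFTP, and the operating system may clamp whatever buffer size is requested.

```python
import socket


def bdp_bytes(bandwidth_mbit_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    # Mbit/s * 1e6 -> bit/s; ms / 1e3 -> s; / 8 -> bytes, i.e. bandwidth * rtt * 125
    return bandwidth_mbit_s * rtt_ms * 125


def tuned_socket(bandwidth_mbit_s: int, rtt_ms: int) -> socket.socket:
    """Create a TCP socket whose buffers are requested at the BDP size."""
    window = bdp_bytes(bandwidth_mbit_s, rtt_ms)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the buffers before connect() so window scaling can account for them.
    # Assumption: the kernel may silently cap these values at a system limit.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window)
    return sock
```

With these illustrative numbers, a 1000 Mbit/s path with a 100 ms round trip needs roughly 12.5 MB in flight, far more than typical default buffers, which is why a long fat pipe behaves so differently from a LAN.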
Lots Of Small Files
1 large file is easy (but less prevalent)
  Overhead-to-payload ratio is low
1 data set partitioned into many little files
  Overlap control overhead with the data payload
    Pipelining
    Concurrent sessions
    Data channel caching
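A back-of-the-envelope model shows why overlapping control overhead with payload matters for many small files. This is a simplified sketch, not a measurement: it assumes one control round trip per file without pipelining, and a single exposed round trip once requests are pipelined. The function names are hypothetical.

```python
def sequential_time(n_files: int, file_bytes: int, rtt_s: float, bytes_per_s: float) -> float:
    """Each file pays a control round trip before its payload moves."""
    return n_files * (rtt_s + file_bytes / bytes_per_s)


def pipelined_time(n_files: int, file_bytes: int, rtt_s: float, bytes_per_s: float) -> float:
    """Requests are queued ahead, so only the first round trip is exposed."""
    return rtt_s + n_files * file_bytes / bytes_per_s


if __name__ == "__main__":
    # Illustrative inputs: 1000 files of 1 MB, 100 ms RTT, 1 Gbit/s (125e6 bytes/s).
    args = (1000, 1_000_000, 0.1, 125_000_000)
    print("sequential:", sequential_time(*args), "s")
    print("pipelined: ", pipelined_time(*args), "s")
```

In this model the sequential case is dominated by round trips rather than payload, which is the situation pipelining, concurrent sessions, and data channel caching are meant to remove.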
Robustness
It has to work ALL the time
  Hard to get a solid, stable code base
  Harder to extend it
  Race conditions
But of course it can't
  Recover from errors
  Checkpoint transfers
  A session crash can't be a service crash
    fork()/setuid()/exec()
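The fork()/setuid()/exec() pattern can be sketched in Python on a POSIX system. This is an illustration of the isolation idea only, not the GridFTP server's code. The function name `run_session` is hypothetical, `setuid()` succeeds only when the parent runs with sufficient privilege, and a real server would not block on each session the way this sketch does.

```python
import os
import sys


def run_session(argv, uid=None):
    """Run one session in its own process so its crash cannot take down the service.

    Returns the exit code, or the negated signal number if the child was killed.
    """
    pid = os.fork()
    if pid == 0:
        # Child: drop privileges (if requested), then replace the process image.
        try:
            if uid is not None:
                os.setuid(uid)
            os.execv(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # reached only if setuid() or execv() failed
    # Parent: the service process survives whatever happens to the child.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


if __name__ == "__main__":
    crash = [sys.executable, "-c", "import os; os.abort()"]
    print("session ended with", run_session(crash), "and the service is still running")
```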
Extensibility
Everything has a version 2.0
  Even garbage
Clean/safe abstractions
  Ability to add significant features without compromising stability
  In the right place
  A balance between control and ease of development
XIO
DSI
XIO
A stack of data interceptors
  Filesystem
  Data channel
  Alter/monitor read/write buffers
Treats the data as a stream
  Options at open only
  Application treats it as it would a file stream
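The idea of a stack of interceptors can be sketched in a few lines of Python. This is a toy model: the real XIO is a C library, and every class and function name below is hypothetical. The stack is assembled once at open time, and the application then writes to the top as if it were a plain stream.

```python
class Driver:
    """One layer in the stack; by default it passes buffers through unchanged."""

    def __init__(self):
        self.below = None

    def write(self, buf: bytes) -> None:
        self.below.write(buf)


class ByteCounter(Driver):
    """Monitors buffers without changing them."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def write(self, buf: bytes) -> None:
        self.total += len(buf)
        self.below.write(buf)


class Xor(Driver):
    """Alters buffers on the way down (a stand-in for a real transform)."""

    def __init__(self, key: int):
        super().__init__()
        self.key = key

    def write(self, buf: bytes) -> None:
        self.below.write(bytes(b ^ self.key for b in buf))


class MemorySink(Driver):
    """Bottom of the stack; stands in for the file system or the data channel."""

    def __init__(self):
        super().__init__()
        self.data = bytearray()

    def write(self, buf: bytes) -> None:
        self.data += buf


def open_stack(*drivers):
    """Link the drivers top to bottom once, at open time, and return the top."""
    for upper, lower in zip(drivers, drivers[1:]):
        upper.below = lower
    return drivers[0]
```

A stack opened as `open_stack(ByteCounter(), Xor(0xFF), MemorySink())` is used with plain `write()` calls, and the application never sees the layers underneath.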
XIO Driver Stacks
All data passes through XIO driver stacks
  to network and disk
  observe data
  change data
  change protocol
[Figure: Client, Frontend, DPI, and DSI, with XIO driver stacks on the data paths]
GridFTP XIO Extension Examples
Netlogger
  Observes and times events for bottleneck detection
Bandwidth rate limiter
  Throttles the rate at which buffers are passed along
Multicast
  Forwards the buffer to many places
UDT
  Switches out transport protocols
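As one concrete case, a rate limiter only has to delay each buffer until the average rate falls back to the limit. The Python sketch below shows that pacing logic under simple assumptions. It is not the actual XIO driver, and the names `pace_delay` and `RateLimitedWriter` are hypothetical.

```python
import time


def pace_delay(bytes_sent: int, elapsed_s: float, rate_bytes_per_s: float) -> float:
    """Seconds to wait so the average rate stays at or below the limit."""
    earliest = bytes_sent / rate_bytes_per_s  # time by which this much may be sent
    return max(0.0, earliest - elapsed_s)


class RateLimitedWriter:
    """Wraps a write callable and throttles how fast buffers are passed along."""

    def __init__(self, write, rate_bytes_per_s, clock=time.monotonic, sleep=time.sleep):
        self._write = write
        self._rate = rate_bytes_per_s
        self._clock = clock
        self._sleep = sleep
        self._start = clock()
        self._sent = 0

    def write(self, buf: bytes) -> None:
        delay = pace_delay(self._sent, self._clock() - self._start, self._rate)
        if delay > 0:
            self._sleep(delay)
        self._write(buf)
        self._sent += len(buf)
```

The clock and sleep functions are injectable so the pacing can be exercised without real waiting.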
Multicast
Prototyped in a week
GridFTP over UDT
Throughput in Mbit/s

                               Argonne to LA   Argonne to NZ
GridFTP disk UDT                   428.3           178.6
GridFTP mem UDT                    396.6           179.3
GridFTP disk TCP – 8 streams       102.4            37.4
GridFTP disk TCP – 1 stream         59.6            16.3
GridFTP mem TCP – 8 streams        112.6            40.2
GridFTP mem TCP – 1 stream          63.8            16.4
Iperf – 8 streams                  117.0            40.3
Iperf – 1 stream                    74.5            19.7
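The relative gain from switching the transport can be read off the table with a little arithmetic. The sketch below copies three rows and computes the ratios. It assumes the larger value in each row belongs to the Argonne to LA column, and the function name `speedup` is hypothetical.

```python
# (Argonne to LA, Argonne to NZ) throughput in Mbit/s, copied from the table.
# Assumption: the larger value in each row is the Argonne to LA column.
THROUGHPUT = {
    "GridFTP disk UDT": (428.3, 178.6),
    "GridFTP disk TCP, 8 streams": (102.4, 37.4),
    "GridFTP disk TCP, 1 stream": (59.6, 16.3),
}


def speedup(row: str, baseline: str):
    """Return the (LA, NZ) throughput ratios of one row over another."""
    la, nz = THROUGHPUT[row]
    base_la, base_nz = THROUGHPUT[baseline]
    return la / base_la, nz / base_nz


if __name__ == "__main__":
    for baseline in ("GridFTP disk TCP, 1 stream", "GridFTP disk TCP, 8 streams"):
        la, nz = speedup("GridFTP disk UDT", baseline)
        print(f"UDT vs {baseline}: LA x{la:.1f}, NZ x{nz:.1f}")
```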
Data Storage Interface (DSI)
Intercepts all file system calls
  stat, remove, mkdir, send, receive, …
Must handle the I/O for the FS
  Harder to write
  Much more flexibility
Examples
  HPSS, SRB, proxy/striping
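The shape of such an interface can be sketched as an abstract class with one method per intercepted call. This is a Python illustration only: the real DSI is a C plugin interface, and the class names, signatures, and in-memory backend below are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator


class StorageInterface(ABC):
    """Hypothetical storage plugin: the server calls these instead of the file system."""

    @abstractmethod
    def stat(self, path: str) -> int: ...        # size in bytes

    @abstractmethod
    def mkdir(self, path: str) -> None: ...

    @abstractmethod
    def remove(self, path: str) -> None: ...

    @abstractmethod
    def send(self, path: str) -> Iterator[bytes]: ...                    # storage -> network

    @abstractmethod
    def receive(self, path: str, chunks: Iterable[bytes]) -> None: ...   # network -> storage


class MemoryStorage(StorageInterface):
    """Toy backend that does all of its own I/O against a dictionary."""

    def __init__(self, block_size: int = 4):
        self.files: Dict[str, bytes] = {}
        self.dirs = {"/"}
        self.block_size = block_size

    def stat(self, path: str) -> int:
        return len(self.files[path])

    def mkdir(self, path: str) -> None:
        self.dirs.add(path)

    def remove(self, path: str) -> None:
        del self.files[path]

    def send(self, path: str) -> Iterator[bytes]:
        data = self.files[path]
        for i in range(0, len(data), self.block_size):
            yield data[i:i + self.block_size]

    def receive(self, path: str, chunks: Iterable[bytes]) -> None:
        self.files[path] = b"".join(chunks)
```

Because the backend owns the I/O, the same five calls could front a tape archive or a set of striped nodes, which is where both the extra effort and the extra flexibility come from.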
Security
Protection vs. ease of use
  GSI and CAs were hard for many users
Speed vs. protection
  Users are happy with a minimal amount of data channel protection
  Warm fuzzies
  Simple and unsafe mode
Flexibility
  XIO drivers handle security
  Still hard to extend
GridFTP over SSH
  A big win for many users
Firewalls
Punching through
  Control channel is statically assigned
  Data channels dynamically assigned
1-way firewall (and NAT)
  Automatic traversal
  Simultaneous open/TCP splicing
  STUN
2-way firewall
  Use a broker to create a route
  Negotiate the local ports
    New protocol needed
  Hooks in GridFTP to contact a broker at the right time
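The broker's bookkeeping for the 2-way case can be sketched as follows. This is a hypothetical Python model of the negotiation state only. The slides say a new protocol is needed and do not define one, so the class name, method names, and message shapes here are all assumptions. The addresses are documentation-only examples.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, local port)


class ConnectionBroker:
    """Collects each side's chosen local endpoint and hands out the 4-tuple."""

    def __init__(self):
        self._pending: Dict[str, Dict[str, Endpoint]] = {}

    def register(self, transfer_id: str, role: str, endpoint: Endpoint) -> None:
        """Called by a server hook once it has picked a local data port."""
        if role not in ("source", "dest"):
            raise ValueError("role must be 'source' or 'dest'")
        self._pending.setdefault(transfer_id, {})[role] = endpoint

    def route(self, transfer_id: str) -> Optional[Tuple[str, int, str, int]]:
        """Return (src_ip, src_port, dst_ip, dst_port) once both sides are known."""
        sides = self._pending.get(transfer_id, {})
        if "source" in sides and "dest" in sides:
            (src_ip, src_port), (dst_ip, dst_port) = sides["source"], sides["dest"]
            return (src_ip, src_port, dst_ip, dst_port)
        return None


if __name__ == "__main__":
    broker = ConnectionBroker()
    broker.register("xfer-1", "source", ("192.0.2.10", 50000))
    broker.register("xfer-1", "dest", ("198.51.100.20", 50001))
    print(broker.route("xfer-1"))
```

With the full 4-tuple known to both sides, each firewall would only need to open a temporary hole for that one connection.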
Outgoing allowed
[Figure: Client connected to the GridFTP source server and the GridFTP destination server, each over TCP 2811; DATA channel between the two servers]
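[Figure: Client connected to the GridFTP source server and the GridFTP destination server, each over TCP 2811; both servers talk to a Connection Broker (CB), which receives the IP 4-tuple; a temporary hole on each side lets the DATA channel through]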
Scalability
Striping
  Multi-host coordinated transfers
  You give us the hardware, we'll give you the bandwidth
Load balancing proxy
  Dynamic backends
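To make striping concrete, the sketch below partitions a file into fixed-size blocks and deals them out to hosts. It assumes a simple round-robin layout chosen for illustration. The slides do not say which layout GridFTP uses, and `stripe_blocks` is a hypothetical name.

```python
from typing import List, Tuple


def stripe_blocks(file_size: int, block_size: int, n_stripes: int) -> List[List[Tuple[int, int]]]:
    """Assign (offset, length) blocks of a file to stripes in round-robin order."""
    stripes: List[List[Tuple[int, int]]] = [[] for _ in range(n_stripes)]
    for index, offset in enumerate(range(0, file_size, block_size)):
        length = min(block_size, file_size - offset)  # last block may be short
        stripes[index % n_stripes].append((offset, length))
    return stripes


if __name__ == "__main__":
    for host, blocks in enumerate(stripe_blocks(10, 4, 2)):
        print("stripe", host, blocks)
```

Each host then moves only its own blocks, so adding hosts adds aggregate bandwidth as long as the network and storage can keep up.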
Proxy Server
The separation of processes buys the ability to proxy
Allows for load balancing
Frontend can choose from a pool of DPIs to service a client request
[Figure: Client connects to the Frontend, which reaches a pool of DPIs over IPC]
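One way a frontend could choose from its pool is to pick the backend with the fewest active sessions. The Python sketch below uses that policy purely as an assumption. The slides do not state the selection rule, and the class and method names are hypothetical.

```python
class Frontend:
    """Tracks active sessions per backend and hands out the least-loaded one."""

    def __init__(self, backends):
        self.active = {name: 0 for name in backends}

    def add_backend(self, name: str) -> None:
        """Backends can join while the service is running."""
        self.active.setdefault(name, 0)

    def acquire(self) -> str:
        """Pick the backend with the fewest active sessions (ties: earliest added)."""
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        """Mark one session on the named backend as finished."""
        self.active[name] -= 1


if __name__ == "__main__":
    frontend = Frontend(["dpi-a", "dpi-b"])
    print([frontend.acquire() for _ in range(4)])
```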
Striping
[Figure: Client with control channels (CC) to two Frontends; each Frontend coordinates several DPIs over IPC; data channels (DC) connect the DPIs]
Multi-core
CPU-to-NIC ratio increasing
  Treat each core as a stripe
  Parallel stream on each core
Fully encrypted transfers at network speeds
  All the security, none of the perf loss
Compression
  Faster-than-network-speed transfers
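The "each core as a stripe" idea applies naturally to compression: split the stream into chunks and compress them in parallel. The sketch below uses Python threads and `zlib`, on the assumption that CPython's `zlib` releases the interpreter lock while compressing. The function names and chunk size are illustrative, not from GridFTP.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_striped(data: bytes, chunk_size: int, workers: int) -> List[bytes]:
    """Compress fixed-size chunks in parallel, one independent zlib stream each."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))  # map() preserves chunk order


def decompress_striped(parts: List[bytes]) -> bytes:
    """Reassemble the original stream from the compressed chunks."""
    return b"".join(zlib.decompress(part) for part in parts)


if __name__ == "__main__":
    payload = b"gridftp " * 10_000
    parts = compress_striped(payload, 16_384, 4)
    print(len(payload), "bytes sent as", sum(len(p) for p in parts), "bytes")
```

When the data compresses well and the cores keep up, fewer bytes cross the wire than the application delivers, which is how the effective rate can exceed the network speed.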
Conclusion
Past success
  Robustness
  Throughput
  Standard
Future ( += )
  Scalable
  Secure
  Extensible
http://www.gridftp.org
bresnaha@mcs.anl.gov