
Page 1

© Pittsburgh Supercomputing Center

CC-NIE DANCES Project: Implementing Multi-Site, SDN-Enabled Applications

Bryan Learn <[email protected]>
Pittsburgh Supercomputing Center
Internet2 Technology Exchange

26 September 2016

Page 2

DANCES Project Introduction

• DANCES is an NSF-funded (#1341005), CC-NIE collaborative “Network Integration and Applied Innovation” project

• DANCES has integrated SDN/OpenFlow into selected supercomputing cyberinfrastructure applications to support network bandwidth reservation and QoS

• Motivated by the need to support large bulk file-transfer flows and to share the bandwidth of end-site 10G infrastructure efficiently

Presenter
Presentation Notes
The project started in January 2014; the two-year project will finish in 2016 under a no-cost extension. We are concentrating on CI applications that initiate file transfers. These include user workflows via scheduling/resource-management software (e.g., Torque/Moab, Torque/SLURM) and a wide-area distributed filesystem (SLASH2).
The project was motivated by the need to support large bulk wide-area file transfers and the desire to provide predictability and stability throughout the transfer. When the project was proposed, end-site components typically had 10 Gb/s interfaces, while Internet2 had recently deployed 100 Gb/s uncongested, SDN-capable infrastructure in the wide area. Multiple 10G-attached servers at end sites could therefore cause local congestion while the WAN path remained uncongested. Large file transfers can take days to complete, and depending on how smaller competing transfers are initiated, they may disrupt the larger flow, or the larger flow may block or seriously impede the smaller flows. Through the DANCES project we are trying to bring more predictability to file-transfer performance by providing dedicated, protected bandwidth for priority bulk data flows.
Page 3

DANCES Development Team

• Pittsburgh Supercomputing Center (PSC)

• National Institute for Computational Sciences (NICS)

• Pennsylvania State University (Penn State)

• eXtreme Science and Engineering Discovery Environment (XSEDE)

• Internet2

• Corsa Technologies, Inc.

Presenter
Presentation Notes
Our core team
Page 4

DANCES-enhanced Applications

• The implementation uses SDN/OpenFlow 1.3 metering to provide QoS (see the sketch after this list)

• The enhanced cyberinfrastructure applications are:
  • Supercomputing resource management and scheduling: file transfer via GridFTP or scp can then be scheduled and included as part of a workflow
  • Integration into a distributed wide-area file system: SLASH2
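The project's own controller modules are not included in these slides, but the OpenFlow 1.3 metering approach can be illustrated with a minimal RYU sketch. This is an assumption-laden illustration, not DANCES code: the match addresses, meter rate, and output port are placeholders.

```python
# Minimal RYU (OpenFlow 1.3) sketch: rate-limit one flow with a meter.
# Placeholder values only -- not the DANCES production configuration.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class MeterQoSExample(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Meter 1: cap matched traffic at ~5 Gb/s (rate is in kb/s with OFPMF_KBPS).
        band = parser.OFPMeterBandDrop(rate=5000000, burst_size=0)
        dp.send_msg(parser.OFPMeterMod(datapath=dp, command=ofp.OFPMC_ADD,
                                       flags=ofp.OFPMF_KBPS, meter_id=1,
                                       bands=[band]))

        # Flow: send the reserved transfer (hypothetical src/dst addresses)
        # through meter 1, then forward out port 2 (also a placeholder).
        match = parser.OFPMatch(eth_type=0x0800,
                                ipv4_src='10.0.1.10', ipv4_dst='10.0.2.20')
        inst = [parser.OFPInstructionMeter(meter_id=1),
                parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS,
                                             [parser.OFPActionOutput(2)])]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))
```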

Presenter
Presentation Notes
The metering capability and marking of packets, along with the port queueing and weighting functionality, implement QoS. SLASH2, developed by PSC, is an open-source, wide-area-network (WAN)-friendly distributed file system offering advanced features such as data multi-residency, system-managed data transfer, and inline checksum verification. We are working to incorporate the network bandwidth scheduling capability into SLASH2 file replication.
Page 5

SDN/OF Infrastructure Components and Interfaces

• CONGA bandwidth manager software (a bookkeeping sketch follows this list)
  • Receives bandwidth requests from the application
  • Verifies user authorization with the user database
  • Tracks available/allocated bandwidth between sites
  • Initiates OpenFlow path setup via the RYU OpenFlow controller

• RYU OpenFlow controller software
  • Writes flowmods (routing rules) to switches
  • Reads port statistics to verify operation

• OpenFlow 1.3-compatible switch hardware
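CONGA's source is not shown in the slides; as a rough illustration of the "tracks available/allocated bandwidth between sites" role, here is a toy bookkeeping sketch. The class name, site pair, and 10G capacity are hypothetical.

```python
class BandwidthLedger:
    """Toy CONGA-style bookkeeping of per-site-pair bandwidth (Mb/s)."""

    def __init__(self, link_capacity_mbps):
        # e.g. {("PSC", "NICS"): 10000} for a 10G end-site uplink (hypothetical)
        self.capacity = dict(link_capacity_mbps)
        self.allocated = {pair: 0 for pair in self.capacity}

    def available(self, src, dst):
        return self.capacity[(src, dst)] - self.allocated[(src, dst)]

    def reserve(self, src, dst, mbps):
        """Grant only if the request fits; otherwise the caller falls back to best effort."""
        if mbps <= self.available(src, dst):
            self.allocated[(src, dst)] += mbps
            return True
        return False

    def release(self, src, dst, mbps):
        self.allocated[(src, dst)] = max(0, self.allocated[(src, dst)] - mbps)


# Usage: reserve half of a hypothetical 10G path between two sites.
ledger = BandwidthLedger({("PSC", "NICS"): 10000})
if ledger.reserve("PSC", "NICS", 5000):
    print("remaining Mb/s:", ledger.available("PSC", "NICS"))
```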

Presenter
Presentation Notes
Expand as you see fit
Page 6

Presenter
Presentation Notes
The next few slides step through the control and data flow of the system:
1. The user submits a job (via qsub, for example) that requires a file transfer with reserved-bandwidth QoS (it can also include compute and storage requests) to the Resource Manager / Scheduler.
2. The Torque Prologue script initiates an RM/S check with CONGA for user authorization and available bandwidth between source and destination.
3. CONGA checks the user's authorization for bandwidth scheduling in the XSEDE User Database (we worked with the XSEDE allocations/database team, who created a small database with entries for the DANCES project team members; the test database is accessible via a REST API) and determines [(based on tracking state and/or polling?)] whether bandwidth is available. If not, [best effort and inform user?]
4. If bandwidth is available, CONGA initiates path and bandwidth provisioning with the RYU OF controller.
5. The RYU OF controller provisions local and remote flows and bandwidth.
6. Torque/<scheduler> schedules the job when resources are available. The system has been used with Moab at NICS and PBS at PSC; in production, PSC will likely use SLURM.
7. Execute the file transfer.
8. The Torque Epilogue script, or a timeout, initiates teardown of the provisioned path when the transfer is finished.
Page 7: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

7© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 8: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

8© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 9: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

9© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 10: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

10© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 11: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

11© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 12: CC-NIE DANCES Project: Implementing Multi-Site, …noc.ucsc.edu/docs/I2-tech-X/DANCES_I2_TechEx_Sep2016_18...Project was started in Jan2014. The two year project will finish in 2016

12© Pittsburgh Supercomputing Center

Presenter
Presentation Notes
User submits job (via qsub, for example) that requires file transfer with reserved bandwidth QoS (can also include compute and storage requests) to the Resource Manager / Scheduler. Torque Prologue script initiates RM/S check with CONGA for user auth and available bandwidth between src and dst. CONGA checks user authorization listed in XSEDE User Database for bandwidth scheduling (We worked with the XSEDE allocations/database team and they created a small database with entries for the DANCES project team members. The test db is accessible via REST API) and determines [(based on tracking state and/or polling?)] if bandwidth is available. If not, [best effort and inform user?] If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller. RYU OF controller provisions local and remote flow and bandwidth Torque/<scheduler> schedules job when resources are available. System has been used with Moab@NICS and PBS@PSC; PSC in production will likely use SLURM Execute file transfer Torque Epilogue script or timeout will initiate tear down of provisioned path when transfer is finished
Page 13

Workflow Example

1. User submits a job (e.g., via qsub) that requires a file transfer with reserved-bandwidth QoS. The bandwidth request is initiated in the Torque Prologue (see the sketch after this list)

2. CONGA checks user authorization in XSEDE User Database

3. If bandwidth is available, CONGA will initiate path and bandwidth provisioning with the RYU OF controller.

4. RYU OF controller provisions local and remote flow and bandwidth
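The Torque Prologue integration and CONGA's REST interface are described in the slides but not shown; a prologue-style request might look roughly like the sketch below. The endpoint URL, JSON field names, user name, and site names are assumptions.

```python
#!/usr/bin/env python
"""Sketch of a Torque prologue-style bandwidth request to CONGA (hypothetical API)."""
import sys
import requests  # third-party: pip install requests

CONGA_URL = "https://conga.example.org/reservations"  # placeholder endpoint


def request_bandwidth(user, src_site, dst_site, rate_mbps, job_id):
    resp = requests.post(CONGA_URL, json={
        "user": user,            # authorization checked against the XSEDE user database
        "src": src_site,
        "dst": dst_site,
        "rate_mbps": rate_mbps,
        "job_id": job_id,
    }, timeout=30)
    resp.raise_for_status()
    return resp.json()["reservation_id"]


if __name__ == "__main__":
    job_id = sys.argv[1]  # Torque passes the job id as the first prologue argument
    rid = request_bandwidth("exampleuser", "PSC", "NICS", 5000, job_id)
    print("bandwidth reservation granted:", rid)
```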

Presenter
Presentation Notes
This slide and the next are the text for the previous diagrams, so they may or may not be useful to include; the steps repeat the walkthrough in the Page 6 presenter notes.
Page 14

Workflow Example

5. Torque/<scheduler> schedules job when resources are available. System has been tested with Moab local scheduling at NICS and PBS at PSC; PSC in production will likely use SLURM

6. Execute file transfer

7. The Torque Epilogue script, or a timeout, initiates teardown of the provisioned path when the transfer is finished (sketched below)
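The teardown in step 7 could be as simple as releasing the reservation created by the prologue sketch on the previous slide; the endpoint and reservation-id handling here are again hypothetical.

```python
import requests  # third-party: pip install requests

CONGA_URL = "https://conga.example.org/reservations"  # placeholder endpoint


def release_bandwidth(reservation_id):
    """Epilogue-side teardown: ask CONGA to remove the provisioned path."""
    resp = requests.delete("{}/{}".format(CONGA_URL, reservation_id), timeout=30)
    resp.raise_for_status()
```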

Page 15

Implementation

• CONGA
  • Custom software created as the "Northbound API" for the DANCES project (a server-side sketch follows this list)
  • Uses a REST API

• RYU OpenFlow controller

• Production deployment will require extending the XSEDE allocation system to accommodate a dedicated network bandwidth resource managed along with traditional compute and storage resources

• Accounting by duration of dedicated bandwidth request
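To make the "northbound API" idea concrete, here is a minimal server-side sketch of what a CONGA-style REST reservation endpoint could look like. The route, payload fields, authorization check, and RYU hand-off are assumptions; the slides only state that CONGA exposes a REST API, consults the XSEDE user database, and drives the RYU controller.

```python
"""Minimal sketch of a CONGA-style northbound REST endpoint (Flask); hypothetical API."""
from flask import Flask, request, jsonify  # third-party: pip install flask

app = Flask(__name__)
ALLOCATED = {}               # reservation_id -> granted Mb/s
LINK_CAPACITY_MBPS = 10000   # hypothetical 10G end-site uplink


def user_authorized(user):
    # Placeholder for the XSEDE user-database lookup (itself a REST call in DANCES).
    return user in {"exampleuser"}


def provision_path(src, dst, rate_mbps):
    # Placeholder for asking the RYU controller to install flowmods and meters.
    pass


@app.route("/reservations", methods=["POST"])
def create_reservation():
    req = request.get_json()
    if not user_authorized(req["user"]):
        return jsonify(error="user not authorized for bandwidth scheduling"), 403
    if sum(ALLOCATED.values()) + req["rate_mbps"] > LINK_CAPACITY_MBPS:
        return jsonify(error="insufficient bandwidth; fall back to best effort"), 409
    rid = str(len(ALLOCATED) + 1)
    ALLOCATED[rid] = req["rate_mbps"]
    provision_path(req["src"], req["dst"], req["rate_mbps"])
    # Duration-based accounting would record the start time here.
    return jsonify(reservation_id=rid), 201
```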

Presenter
Presentation Notes
The implementation of DANCES required significant software development and integration effort.
CONGA: we created a "northbound API" to enable the applications (Torque and SLASH2) to communicate with the RYU OpenFlow controller framework. With the concept of QoS request scheduling and cross-domain operation, CONGA also needed to track bandwidth allocation. A REST API was chosen to ease extensibility.
RYU: provided as an OpenFlow controller "framework" with example modules, which we customized for our specific applications.
Integration with the existing XSEDE operational model: we are using a test database of DANCES development team members in the XSEDE user database format.
Page 16

Implementation and Deployment

• Original plan was full interoperation with Internet2’s AL2S for end-to-end SDN

• Internet2’s upcoming transition to MPLS moves SDN from the production WAN infrastructure to a 10Gb research overlay

• The DANCES components are still applicable to campuses with congested 10Gb connectivity and DTNs close to the edge

• Each site would run an instance of CONGA or some other type of resource manager/scheduler and RYU to locally manage devices and congestion
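A per-site deployment like the one described above might be captured in a small local configuration; everything in this sketch (names, addresses, capacities) is illustrative only.

```python
# Hypothetical settings for one campus running its own CONGA + RYU pair.
SITE_CONFIG = {
    "site": "ExampleCampus",
    "conga_api": "https://conga.campus.example.edu/reservations",
    "openflow_controller": "127.0.0.1:6653",  # local RYU instance
    "managed_switches": ["dtn-edge-sw1"],     # OF 1.3 switch in front of the DTNs
    "uplink_capacity_mbps": 10000,            # congested 10G campus uplink
}
```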

Presenter
Presentation Notes
We created a software module that interfaces with OESS to provision VLANs on demand. Bandwidth provisioning was not supported by OF 1.0 on AL2S, but we tested QoS enforced at the end-point switches. We will be working with Internet2 to define interoperation as it transitions to MPLS (someone from Internet2 may add comments). With the campus-to-XSEDE-SP concept, resource contention is more likely at the campus than within the WAN path or at the SP, so DANCES functionality would improve throughput without requiring support at both ends or completely end to end.
Page 17

Observations

• Cross-domain control of resources is challenging

• SDN/OF has been slower to gain WAN deployment support than originally expected (or has gone in the opposite direction)

• Control of multi-vendor SDN environments requires additional custom coding

Presenter
Presentation Notes
By bullet 2 I’m referring to Internet2 stepping away from SDN/OF toward MPLS.
Page 18

Future Work

• Additional testing of OpenFlow 1.3 flow metering with various queue sizes and weights

• Measure network bandwidth utilization achieved by using bandwidth scheduling for large flows along with “best effort” traffic

• Expand test deployment

• Prepare and package for production deployment

Presenter
Presentation Notes
Project is drawing to a close, but in the time remaining, we may...
Page 19

Questions?

https://www.dances-sdn.org

[email protected]