StorkCloud - Illinois Institute of Technology (datasys.cs.iit.edu/events/ScienceCloud2013/s04.pdf)

StorkCloud Data Transfer Scheduling and Optimization as a Service Presented by: Brandon Ross


Contents
1. Introduction
2. Components
3. Optimization
4. Conclusion


StorkCloud?
✓ Multi-protocol data transfer scheduler
✓ Remote metadata retrieval and caching service
✓ Dynamic, protocol-agnostic transfer optimization to improve speed
✓ All in the cloud - accessible through thin client GUIs and public REST API


Why StorkCloud?
✓ Storage and computation are both in the cloud - cloud transfer is the missing link
✓ Data transfer is usually inconvenient
  - Many different protocols
  - Many different applications
✓ Transferring large files requires monitoring the whole process - StorkCloud is “fire-and-forget”


Why optimize?
✓ Transfers are usually suboptimal
  - Inadequacies of underlying protocols
  - End-system misconfiguration
✓ Not designed for high-speed networks
✓ Some applications can be specially configured, but it’s mostly guesswork
✓ Network environments can vary - dynamic optimization is important
✓ StorkCloud aims to solve these problems


Who?
✓ Scientists in data-oriented fields (astrophysics, genomics, climatology, biochemistry, etc.)
✓ Data centers looking to outsource replication and data placement
✓ Application developers who might want to offload data transfer tasks
✓ Anyone with a lot of data to move


Similar Work
✓ Only similar service we know of is Globus Online
  - Mature, popular service
    • Over 18.3 petabytes transferred!
  - Designed to support FTP and GridFTP
  - Statically optimized transfers
  - No prefetching/caching available for directory listings


Stork Data Scheduler
✓ Accepts file transfer jobs - source, destination, and other options
✓ Places job in queue for processing
✓ Passes jobs off to transfer module
✓ Job status can be queried by clients
✓ Currently first come, first served
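The scheduler behaviour listed above (accept a job, queue it, hand it to a transfer module, answer status queries, first come first served) can be illustrated with a small sketch. All names here (`TransferJob`, `FcfsScheduler`, the status strings) are hypothetical and are not taken from Stork's code; the transfer module is assumed to be a plain callable that returns True on success.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TransferJob:
    """One queued transfer request (hypothetical structure)."""
    job_id: int
    source: str
    destination: str
    options: dict = field(default_factory=dict)
    status: str = "queued"


class FcfsScheduler:
    """Toy first-come, first-served job queue; not Stork's implementation."""

    def __init__(self, transfer_module):
        self._queue = deque()          # jobs waiting, oldest on the left
        self._jobs = {}                # job_id -> TransferJob, for status queries
        self._ids = count(1)
        self._transfer_module = transfer_module  # callable(job) -> bool

    def submit(self, source, destination, **options):
        """Accept a job and place it at the back of the queue."""
        job = TransferJob(next(self._ids), source, destination, options)
        self._jobs[job.job_id] = job
        self._queue.append(job)
        return job.job_id

    def status(self, job_id):
        """Let a client query the current state of a job."""
        return self._jobs[job_id].status

    def run_next(self):
        """Pass the oldest queued job to the transfer module."""
        if not self._queue:
            return None
        job = self._queue.popleft()
        job.status = "processing"
        succeeded = self._transfer_module(job)
        job.status = "complete" if succeeded else "failed"
        return job.job_id
```

A client would call `submit(...)` once, keep the returned id, and poll `status(id)` later, which is the "fire-and-forget" pattern described earlier in the talk.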


Directory Listing Service
✓ Conceptually: unified metadata interface to many file systems
✓ Retrieves file and directory metadata from remote file systems
✓ Returns results as JSON
✓ Uses caching and prefetching to improve responsiveness
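A minimal sketch of the caching and prefetching idea follows. It assumes a time-to-live cache and one level of subdirectory prefetching; the slides do not say which policies the real service uses. The class name `ListingCache`, the entry fields (`name`, `dir`, `uri`) and the JSON layout are hypothetical.

```python
import json
import time


class ListingCache:
    """Toy directory-listing cache with one-level prefetching (hypothetical)."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch        # callable(uri) -> list of entry dicts
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # uri -> (expiry time, listing)

    def list_json(self, uri):
        """Return the listing for uri as a JSON string, using the cache if fresh."""
        now = self._clock()
        cached = self._entries.get(uri)
        if cached is None or cached[0] <= now:
            listing = self._fetch(uri)
            self._entries[uri] = (now + self._ttl, listing)
            self._prefetch(listing, now)
        else:
            listing = cached[1]
        return json.dumps({"uri": uri, "files": listing}, sort_keys=True)

    def _prefetch(self, listing, now):
        """Fetch subdirectories ahead of time so browsing into them is fast."""
        for entry in listing:
            if entry.get("dir") and entry["uri"] not in self._entries:
                self._entries[entry["uri"]] = (now + self._ttl,
                                               self._fetch(entry["uri"]))
```

Passing the clock in as a parameter keeps the expiry logic easy to exercise without waiting for real time to pass.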


[Figure: DLS Performance]


Transfer Modules
✓ Pluggable transfer modules perform transfers for specific protocols
✓ Communicate progress and messages back to scheduler
✓ Either Java bytecode or external executable - can be any language!
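The plug-in idea can be sketched as an abstract interface plus a lookup by URI scheme. This is an illustration only: the real modules are described as Java bytecode or external executables, and the names `TransferModule`, `InMemoryModule` and `select_module`, as well as the shape of the progress message, are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class TransferModule(ABC):
    """Hypothetical plug-in interface: one module per set of protocols."""
    protocols = ()

    @abstractmethod
    def transfer(self, source, destination, report):
        """Move data and call report(message) with progress updates."""


class InMemoryModule(TransferModule):
    """Stand-in module that copies bytes between keys of a dict."""
    protocols = ("mem",)

    def __init__(self, store, block_size=4):
        self.store = store
        self.block_size = block_size

    def transfer(self, source, destination, report):
        data = self.store[source]
        copied = bytearray()
        for start in range(0, len(data), self.block_size):
            copied += data[start:start + self.block_size]
            # Progress goes back to the caller (the scheduler) after each block.
            report({"bytes_done": len(copied), "bytes_total": len(data)})
        self.store[destination] = bytes(copied)


def select_module(modules, uri):
    """Pick the first module that declares support for the URI's scheme."""
    scheme = uri.split("://", 1)[0]
    for module in modules:
        if scheme in module.protocols:
            return module
    raise LookupError(f"no transfer module for scheme {scheme!r}")
```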


Client Interfaces
✓ Thin clients communicate with server REST API using JSON
  - Starting or canceling transfers, or querying transfer status
  - Browsing remote directories
  - Initializing credentials (e.g. GSI proxies)
✓ Currently have Android and web applications, and command line tools
✓ Our GUI applications can browse remote files and check transfer progress
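To make the JSON exchange concrete, here is a sketch of how a thin client might build a request body and read a status reply. The slides do not document the API, so every field name (`src`, `dest`, `job_id`, `status`) is hypothetical, and no endpoint or network call is shown.

```python
import json


def start_transfer_request(source, destination, **options):
    """Build a JSON body for a 'start transfer' call (field names are assumed)."""
    body = {"src": source, "dest": destination}
    body.update(options)
    return json.dumps(body, sort_keys=True)


def parse_status_response(text):
    """Extract the job id and status from a JSON reply (layout is assumed)."""
    reply = json.loads(text)
    return reply["job_id"], reply["status"]
```

Because the client only builds and parses small JSON documents, the same logic ports easily to the Android, web and command line clients mentioned above.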


Optimization in StorkCloud
✓ Each pluggable optimizer is an implementation of a different algorithm
✓ Each targets a set of file transfer parameters
✓ Feedback loop; optimizer and transfer module (TM) work together to optimize transfer
✓ Optimizers are protocol-agnostic - only care about whether transfers support targeted parameters
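The feedback loop between an optimizer and a transfer module can be sketched as follows. The search strategy shown (double the parameter until throughput stops improving) is a simple stand-in chosen for illustration, not one of StorkCloud's algorithms; `DoublingOptimizer`, `compatible` and `optimize` are hypothetical names.

```python
class DoublingOptimizer:
    """Stand-in optimizer: doubles one parameter while throughput improves."""
    targets = {"parallelism"}   # parameters this optimizer wants to tune

    def __init__(self, start=1, limit=32):
        self.value = start
        self.limit = limit
        self.best = None        # (value, throughput) of the best sample so far
        self.done = False

    def next_parameters(self):
        return {"parallelism": self.value}

    def report(self, throughput):
        """Feedback from the transfer module for the last parameters tried."""
        if self.done:
            return
        if self.best is None or throughput > self.best[1]:
            self.best = (self.value, throughput)
            if self.value * 2 <= self.limit:
                self.value *= 2
                return
        # Either throughput dropped or the limit was reached: settle on the best.
        self.value = self.best[0]
        self.done = True


def compatible(optimizer, supported_parameters):
    """Protocol-agnostic check: does the transfer support every targeted parameter?"""
    return optimizer.targets <= set(supported_parameters)


def optimize(optimizer, transfer_sample, max_rounds=10):
    """Alternate between trying parameters and feeding back measured throughput."""
    for _ in range(max_rounds):
        if optimizer.done:
            break
        optimizer.report(transfer_sample(optimizer.next_parameters()))
    return optimizer.next_parameters()
```

Note that `compatible` never looks at the protocol itself, only at the set of parameters a transfer says it supports, which mirrors the protocol-agnostic point above.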


Parameters and Techniques
✓ Pipelining - “queuing up” transfer commands at a remote system
✓ Parallelism - transferring file data over multiple connections
✓ Concurrency - transferring multiple files at a time
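Two of these parameters are easy to show in code. The sketch below splits one file into byte ranges (parallelism) and groups a file list into sets that would be moved at the same time (concurrency). Pipelining is left out because it concerns how commands are issued on a protocol's control channel. The function names are hypothetical, and the even split is an assumption, since real protocols may divide data differently.

```python
def split_ranges(size, parallelism):
    """Divide `size` bytes into up to `parallelism` (offset, length) ranges."""
    base, extra = divmod(size, parallelism)
    ranges, offset = [], 0
    for index in range(parallelism):
        # The first `extra` connections carry one additional byte each.
        length = base + (1 if index < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def concurrent_batches(files, concurrency):
    """Group files into batches of at most `concurrency` files moved together."""
    return [files[i:i + concurrency] for i in range(0, len(files), concurrency)]
```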


[Figure: Parameters Visualized]


Algorithms
✓ Optimal parallelism prediction:
  - Samples points, performs regression analysis to predict optimal parallelism
  - 2nd order and c-order analysis
✓ Parallelism-Concurrency-Pipelining
  - Uses historical database and clustering
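The "sample points, then fit a curve" idea can be illustrated with a deliberately simple model. The sketch below fits a plain quadratic through three (parallelism, throughput) samples and returns the level at the curve's peak. This is a stand-in for illustration: the slides do not give the 2nd order or c-order model equations, so the quadratic and the names `fit_quadratic` and `predict_parallelism` are assumptions.

```python
def fit_quadratic(points):
    """Return (a, b, c) of y = a*x^2 + b*x + c through three (x, y) samples."""
    (x0, y0), (x1, y1), (x2, y2) = points
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    # Coefficients obtained by expanding the Lagrange interpolation polynomial.
    a = y0 / d0 + y1 / d1 + y2 / d2
    b = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    c = y0 * x1 * x2 / d0 + y1 * x0 * x2 / d1 + y2 * x0 * x1 / d2
    return a, b, c


def predict_parallelism(points, max_streams=32):
    """Estimate the parallelism level with the highest predicted throughput."""
    a, b, _ = fit_quadratic(points)
    if a >= 0:
        # The fitted curve has no interior maximum; use the best sample instead.
        return max(points, key=lambda p: p[1])[0]
    peak = round(-b / (2 * a))
    return max(1, min(max_streams, peak))
```

Only three short sample transfers are needed before the prediction is made, which is the appeal of a sampling approach over an exhaustive search of parallelism levels.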


Algorithms
✓ Single/Multi Chunk Concurrency
  - Newest algorithms, designed for multi-file transfers with mixed sizes
  - Partition file sets into “chunks” based on file size and transfer concurrently on multiple channels
  - Each channel is configured heuristically
  - SCC: transfer chunks one at a time, split across all concurrent channels
  - MCC: each chunk gets a dedicated channel; reduces effect of small files
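The difference between SCC and MCC can be sketched as two ways of scheduling the same chunks. The size boundaries, the round-robin spreading, and the function names below are assumptions made for the example; the slides do not give the partitioning thresholds or the heuristics used to configure each channel.

```python
def partition_into_chunks(files, boundaries):
    """Group (name, size) pairs into chunks by ascending size boundaries (bytes)."""
    chunks = [[] for _ in range(len(boundaries) + 1)]
    for name, size in files:
        # The chunk index is the number of boundaries the file size reaches.
        index = sum(size >= limit for limit in boundaries)
        chunks[index].append((name, size))
    return [chunk for chunk in chunks if chunk]


def scc_schedule(chunks, channels):
    """SCC: one chunk at a time, its files spread over all channels."""
    schedule = []
    for chunk in chunks:
        lanes = [[] for _ in range(channels)]
        for position, (name, _) in enumerate(chunk):
            lanes[position % channels].append(name)
        schedule.append(lanes)   # one step per chunk, run in order
    return schedule


def mcc_schedule(chunks):
    """MCC: all chunks run at once, each on its own dedicated channel."""
    return [[name for name, _ in chunk] for chunk in chunks]
```

In the MCC layout the small-file chunk occupies only its own channel, so large files on the other channels are not held up behind many small transfers.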


SCC/MCC Performance
[Figure: throughput (Mbps) versus concurrency (1, 2, 4, 6, 8, 10) in two panels, (a) XSEDE and (b) LONI. Legend entries: Globus-Online, SCC, MCC, SCC_buffer, MCC_buffer.]


[Figure: Effects of Parameters]


Future Work
✓ Additional scheduling algorithms and priority
✓ Date-based scheduling and routine jobs
✓ Direct file upload/download through client interfaces - currently only 3rd party transfers
✓ SSL/TLS for secure communications (HTTPS)
✓ Additional protocols (SFTP, HTTP, AFP, etc.)
✓ Temporary file parking - storage on server in case of destination issues
✓ Historical performance database


The End
Thank you! Any questions?
