CMTS Cyberinfrastructure - University of Chicago


CMTS Cyberinfrastructure

Michael Wilde - [email protected]

http://swift-lang.org


Cyberinfrastructure Challenges

•  Expressing multi-scale workflow protocols
•  Leveraging and expressing parallelism
•  Moving, processing and managing data
•  Traversing networks and security
•  Diverse environments: schedulers, storage, ...
•  Handling system failures
•  Tracking what was done: where, when, how, who?

[Figure: resources range from campus-wide clusters and archival storage to departmental clusters, storage, and personal workstations.]

Cyberinfrastructure Tasks

§  Enable ease of use of multiple distributed large-scale systems
§  Reduce the effort of writing parallel simulation and analysis scripts
§  Record and organize the provenance of computations
§  Annotate datasets to facilitate sharing, collaboration, and validation
§  Reduce the cost of modeling, in both wall-clock time and computer time.

Swift Parallel Scripting Language: A Core Component of CMTS Cyberinfrastructure

•  Swift is a parallel scripting language
•  Composes applications linked by files
•  Captures protocols and processes as a CMTS library
•  Easy to write: a simple, high-level language
   –  Small Swift scripts can do large-scale work
•  Easy to run: on clusters, clouds, supercomputers and grids
   –  Sends work to XSEDE, campus, Amazon, OSG, and Cray resources
•  Fast and highly parallel
   –  Runs a million tasks on thousands of cores, at hundreds of tasks per second
•  Automates solutions to four hard problems
   –  Implicit parallelism
   –  Transparent execution location
   –  Automated failure recovery
   –  Provenance tracking

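As a flavor of how small a Swift script can be, here is a minimal sketch in the style of the Swift "hello world" tutorial examples (the function name and output file name are illustrative):

    type file;

    // Wrap an ordinary command-line program as a Swift app() function
    app (file o) greeting ()
    {
        echo "Hello from CMTS" stdout=@o;
    }

    file out <"hello.txt">;
    out = greeting();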

Swift's runtime system supports and aggregates diverse, distributed execution environments.

Using Swift in the Larger Cyberinfrastructure Landscape

[Figure: a Swift script runs on a submit host (login node, laptop, or Linux server) with access to local data; it moves data to and from data servers and dispatches application programs to campus systems and other remote resources.]

Expressing CMTS Algorithms in Swift

Translating from a shell script of 1300 lines into code like this:

    (…) EnergyLandscape (…)
    {
        loadPDB (pdbcode="3lzt");
        createSystem (LayerOfWater=7.5, IonicStrength=0.15);
        preequilibrate (waterEquilibration_ps=30, totalEquilibration_ps=1000);
        runMD (config="EL1", tinitial=0, tfinal=10, frame_write_interval=1000);
        postprocess (…);
        forcematching (…);
    }

…to create a well-structured library of scripts that automate workflow-level parallelism.
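Each step in such a protocol can be encapsulated as a Swift app() function that wraps the underlying simulation code. A hypothetical sketch for runMD, assuming an MD engine invoked like NAMD and illustrative type names (not the actual CMTS library code):

    type NAMDConfig;
    type Trajectory;
    type LogFile;

    // Hypothetical wrapper: run the MD engine on a prepared configuration file,
    // capturing standard output as the run log; the trajectory file written by
    // the engine is mapped to the trj output object
    app (Trajectory trj, LogFile log) runMD (NAMDConfig conf)
    {
        namd2 @conf stdout=@log;
    }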

CMTS Cyberinfrastructure Architecture

The architecture links researchers and external collaborators, the CMTS collaboration catalogs (provenance, files & metadata, script libraries), storage locations, and compute facilities. The numbered steps in the diagram:

    0:  Develop script
    1:  Run script (EL1.trj)
    2:  Look up file (name=EL1.trj, user=Anton, type=trajectory) in the catalogs
    3:  Transfer inputs from storage locations
    4:  Run the application at the compute facilities
    5:  Transfer results back
    6:  Update the catalogs

•  A challenging "breakthrough" project
•  Integration & evaluation underway
•  Full implementation by the end of Phase II

When do you need Swift? Typical application: protein-ligand docking for drug screening


Work of M. Kubal, T. A. Binkowski, and B. Roux

[Figure: O(10) proteins implicated in a disease × O(100K) drug candidates → 1M compute jobs → tens of fruitful candidates for the wet lab & APS.]

Numerous many-task applications


§  Simulation of super-cooled glass materials
§  Protein folding using homology-free approaches
§  Climate model analysis and decision making in energy policy
§  Simulation of RNA-protein interaction
§  Multiscale subsurface flow modeling
§  Modeling of the power grid for OE applications
§  A–E have published science results obtained using Swift

[Figure: panels A–F illustrating these applications. Caption: protein loop modeling, courtesy A. Adhikari; T0623, 25 residues, 8.2 Å to 6.3 Å (excluding tail); native, initial, and predicted structures.]

Nested parallel prediction loops in Swift


    Sweep()
    {
        int nSim = 1000;
        int maxRounds = 3;
        Protein pSet[] <ext; exec="Protein.map">;
        float startTemp[] = [ 100.0, 200.0 ];
        float delT[] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];
        foreach p, pn in pSet {
            foreach t in startTemp {
                foreach d in delT {
                    ItFix(p, nSim, maxRounds, t, d);
                }
            }
        }
    }
    Sweep();

10 proteins × 1000 simulations × 3 rounds × 2 temperatures × 5 deltas = 300K tasks
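ItFix() itself is not shown on the slide; a hypothetical sketch of what such a compound function might look like, reusing the predict() app function defined later in this deck (output mapping and the use of maxRounds are elided for brevity):

    // Hypothetical: run nSim independent predictions for one protein
    ItFix (Protein p, int nSim, int maxRounds, float temp, float dT)
    {
        foreach sim in [1:nSim] {
            PDB structure;
            File log;
            (structure, log) = predict(p, temp, dT);
        }
    }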

Programming model: all execution driven by parallel data flow

§  f() and g() are computed in parallel
§  myproc() returns r when they are done
§  This parallelism is automatic
§  Works recursively throughout the program's call graph


    (int r) myproc (int i)
    {
        j = f(i);
        k = g(i);
        r = j + k;
    }
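To see why this applies recursively throughout the call graph: if myproc() is itself invoked inside a loop, every invocation, and the f() and g() calls within each one, may run concurrently, throttled by the runtime. A minimal sketch (the array name and loop bound are illustrative):

    int results[];
    foreach i in [1:100] {
        results[i] = myproc(i);    // all 100 calls may execute in parallel
    }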

Programming model: Swift in a nutshell

§  All expressions are computed in parallel, throttled by various levels of the runtime system
§  Simple data type model: scalars (boolean, int, float, string, file, external) and collections (arrays and structs)
§  ALL data atoms and collections have "future" behavior
§  A few simple control constructs
   –  if, switch, foreach, iterate
§  A growing library
   –  tracef(), strcat(), regexp(), readData(), writeData()
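A small sketch exercising several of these constructs together; the input file name and its contents are illustrative:

    file inp <"params.txt">;          // mapped input file, one integer per line
    int params[] = readData(inp);     // library call: parse the file into an array
    foreach v, i in params {
        if (v > 0) {
            tracef("param %i = %i\n", i, v);
        }
    }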


Swift 2.0

§  Motivation for 2.0
   –  Scalability: 1M tasks/sec vs. 500 – goal has been reached for basic tests
   –  Richer programming model, broader spectrum of applications
   –  Extensibility and maintainability
§  Convergence issues
   –  Some array closing; for-loop extensions; library cleanup; mapper models
   –  Data marshalling/passing for in-memory leaf functions
§  Information and downloads
   –  http://exm.xstack.org


Encapsulation enables distributed parallelism


[Figure: a Swift app() function interface definition wraps an application program; typed Swift data objects correspond to the files expected or produced by the application program.]

Encapsulation is the key to transparent distribution, parallelization, and automatic provenance capture.

app() functions specify command-line argument passing


Swift app function "predict()": it wraps the PSim application, which takes command-line flags -c, -s (a FASTA sequence file), -t, and -d, and produces a PDB file (via -pdb) and a log (via stdout redirection).

To run:

    psim -s 1ubq.fas -pdb p -t 100.0 -d 25.0 >log

In Swift code:

    app (PDB pg, File log) predict (Protein seq, Float t, Float dt)
    {
        psim "-c" "-s" @seq.fasta "-pdb" @pg
             "-t" t "-d" dt;
    }

    Protein p <ext; exec="Pmap", id="1ubq">;
    PDB structure;
    File log;

    (structure, log) = predict(p, 100., 25.);

Large scale parallelization with simple loops

    foreach sim in [1:1000] {
        (structure[sim], log[sim]) = predict(p, 100., 25.);
    }
    result = analyze(structure);

[Figure: 1000 runs of the "predict" application (e.g., targets T1af7, T1r69, T1b72) feed into a single analyze() step.]


Dataset mapping example: fMRI datasets


    type Study {
        Group g[];
    }
    type Group {
        Subject s[];
    }
    type Subject {
        Volume anat;
        Run run[];
    }
    type Run {
        Volume v[];
    }
    type Volume {
        Image img;
        Header hdr;
    }

[Figure: a mapping function or script connects the on-disk data layout to Swift's in-memory data model.]
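Declaring a variable of type Study with a mapper then populates this nested structure from files on disk. A hypothetical sketch using the external-mapper syntax seen elsewhere in this deck (the mapper script name is illustrative):

    // Hypothetical: an external mapper script enumerates the study's files
    Study s <ext; exec="study_mapper.sh">;

    foreach g in s.g {
        foreach subj in g.s {
            tracef("anatomy image: %s\n", @filename(subj.anat.img));
        }
    }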

Metadata


    tag /d1/d2/f3   owner=asinitskiy  group=cmts-chem       \
                    create-date=2013.0415                   \
                    type=trajectory-namd                    \
                    state=published                         \
                    note="trajectory for UCG case 1"        \
                    molecule=1ubq domain=loop7, bead=u2

    tag /d1/d2/f0   owner=jdama  group=cmts-chem            \
                    create-date=2013.0412                   \
                    type=trajectory-namd                    \
                    state=validated                         \
                    note="trajectory for UCG case 2"        \
                    molecule=clathrin, domain=loop1, bead=px

Metadata


    Query ==> (owner=asinitskiy) and (molecule=1ubq)

     dataset_id | name        | value
    ------------+-------------+------------------------
     /d1/d2/f4  | owner       | asinitskiy
                | group       | cmts-chem
                | create-date | 2013.0416.12:20:29.456
                | type        | trajectory-namd
                | state       | published
                | note        | trajectory for case 1
                | molecule    | 2ubq
                | domain      | loop7
                | bead        | dt

Swift information

§  Swift Web: http://www.ci.uchicago.edu/swift or, newer: http://swift-lang.org
§  Swift User Guide: http://swift-lang.org/guides/trunk/userguide/userguide.html


Running the CMTS "cyber" tutorials

    # pick up latest changes if instructed:
    cd /project/cmtsworkshop
    cp -rp tutorial/cyber $USER/cyber

    cd $USER/cyber
    source ./setup.sh

    cd basic_swift
    cat README

    cd part01
    cat README
    swift p1.swift
    # etc etc etc

    # start with NAMD
    cd cyber/namd_sweep
    cat README
    source setup.sh        # <== Don't forget this step!!!
    swift rmsd.swift       # etc etc etc

Summary of the CMTS Cyberinfrastructure Approach

§  Streamlining scientist / computer interaction:
   –  Large datasets (e.g. MD trajectories), fast distributed storage
   –  Chemically important metadata: force field, bound ions, …
   –  Save time locating, accessing, moving, and sharing data
   –  Scheduling, data management, authentication, site-specific dependencies
   –  Automatic recovery of failed computations
   –  Save time by load balancing, leveraging more parallel resources, …
   –  Benefit: chemists & biologists can focus on science and less on the mechanics of computation.

§  Facilitating scientific collaboration:
   –  Understandable code, flexibility for future development
   –  Find provenance and tag data and procedures for reuse and sharing
   –  Facilitate collaboration by sharing data, libraries, protocols
   –  Wide range of researchers able to use the environment
   –  Benefit: cyberinfrastructure enables collaborations not otherwise possible

§  Swift is a parallel scripting system for grids, clouds and clusters
   –  for loosely-coupled applications: application and utility programs linked by exchanging files
§  Swift is easy to write: a simple, high-level, C-like functional language
   –  Small Swift scripts can do large-scale work
§  Swift is easy to run: contains all services for running Grid workflow in one Java application
   –  Untar and run – acts as a self-contained Grid client
§  Swift is fast: uses the efficient, scalable and flexible "Karajan" execution engine
   –  Scaling close to 1M tasks – 0.5M in live science work, and growing
§  Swift usage is growing:
   –  applications in neuroscience, proteomics, molecular dynamics, biochemistry, economics, statistics, and more
§  Try Swift!  www.ci.uchicago.edu/swift and www.mcs.anl.gov/exm



Swift: A language for distributed parallel scripting

Michael Wilde a,b,*, Mihael Hategan a, Justin M. Wozniak b, Ben Clifford d, Daniel S. Katz a, Ian Foster a,b,c

a Computation Institute, University of Chicago and Argonne National Laboratory, United States
b Mathematics and Computer Science Division, Argonne National Laboratory, United States
c Department of Computer Science, University of Chicago, United States
d Department of Astronomy and Astrophysics, University of Chicago, United States

Article history: Available online 12 July 2011
Keywords: Swift; Parallel programming; Scripting; Dataflow

Abstract

Scientists, engineers, and statisticians must execute domain-specific application programs many times on large collections of file-based data. This activity requires complex orchestration and data management as data is passed to, from, and among application invocations. Distributed and parallel computing resources can accelerate such processing, but their use further increases programming complexity. The Swift parallel scripting language reduces these complexities by making file system structures accessible via language constructs and by allowing ordinary application programs to be composed into powerful parallel scripts that can efficiently utilize parallel and distributed resources. We present Swift's implicitly parallel and deterministic programming model, which applies external applications to file collections using a functional style that abstracts and simplifies distributed parallel execution.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Swift is a scripting language designed for composing application programs into parallel applications that can be executed on multicore processors, clusters, grids, clouds, and supercomputers. Unlike most other scripting languages, Swift focuses on the issues that arise from the concurrent execution, composition, and coordination of many independent (and, typically, distributed) computational tasks. Swift scripts express the execution of programs that consume and produce file-resident datasets. Swift uses a C-like syntax consisting of function definitions and expressions, with dataflow-driven semantics and implicit parallelism. To facilitate the writing of scripts that operate on files, Swift mapping constructs allow file system objects to be accessed via Swift variables.

Many parallel applications involve a single message-passing parallel program: a model supported well by the Message Passing Interface (MPI). Others, however, require the coupling or orchestration of large numbers of application invocations: either many invocations of the same program or many invocations of sequences and patterns of several programs. Scaling up requires the distribution of such workloads among cores, processors, computers, or clusters and, hence, the use of parallel or grid computing. Even if a single large parallel cluster suffices, users will not always have access to the same system (e.g., big machines may be congested or temporarily unavailable because of maintenance). Thus, it is desirable to be able to use whatever resources happen to be available or economical at the moment when the user needs to compute, without the need to continually reprogram or adjust execution scripts.

doi:10.1016/j.parco.2011.05.005

* Corresponding author at: Computation Institute, University of Chicago and Argonne National Laboratory, United States. Tel.: +1 630 252 3351. E-mail address: [email protected] (M. Wilde).

Parallel Computing 37 (2011) 633–652


Parallel Computing, Sep 2011