![Page 1: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/1.jpg)
Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion
University of Virginia: Anand Natrajan, Marty A. Humphrey, Anthony D. Fox, Andrew S. Grimshaw
Scripps (TSRI): Michael Crowley, Charles L. Brooks III
SDSC: Nancy Wilkins-Diehr
http://legion.virginia.edu
![Page 2: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/2.jpg)
Outline
• CHARMM
– Issues
• Legion
• The Run
– Results
– Lessons
• Portals
• Summary
![Page 3: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/3.jpg)
CHARMM
• Routine exploration of folding landscapes helps in search for protein folding solution
• Understanding folding critical to structural genomics, biophysics, drug design, etc.
• Key to understanding cell malfunctions in Alzheimer’s, cystic fibrosis, etc.
• CHARMM and Amber benefit majority (>80%) of bio-molecular scientists
• Structural genomic & protein structure predictions
![Page 4: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/4.jpg)
Folding Free Energy Landscape
Molecular Dynamics Simulations
100-200 structures to sample (r, Rgyr) space
[Figure: folding free energy landscape; axis label: Rgyr]
![Page 5: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/5.jpg)
Folding of Protein L
• Immunoglobulin-binding protein
– 62 residues (small), 585 atoms
– 6500 water molecules, total 20085 atoms
– Each parameter point requires O(10^6) dynamics steps
– Typical folding surfaces require 100-200 sampling runs
• CHARMM using most accurate physics available for classical molecular dynamics simulation
• Multiple 16-way parallel runs - maximum efficiency
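The system size quoted on this slide can be checked with simple arithmetic. The sketch below assumes three atoms per water molecule (the slide does not name the water model, so this is an assumption); under that assumption the stated total is reproduced.

```python
PROTEIN_ATOMS = 585        # from the slide
WATER_MOLECULES = 6500     # from the slide
ATOMS_PER_WATER = 3        # assumption: three-site water model


def total_atoms(protein_atoms, water_molecules, atoms_per_water=ATOMS_PER_WATER):
    """Total atoms in the solvated system: protein plus explicit water."""
    return protein_atoms + water_molecules * atoms_per_water


print(total_atoms(PROTEIN_ATOMS, WATER_MOLECULES))  # 20085
```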
![Page 6: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/6.jpg)
Application Characteristics
• Parameter-space study
– Parameters correspond to structures along & near folding path
• Path unknown - could be many or broad
– Many places along path sampled for determining local low free energy states
– Path is valley of lowest free energy states from high free energy state of unfolded protein to lowest free energy state (folded native protein)
![Page 7: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/7.jpg)
Application Characteristics
• Many independent runs
– 200 sets of data to be simulated in two sequential runs
• Equilibration (4-8 hours)
• Production/sampling (8 to 16 hours)
• Each point has task name, e.g., pl_1_2_1_e
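To give a feel for the size of this workload, the figures above can be turned into a rough budget. This is only a back-of-the-envelope sketch: it assumes the quoted hours are wall-clock time per run and that every run is 16-way parallel (as stated on the Protein L slide). The `Phase` class and `budget` function are illustrative names, not part of CHARMM or Legion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    min_hours: float
    max_hours: float


# Durations taken from the slide; treated here as wall-clock hours per run.
PHASES = [Phase("equilibration", 4, 8), Phase("production", 8, 16)]
N_POINTS = 200        # sets of data to simulate
PROCS_PER_RUN = 16    # assumption: every run is 16-way parallel


def budget(n_points, phases, procs_per_run):
    """Return job count plus (low, high) ranges for run-hours and processor-hours."""
    low = n_points * sum(p.min_hours for p in phases)
    high = n_points * sum(p.max_hours for p in phases)
    return {
        "jobs": n_points * len(phases),
        "run_hours": (low, high),
        "processor_hours": (low * procs_per_run, high * procs_per_run),
    }


print(budget(N_POINTS, PHASES, PROCS_PER_RUN))
```

Under these assumptions the study comes to 400 jobs and between 38,400 and 76,800 processor-hours, which is why spreading it across many sites is attractive.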
![Page 8: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/8.jpg)
Legion
Complete, Integrated Infrastructure for Secure Distributed Resource Sharing
![Page 9: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/9.jpg)
Grid OS Requirements
• Wide-area
• High Performance
• Complexity Management
• Extensibility
• Security
• Site Autonomy
• Input / Output
• Heterogeneity
• Fault-tolerance
• Scalability
• Simplicity
• Single Namespace
• Resource Management
• Platform Independence
• Multi-language
• Legacy Support
![Page 10: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/10.jpg)
Transparent System
![Page 11: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/11.jpg)
npacinet
![Page 12: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/12.jpg)
The Run
![Page 13: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/13.jpg)
Computational Issues
• Provide improved response time
• Access large set of resources transparently
– geographically distributed
– heterogeneous
– different organisations
6 organisations, 6 queue types, 10 queues, 6 architectures, ~1000 processors
![Page 14: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/14.jpg)
Resources Available
• IBM Blue Horizon, SDSC: 375 MHz Power3, 512/1184
• HP SuperDome, CalTech: 440 MHz PA-8700, 128/128
• IBM SP3, UMich: 375 MHz Power3, 24/24
• IBM Azure, UTexas: 160 MHz Power2, 32/64
• Sun HPC 10000, SDSC: 400 MHz SMP, 32/64
• DEC Alpha, UVa: 533 MHz EV56, 32/128
![Page 15: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/15.jpg)
Scientists Using Legion
• Binaries for each type
• Script for dispatching jobs
• Script for keeping track of results
• Script for running binary at site
– optional feature in Legion
• Abstract interface to resources
– queues, accounting, firewalls, etc.
• Binary transfer (with caching)
• Input file transfer
• Job submission
• Status reporting
• Output file transfer
![Page 16: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/16.jpg)
Mechanics of Runs
Legion
Register binaries
Create task directories & specification
Dispatch equilibration
Dispatch equilibration & production
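The ordering constraint in these steps (production for a point follows its equilibration) can be sketched in a few lines. This is not Legion's interface: `run_job` is a hypothetical callback standing in for whatever actually submits a job and reports success, and the `_p` suffix for production tasks is an assumption modelled on the `_e` suffix in the example task name `pl_1_2_1_e`.

```python
def make_tasks(points):
    """Pair each parameter point with its equilibration and production task names.

    The "_e" suffix follows the example task name on the slides; "_p" is assumed.
    """
    return [(f"{point}_e", f"{point}_p") for point in points]


def dispatch_all(points, run_job):
    """Run equilibration, then production, for every point.

    run_job(name) is a hypothetical callable returning True on success.
    Production is skipped for a point whose equilibration failed.
    """
    done, failed = [], []
    for equil, prod in make_tasks(points):
        if not run_job(equil):
            failed.append(equil)
            continue
        if run_job(prod):
            done.append(prod)
        else:
            failed.append(prod)
    return done, failed


# Usage with a stub that pretends every job succeeds.
done, failed = dispatch_all(["pl_1_2_1", "pl_1_2_2"], lambda name: True)
print(done, failed)
```

The sketch runs points one after another for clarity; since the points are independent, a real dispatcher would submit many of them concurrently and only serialise the two phases within each point.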
![Page 17: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/17.jpg)
Distribution of CHARMM Work
[Pie chart] Shares: 71%, 24%, 1%, 2%, 1%, 1%, 0%
Legend: SDSC IBM, CalTech HP, UTexas IBM, UVa DEC, SDSC Cray, SDSC Sun, UMich IBM
![Page 18: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/18.jpg)
Problems Encountered
• Network slowdowns
– Slowdown in the middle of the run
– 100% loss for packets of size ~8500 bytes
• Site failures
– LoadLeveler restarts
– NFS/AFS failures
• Legion
– No run-time failures
– Archival support lacking
– Must address binary differences
[Figure labels: LEGION, UVa, SDSC, UMich, 01101]
![Page 19: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/19.jpg)
Successes
• Science accomplished faster
– 1 month on 128 SGI Origins @Scripps
– 1.5 days on national grid with Legion
• Transparent access to resources
– User didn’t need to log on to different machines
– Minimal direct interaction with resources
• Problems identified
• Legion remained stable
– Other Legion users unaware of large runs
• Large grid application run at powerful resources by one person from local resource
• Collaboration between natural and computer scientists
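The "science accomplished faster" claim can be put as a rough turnaround ratio. The sketch below assumes a 30-day month, which the slide does not specify, so the result is only approximate.

```python
DAYS_PER_MONTH = 30  # assumption: the slide says only "1 month"


def speedup(baseline_days, grid_days):
    """Ratio of turnaround times: how many times faster the grid run finished."""
    return baseline_days / grid_days


print(speedup(1 * DAYS_PER_MONTH, 1.5))  # 20.0
```

This compares elapsed time only; it says nothing about per-processor efficiency, since the grid run drew on far more processors than the local machine.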
![Page 20: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/20.jpg)
Portal Interface
Easy Interface to Grid
![Page 21: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/21.jpg)
Legion GUIs
• Simple point-and-click interface to Grids
– Familiar access to distributed file system
– Enables & encourages sharing
• Application portal model for HPC
– AmberGrid
– RenderGrid
– Accounting
Transparent Access to Remote Resources
Intended Audience is Scientists
![Page 22: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/22.jpg)
Logging in to npacinet
![Page 23: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/23.jpg)
View of contexts (Distributed File System)
![Page 24: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/24.jpg)
Control Panel
![Page 25: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/25.jpg)
Running Amber
![Page 26: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/26.jpg)
Run Status (Legion)
Graphical View (Chime)
![Page 27: Studying Protein Folding on the Grid: Experiences Using CHARMM on NPACI Resources under Legion University of Virginia Anand Natrajan Marty A. Humphrey](https://reader036.vdocuments.mx/reader036/viewer/2022070411/56649f425503460f94c62369/html5/thumbnails/27.jpg)
Summary
• CHARMM Run
– Succeeded in starting big runs
– Encountered problems
– Learnt lessons for future
• AmberGrid
– Showed proof-of-concept - grid portal
– Need to resolve licence issues