TeraGrid Science Gateways

Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways, [email protected]
TeraGrid Round Table, October 7, 2010


Page 1: TeraGrid Science Gateways

TeraGrid Science Gateways

Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways
[email protected]

TeraGrid Round Table, October 7, 2010

Page 2: TeraGrid Science Gateways

What have the gateways been up to?
•Ultrascan
  –Borries Demeler, UT; Suresh Marru, Raminder Singh, IU
•Gateway software listing
•Wrap up of support for Arroyo, RENCI science portal
  –But hopefully not the end of TG usage by those groups
•Dark Energy Survey
  –Jim Myers, Michelle Gower, NCSA
•CUE presentation
  –Derek Simmel, PSC
•login-env, build, comm, math, tg

Page 3: TeraGrid Science Gateways

•GRAM5
  –Making use of TG portal user forum for discussion
  –Interest in sharing experiences with OSG
  –Update on Inca tests (able to recreate load from “Gateway Debug 2007”)
  –Gateway experiences: hung processes when errors pile up
  –SGE job manager issues
  –Nice work by David Carver (TACC), Suresh Marru (IU), Stu Martin (ANL)
•Expressed Sequence Tag gateway
  –Archit Kulshrestha, IU
•CIPRES
  –Over 600 users on TG Apr-June
  –2.7M hours awarded 7/1/10, “model gateway proposal”
    •But able to use much more than this
•Gateways in the extension year
•Gateway study

Page 4: TeraGrid Science Gateways

Analytical Ultracentrifugation
Emerging computational tool for the study of proteins

•Samples from researchers all over the world
  –Some (Germany, Australia) have their own ultracentrifuges and use only the analysis capabilities, others send samples to UT to spin
•Spin the samples at high speeds, learn about macromolecule properties
•Monte Carlo simulations
•Observations are electronically digitized and stored for further mathematical analysis

The Center for Analytical Ultracentrifugation of Macromolecular Assemblies, UT Health Sciences

Source: Suresh Marru, IU

Page 5: TeraGrid Science Gateways

Comprehensive data analysis environment
•Management of analytical ultracentrifugation data for single users or entire facilities
•Support for storage, editing, sharing and analysis of data
  –HPC facilities used for 2-D spectrum analysis and genetic algorithm analysis
    •TeraGrid (~2M CPU hours used)
    •Technical University of Munich
    •Juelich Supercomputing Center
•Portable graphical user interface
•MySQL database backend for data management
•Over 30 active institutions

Source: Suresh Marru, IU

Page 6: TeraGrid Science Gateways

Gateway and ASTA support: a growing trend
•TeraGrid advanced support
  –Fault tolerance
  –Workflows
  –Use of multiple TG resources (using Lonestar, expanding to QueenBee and Ranger, using Quarry for test server, waiting for GRAM5 on Ranger)
  –Community account implementation
  –Remote steering
  –Improved UI (no manual specification of CPU time)
  –Applying lessons learned from GridChem, LEAD, incorporating new features into OGCE
    •LEAD is portlet-based, GridChem is a Java Swing client-side app, Ultrascan is a PHP- and Perl-based gateway; all can use OGCE
•Big MPI app that forks off many independent runs; improvements here will be tackled by TG's advanced support team

Source: Suresh Marru, IU

Page 7: TeraGrid Science Gateways

Gateway software listing
•Populate TeraGrid’s information service with gateway software information
  –Similar to RP software listings
    •But, RP listings are maintained at RPs, IIS pulls from those sources
    •With gateways we are thinking they fill in a form and push the info to IIS
•http://www.renci.org/~jdr0887/gawsr-howto/
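A minimal sketch of the push model described in the gateway software listing slide, assuming a gateway serializes its software list to JSON before sending it to the information service. The slides do not specify the form's fields or the IIS interface, so the `GatewaySoftwareEntry` class, its field names and the example values are all hypothetical, and no network call is shown.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class GatewaySoftwareEntry:
    """One row of a gateway's software listing (hypothetical fields)."""
    gateway: str    # name of the science gateway
    package: str    # software package offered through the gateway
    version: str    # version of that package
    resource: str   # resource on which the package runs


def build_payload(entries: List[GatewaySoftwareEntry]) -> str:
    """Serialize the listing into a JSON document the gateway could push."""
    return json.dumps({"entries": [asdict(e) for e in entries]}, sort_keys=True)


# Placeholder values only; these are not real listings.
listing = [
    GatewaySoftwareEntry("example-gateway", "example-app", "1.0", "example-resource"),
]
payload = build_payload(listing)
```

In a pull model the information service would fetch such a document from each provider; in the push model sketched here the gateway builds it and sends it itself.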

Page 8: TeraGrid Science Gateways

Dark Energy Survey
•Know universe is expanding, but expansion is accelerating for unknown reasons
•DES is a telescope experiment to constrain various theories
  –4m telescope in Chile, Fermi and others developing new lens, working with simulated data until telescope goes online in 2011
•200 TB raw data over 5 years, 4 PB of derived products
  –Lots of filtering
•Thousands of jobs run on TeraGrid each week with very few failures
•Removing light from bright stars, airplanes, clouds, calibration
  –Telescope operated by staff, users will use the portal to do queries for particular stars/regions of the sky afterward

Source: Jim Myers and Michelle Gower, NCSA

Page 9: TeraGrid Science Gateways

•Condor DAGMan, Condor-G, pre-WS GRAM, GridFTP, ELF/OgreScript for monitoring (developed at NCSA), Oracle
•Challenges
  –Efficiently managing small jobs in big batch world
    •Database stresses: block updates instead of individual transactions for better performance, indexing strategies, narrow vs wide tables
•~100 front end users, expected to grow in production
  –Changing paradigms from Sloan Digital Sky Survey: data now too large for bulk downloads and full table scans

Source: Jim Myers and Michelle Gower, NCSA
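The "block updates instead of individual transactions" point can be illustrated with a small sketch. The slide names Oracle as the database; Python's built-in `sqlite3` module is used here only as a stand-in, and the `objects` table, its columns and the block size are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory table standing in for a catalog of detected objects."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, flux REAL)")
    return conn


def insert_individually(conn, rows):
    """One transaction per row: every insert pays the commit overhead."""
    for row in rows:
        conn.execute("INSERT INTO objects (id, flux) VALUES (?, ?)", row)
        conn.commit()


def insert_in_blocks(conn, rows, block_size=1000):
    """One transaction per block of rows: the commit overhead is amortized."""
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        conn.executemany("INSERT INTO objects (id, flux) VALUES (?, ?)", block)
        conn.commit()


def count_rows(conn):
    """Return the number of rows currently stored in the table."""
    return conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
```

Both functions store the same rows; the block version issues one commit per block rather than one per row, which is the kind of change the slide credits with better performance.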

Page 10: TeraGrid Science Gateways

Expressed Sequence Tag (EST) Pipeline
•Integrate existing computational biology software
•Expand compute capacity by using TeraGrid
•Take raw genome data in the FASTA format and run a series of applications on it
  –RepeatMasker, PaCE, CAP3 and BLAST used to generate the final assembled output
•EST Pipeline based on the SWARM Web Service that provides a web service interface to clients and also manages the bulk job submission using the Birdbath API to submit to Condor
•Workflow is configured using a PHP-based gateway that allows users to upload input data and select programs to run

Source: Archit Kulshrestha, IU

Page 11: TeraGrid Science Gateways

Expressed Sequence Tag Assembly

•ESTs are a collection of random cDNA sequences, sequenced from a cDNA library or sequencing devices.
  –Typical inputs are of the order of millions of sequences
  –Newer 454 devices produce higher volume and are relatively easier to obtain and operate
  –Stored in a file using the FASTA format
•The ESTs are clustered and assembled to form contigs.
•The contigs are then used to identify potential unknown genes, by BLASTing against a known protein database.

Application    Purpose
RepeatMasker   Cleaning sequences
PaCE           Clustering
CAP3           Assembly
BLAST          Identification

Source: Archit Kulshrestha, IU

Page 12: TeraGrid Science Gateways

Application Runtime Characteristics

RepeatMasker
•Serial execution on split input
•E.g. 1000 for 2 million

PaCE
•MPI, runtime of several hours
•Exponential growth in time with growth in input data. Increasing number of procs works quite well

CAP3
•Serial runs on clusters generated by PaCE; clusters can be combined
•Varied sizes with varied resource requirements (run times of milliseconds to days)

BLAST
•Serial, takes CAP3 results. Number of jobs controlled by adjusting number of sequences per job.

Source: Archit Kulshrestha, IU
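Splitting a FASTA input into serial jobs, as described for RepeatMasker and BLAST, might look like the sketch below. It is not the pipeline's actual code: the function names are hypothetical, and the reading of "1000 for 2 million" as 1000 pieces for 2 million sequences (2000 sequences per piece) is an assumption.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_into_jobs(records: List[Record], seqs_per_job: int) -> List[List[Record]]:
    """Group records into jobs; fewer sequences per job means more jobs."""
    if seqs_per_job < 1:
        raise ValueError("seqs_per_job must be at least 1")
    return [records[i:i + seqs_per_job]
            for i in range(0, len(records), seqs_per_job)]
```

The `seqs_per_job` parameter plays the role the BLAST bullet describes: adjusting the number of sequences per job controls how many jobs are submitted.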

Page 13: TeraGrid Science Gateways

Results

Program        No. of Jobs   Wait time + Run time
RepeatMasker   1000          11:56
PaCE           1             01:22
CAP3           4073          25:44
BLAST          893           49:00

The results are from a single run on 2 million sequences and hence may not be an accurate model of the wait time. However, other than in the case of BLAST, the wait times were not a significant component of the total time.

Long waits due to long queue times for small jobs.

Previous run time: 5 days, compared to 2 now. Serial waits eliminated.

Had hooks to Inca to determine when jobs were down.

Failure rate quite low: 10-12 out of thousands.

Source: Archit Kulshrestha, IU

Page 14: TeraGrid Science Gateways

Cyberinfrastructure for Phylogenetic Research (CIPRES)

•Enables large-scale phylogenetic reconstructions
•Parallel versions of applications such as MrBayes, RAxML and GARLI run on TeraGrid
•Easy to use graphical user interface

Page 15: TeraGrid Science Gateways

Current Status:

CIPRES Portal users consumed 1,200,000 TeraGrid CPU hours between Dec 2009 and June 2010. This was 3 times our projected use.

A new award of 2.7 million CPU hours was made on July 1, 2010.

The portal provides access to parallel versions of MrBayes, RAxML, and GARLI, which all scale well on TG resources. The portal staff has worked with TG special projects group personnel and community developers to provide access to the fastest versions of MrBayes and RAxML available anywhere.

Access to BEST, a variant of MrBayes, is planned in the near future. A GPU platform called BEAGLE will be used to provide access to BEAST on TeraGrid (Lincoln), also in the near future.

The toolkit will be expanded to provide access to other community codes that are appropriate for use on TeraGrid.

Source: Mark Miller, SDSC

Page 16: TeraGrid Science Gateways

Usage Statistics for CIPRES Portal on TG 12/1/2009 – 5/31/2010

[Figure: monthly bar charts, December through May: SUs consumed (axis ticks at 2e5 and 4e5) and jobs submitted (axis ticks at 1e3, 2e3 and 3e3).]

Source: Mark Miller, SDSC

Page 17: TeraGrid Science Gateways

Intellectual Merit:

• The CIPRES portal is cited in at least 35 publications.

• This includes publications in Nature, PNAS, and Cell.

• Highlights of scientific findings:

New Family Tree for Arthropoda: A team of scientists compared genetic sequences from 75 arthropod species and drew a new family tree for the most successful phylum of animals on Earth. This work represents an important advance in the century-old problem of arthropod evolution.

Genome Sequence of a Transitional Eukaryote: A group of scientists sequenced the genome of Naegleria gruberi, a single-cell organism that is a key transitional species between prokaryotes and eukaryotes. This work provides new insights into the origins of subcellular organelles.

Co-evolution of Beetles and Flowering Plants: A group of researchers studied the evolutionary history of angiosperms and the beetles that interact with them. The work provided compelling experimental evidence for the long-postulated co-evolution of these two symbiotic groups.

Source: Mark Miller, SDSC


Page 18: TeraGrid Science Gateways

Broad Impacts:

• 77% of all jobs have been submitted from locations in the USA. Submissions are received regularly from researchers at top-tier institutions such as Harvard, Yale, and Stanford.

• Jobs are received regularly from academic institutions in 17 EPSCoR states.

• Job submissions have been received from 34 countries on 5 continents.

• At least 5 undergraduate classes are known to use the portal routinely. This is likely an underestimate (based on Web log patterns).

• More than 45,000 jobs have been run on the Portal over its lifetime. Between Dec 1, 2009 and June 30, 2010, users ran 6,108 parallel jobs on the TeraGrid.

Source: Mark Miller, SDSC


Page 19: TeraGrid Science Gateways

Broad Impacts:

Impacts on Productivity:

Average wall time for RAxML and GARLI jobs decreased 3-4 fold with the shift to TeraGrid resources.

Moreover, the number of RAxML jobs has doubled relative to the rate of submission on the CIPRES Portal running on the CIPRES cluster alone.

Thus, TeraGrid access is helping users finish their jobs faster and also to make more runs per unit time.

The average wall time for MrBayes jobs increased 2-fold on the TeraGrid, but the number of jobs decreased by approximately 33%. This trend reflects users’ ability to run much larger and longer jobs on TeraGrid than on the CIPRES cluster. The increased maximum run-time limit for MrBayes submissions to Abe (168 hours on Abe vs. 72 hours on the CIPRES cluster) allowed users to complete their long runs with a single large submission, thus eliminating the need to make smaller, incremental runs.

Source: Mark Miller, SDSC


Page 20: TeraGrid Science Gateways

Broad Impacts:

Improved User Access to TG: 100-150 new users per month access TG resources; the number of repeat users is growing.

[Figure: two monthly charts, December through May: total users (axis from 100 to 500) and number of users (axis from 50 to 200), split into repeat users and new users. May is a partial month (18 days); the error bar projects the full month.]

Source: Mark Miller, SDSC

Page 21: TeraGrid Science Gateways

New gateway activities in the extension year
•Helpdesk support expanded
  –From 0.2 FTE in PY5 to 1.7 in Extension [NCSA, Purdue]
    •Helpdesk and Condor support, new GIS communities, SimpleGrid extensions
•Accounting
  –Improved views for gateways now that we have attributes [TACC]
•Community accounts
  –Continued work toward improved standardization [NICS]
•Prebuilt VMs with gateway software
  –OGCE, SimpleGrid [IU, NCSA]
•Online tutorials with CI Tutor and the EOT team
  –OGCE, SimpleGrid [IU, NCSA]
•More example-based documentation
  –Less talk, more action, short videos, based on user feedback [NCSA, SDSC]
•Remote vis for gateways [ORNL]

Page 22: TeraGrid Science Gateways

Targeted Support in the Extension
All staff available for assignments as new projects come in
•Cactus
  –Meet the needs of several groups with large TG allocations [LSU]
•GridChem, PolarGrid, Ultrascan
  –Scheduling, vis, Matlab processing, processing of centrifuge data for large international project [IU]
•CCSM-ESG
  –Continuing work to combine capabilities [NCAR, Purdue]
•Uintah, computational fluids [NCAR, Utah]
•SNS [ORNL]
•CIPRES [SDSC]
•OpenSocial for gateways [U Chicago]
•Improved use of remote vis resources [ORNL]
•Condor and cloud support [Purdue]

Page 23: TeraGrid Science Gateways

Gateway Sustainability Study
Small, non-TG, EAGER grant
•Characteristics of short funding cycles
  –Build exciting prototypes with input from scientists
  –Work with early adopters to extend capabilities
  –Tools are publicized, more scientists interested
  –Funding ends
  –Scientists who invested their time to use new tools are disillusioned
    •Less likely to try something new again
  –Start again on new short-term project
•Need to break this cycle
•EAGER grant to look at characteristics of successful gateways and domain areas where a gateway could have a big impact

4 focus group meetings over 2 years
First 2 held June 2010

www.sciencegateways.org

Page 24: TeraGrid Science Gateways

Thank you for your attention!
Questions?

Nancy Wilkins-Diehr, [email protected]