OGCE TeraGrid 2010 Science Gateway Tutorial Intro

CloudCom 2010

Upload: marpierc

Post on 11-May-2015


DESCRIPTION

OGCE TeraGrid 2010 Science Gateway Tutorial Intro, Pittsburgh, PA, August 2-5

TRANSCRIPT

Page 1: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

CloudCom 2010

Page 2: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Gateway Computing Environments (GCE10)

Page 3: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Software for Science Gateways: Open Grid Computing Environments

Marlon Pierce, Suresh Marru

Pervasive Technology Institute, Indiana University

www.collab-ogce.org

Page 4: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Tutorial Agenda
• 1:00-2:15
– Introduction
– Computational chemistry workflow example
– Building the software
• 2:15-2:30: Break
• 2:30-5:00
– Hands on workflow: OREChem
– Hands on portal: Data mining
– Demo: gadget container

Page 5: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

http://www.collab-ogce.org/ogce/index.php/Tutorials

Link to demonstration movies.

Page 6: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

TeraGrid is one of the largest investments in shared CI from NSF’s Office of Cyberinfrastructure

Page 7: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

TeraGrid resources today include:
• Tightly Coupled Distributed Memory Systems, 2 systems in the top 10 at top500.org
– Kraken (NICS): Cray XT5, 99,072 cores, 1.03 Pflop
– Ranger (TACC): Sun Constellation, 62,976 cores, 579 Tflop, 123 TB RAM
• Shared Memory Systems
– Cobalt (NCSA): Altix, 8 Tflop, 3 TB shared memory
– Pople (PSC): Altix, 5 Tflop, 1.5 TB shared memory
• Clusters with Infiniband
– Abe (NCSA): 90 Tflops
– Lonestar (TACC): 61 Tflops
– QueenBee (LONI): 51 Tflops
• Condor Pool (Loosely Coupled)
– Purdue: up to 22,000 CPUs
• Gateway hosting
– Quarry (IU): virtual machine support
• Visualization Resources
– TeraDRE (Purdue): 48 node nVIDIA GPUs
– Spur (TACC): 32 nVIDIA GPUs
• Storage Resources
– GPFS-WAN (SDSC)
– Lustre-WAN (IU)
– Various archival resources

Source: Dan Katz, U Chicago

But change is constant - new systems:
• Data Analysis and Vis systems
– Longhorn (TACC): Dell/NVIDIA, CPU and GPU
– Nautilus (NICS): SGI UltraViolet, 1024 cores, 4TB global shared memory
• Data-Intensive Computing
– Dash (SDSC): Intel Nehalem, 544 processors, 4TB flash memory
• FutureGrid
– Experimental computing grid and cloud test-bed to tackle research challenges in computer science
• Keeneland
– Experimental, high-performance computing system with NVIDIA Tesla accelerators

Page 8: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

What Is a Science Gateway?
• Web and desktop user interfaces and user-centric Web services for accessing Grid and Cloud resources.
– Clusters, supercomputers, mass storage
– Applications, databases
– Workflows
• Example Science Gateways from the NSF TeraGrid
– GridChem: computational chemistry
– UltraScan: biophysics computational analysis
– LEAD: Atmospheric science
– BioDrugScreen: drug docking, scoring, and discovery.
• Many others: see https://www.teragrid.org/web/science-gateways/gateway_list
• This tutorial is about software that powers gateways.


Page 9: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

When is a gateway appropriate?
• Provide access to community applications
– WRF, Gaussian, CHARMM, Amber, BLAST, CCSM, UltraScan
– Create multi-scale workflows
• Provide access to community data sets
– National Virtual Observatory
– Earth System Grid
– Some groups have invested significant efforts here
    • caBIG, extensive discussions to develop common terminology and formats
    • BIRN, extensive data sharing agreements
• Difficult to access data/advanced workflows
– Sensor/radar input
    • LEAD, GEON

Page 10: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

3 steps to connect a gateway to TeraGrid
• Request an allocation
– Only a 1 paragraph abstract required for up to 200k CPU hours
• Register your gateway
– Visibility on public TeraGrid page
• Request a community account
– Run jobs for others via your portal
• Staff support is available!
• www.teragrid.org/gateways

SciDAC, Chattanooga, TN, July 16, 2010

Page 11: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

OGCE Gateway Tool Adaption & Reuse

[Diagram. Only the labels survive in this transcript; the arrows linking tools to projects are not recoverable.]
• Center: OGCE Team (Re-engineer, Generalize, Build, Test and Release)
• Projects named: LEAD, GridChem, TeraGrid User Portal, OVP/RST/MIG, UltraScan, BioVLab, ODI, BioDrugScreen, EST Pipeline, FutureGrid
• Tools named: GFac, XBaya, XRegistry, FTR, Eventing System, Resource Discovery Service, GPIR, File Browser, Gadget Container, GTLab, JavaScript CoG, XRegistry Interface, Experiment Builder, Axis2 GFac, Axis2 Eventing System, Resource Prediction Service, Swarm, GC Middleware, Workflow Suite

Page 12: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Software: Description
• OGCE Gadget Container: Google Gadget/Open Social compatible software for building Web-based user interfaces.
• XBaya: A visual user interface for composing, launching and monitoring workflows
• GFac: An application factory service for wrapping command-line tools as Web services
• XRegistry; Registry Gadget: A service and workflow registry and its user interface
• Experiment Builder: User interface for creating online experiments with registered workflows
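To make the "application factory" idea concrete, the following is a minimal sketch of wrapping a command-line tool behind an HTTP endpoint. It is not GFac's API or data model: the `REGISTERED_TOOLS` table, the `run_tool` and `serve` functions, the JSON request shape, and the port are all assumptions made for illustration, using only the Python standard library.

```python
import json
import subprocess
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical registry: service name -> command prefix (not GFac's data model).
# The demo "echo" tool is a tiny Python one-liner so the sketch runs anywhere.
REGISTERED_TOOLS = {
    "echo": [sys.executable, "-c", "import sys; print(' '.join(sys.argv[1:]))"],
}


def run_tool(name, args):
    """Run a registered command-line tool and return its result as a dict."""
    if name not in REGISTERED_TOOLS:
        raise KeyError(name)
    # Arguments are passed as a list, never through a shell, to avoid injection.
    completed = subprocess.run(
        REGISTERED_TOOLS[name] + [str(a) for a in args],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return {
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }


class ToolHandler(BaseHTTPRequestHandler):
    """POST /<tool> with a JSON body {"args": [...]} runs the named tool."""

    def do_POST(self):
        name = self.path.strip("/")
        length = int(self.headers.get("Content-Length", 0))
        try:
            body = json.loads(self.rfile.read(length) or b"{}")
            result, status = run_tool(name, body.get("args", [])), 200
        except KeyError:
            result, status = {"error": "unregistered tool: " + name}, 404
        except (ValueError, AttributeError, subprocess.TimeoutExpired) as err:
            result, status = {"error": str(err)}, 400
        payload = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the wrapper service on localhost; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), ToolHandler).serve_forever()
```

Calling `serve()` starts the service; a POST to `/echo` with the body `{"args": ["hello", "world"]}` then returns the tool's exit code and output as JSON. A production service would also need the security, scheduling, and data staging that the slides assign to other components.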

Page 13: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Science Gateways Layer Cake

[Diagram. Layers and the components named in each:]
• User Interfaces: Web/Gadget Container, Web Enabled Desktop Applications, Web/Gadget Interfaces
• Gateway Abstraction Interfaces
• Gateway Services: User Management, Auditing & Reporting, Fault Tolerance, Application Abstractions, Workflow System, Information Services, Application Monitoring, Registry, Security, Provenance & Metadata Management
• Resource Middleware: Cloud Interfaces, Grid Middleware, SSH & Resource Managers
• Compute Resources: Computational Clouds, Computational Grids, Local Resources

Color Coding (colors are not recoverable from this transcript)
• Dependent resource provider components
• Complementary Gateway Components
• OGCE Gateway Components

Page 14: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

GFac Current & Future Features

[Diagram. Labels only:]
• Components: Input Handlers, Scheduling Interface, Auditing, Monitoring Interface, Data Management Abstraction, Job Management Abstraction, Fault Tolerance, Output Handlers, Registry Interface, Checkpoint Support
• Back ends: Globus, Campus Resources, Unicore, Condor, Amazon, Eucalyptus

Color Coding (which items fall in which group is not recoverable from this transcript)
• Planned/Requested Features
• Existing Features

Page 15: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

OGCE Layered Workflow Architecture: Derived from LEAD Workflow System

[Diagram. Three layers and the items named in each:]
• Workflow Interfaces (Design & Definition): XBaya GUI (Composition, Deploying, Steering & Monitoring), Gadget Interface for Input Binding, Flex/Web Composition
• Workflow Specification: Python, BPEL 2.0, BPEL 1.0, Java Code, Pegasus DAG, Scufl
• Workflow Execution & Control Engines: Apache ODE, Condor DAGMan, Taverna, Dynamic Enactor, Jython Interpreter, GBPEL
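The split between a workflow specification and an execution engine can be illustrated with a small sketch. This is not XBaya, BPEL, or DAGMan: `run_workflow`, the task names, and the toy task bodies are hypothetical, and the "engine" is just Python's standard `graphlib` module (Python 3.9 or later) running each task once its upstream tasks have finished.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies, inputs):
    """Run every task after all of its upstream tasks have finished.

    tasks:        name -> callable(upstream_results, workflow_inputs)
    dependencies: name -> set of upstream task names (the "specification")
    Returns (results by task name, order in which tasks ran).
    """
    # Include tasks that have no declared dependencies.
    graph = {name: set(dependencies.get(name, ())) for name in tasks}
    results, order = {}, []
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = tasks[name](upstream, inputs)
        order.append(name)
    return results, order


# Toy stand-ins for real applications; the "energy" value is a placeholder.
tasks = {
    "prepare": lambda up, inp: inp["molecule"].upper(),
    "optimize": lambda up, inp: up["prepare"] + ":optimized",
    "energy": lambda up, inp: len(up["optimize"]),
    "report": lambda up, inp: f'{up["optimize"]} energy={up["energy"]}',
}
dependencies = {
    "optimize": {"prepare"},
    "energy": {"optimize"},
    "report": {"optimize", "energy"},
}

results, order = run_workflow(tasks, dependencies, {"molecule": "h2o"})
print(order)              # ['prepare', 'optimize', 'energy', 'report']
print(results["report"])  # H2O:optimized energy=13
```

The dependency table plays the role of the specification layer and `run_workflow` the role of the engine, so in principle either one can be swapped without touching the other, which is the point of the layered design.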

Page 16: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Putting It All Together

Page 17: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Software Strategy

• Focus on gadget container and tools for running science applications on grids and clouds.
• Provide a tool set that can be used in whole or in part.
– If you just want GFac, then you can use it without buying an entire framework.
• Outsource security, information services, data and metadata, etc. to other providers.
– MyProxy, TG IIS, Globus, Condor, XMC Cat, iRODS, etc.

Page 18: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

More Information
• This is downloadable, packaged software.
– Apache Maven build system provides everything you need to build the gadget container, gadgets, workflow composer, and backing services.
– Get code by anonymous SVN checkout.
• Email: [email protected], [email protected], [email protected]
• OGCE Web Site: www.collab-ogce.org
• Blog/News Feed: http://collab-ogce.blogspot.com/

Page 19: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Acknowledgements and People

• Funding by TeraGrid GIG, RP and by OCI SDCI
• IU: Marlon Pierce, Suresh Marru, Raminder Singh, Archit Kulshrestha, Zhenhua Guo
• TACC: Maytal Dahan, Rion Dooley
• SDSC: Nancy Wilkins-Diehr, Jeff Sale
• SDSU: Mary Thomas

Page 20: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Demos Next

Page 21: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

The OGCE Application Registry gadget allows users to interactively register hosts and applications that are dynamically wrapped as Web services.

Page 22: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

The OGCE Gadget Container allows you to build portals out of public and private Google Open Social gadgets. Supports HTTPS.

Downloadable, packaged software.

Page 23: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

The OGCE Experiment Builder gadget allows users to create projects and experiments out of previously composed workflows.

Page 24: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

The XBaya workflow composer allows you to build scientific workflows from services running across the TeraGrid. This is part of our workflow suite.

OGCE Tools for Science Workflows

Page 25: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

What Is a Science Gateway?
• Web and desktop user interfaces and user-centric Web services for accessing Grid and Cloud resources.
– Clusters, supercomputers, mass storage
– Applications, databases
– Workflows
• Example Science Gateways from the NSF TeraGrid
– GridChem: computational chemistry
– UltraScan: biophysics computational analysis
– LEAD: Atmospheric science
– BioDrugScreen: drug docking, scoring, and discovery.
• Many others: see https://www.teragrid.org/web/science-gateways/gateway_list
• This demo is about software that powers gateways.

Page 26: OGCE TeraGrid 2010 Science Gateway Tutorial Intro

Google Gadget-Based Science Gateways

LEAD

PolarGrid