Grid Computing - An Overview Michael P. Cummings Laboratory of Molecular Evolution Center for Bioinformatics and Computational Biology

Post on 21-Dec-2015



Grid Computing - An Overview

Michael P. Cummings

Laboratory of Molecular Evolution
Center for Bioinformatics and Computational Biology


Acknowledgments

Core Middleware Development
  Adam Bazinet
  Daniel Myers
  John Fuetsch
  Stephen McLellan, Chris Milliron, Deji Akinyemi
Semantic Web Grid Services/Workflows
  Sung Lee, Fujitsu Laboratories of America
  Nada Hashmi, UMIACS (now CBA, Saudi Arabia)
  David Wang, UMIACS


Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Results of usage
Research and development


Grid Computing: A Definition

A model of distributed computing that uses geographically and administratively disparate resources. In Grid computing, individual users can access computers and data transparently, without having to consider location, operating system, account administration, and other details. In Grid computing, the details are abstracted, and the resources are virtualized.


Grid Computing: Characteristics

Resources are heterogeneous
Resources are administratively disparate
Resources are geographically disparate
Users do not have to worry about system details (e.g., location, operating system, accounts)


Grid Computing: Advantages

Provides increased resources for research
Utilizes resources already purchased
Space and HVAC needs already met
Little increased administrative burden
Economically and environmentally appealing


Types of Grid Middleware

Heavyweight/feature rich (e.g., Globus Toolkit)
  Multiple users and multiple applications
  Mechanisms for authentication, authorization, communication, file access, resource discovery and specification
  Push model: jobs are assigned to specific resources


Types of Grid Middleware

Desktop Grids (e.g., Berkeley Open Infrastructure for Network Computing [BOINC])
  Single user and single application
  Limited features
  Pull model: clients contact server for jobs
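
The difference between the push and pull models can be illustrated with a minimal Python sketch. The class names, resource names, and the shortest-queue rule are assumptions made for illustration only; neither Globus nor BOINC is implemented this way.

```python
from collections import deque


class PushScheduler:
    """Push model: the scheduler decides which resource gets each job."""

    def __init__(self, resources):
        # One job list per resource (hypothetical resource names).
        self.assigned = {name: [] for name in resources}

    def submit(self, job):
        # Assign the job to the resource with the shortest queue.
        target = min(self.assigned, key=lambda name: len(self.assigned[name]))
        self.assigned[target].append(job)
        return target


class PullServer:
    """Pull model: the server only queues work; clients ask for it."""

    def __init__(self):
        self.queue = deque()

    def submit(self, job):
        self.queue.append(job)

    def request_work(self, client):
        # A client contacts the server whenever it is ready for a job.
        return self.queue.popleft() if self.queue else None


push = PushScheduler(["cluster_a", "cluster_b"])
push_targets = [push.submit(job) for job in ["job1", "job2", "job3"]]

pull = PullServer()
for job in ["job1", "job2"]:
    pull.submit(job)
pulled = [
    pull.request_work("desktop_1"),
    pull.request_work("desktop_2"),
    pull.request_work("desktop_1"),  # queue is empty by now
]
```

In the push case the server must know about every resource in advance; in the pull case it needs no list of clients at all, which is what makes the model suitable for volunteer desktops that come and go.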


An Example: SETI@home

A scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI), a scientific effort seeking to determine if there is intelligent life outside Earth. The project analyzes radio signals to look for patterns that might be associated with intelligent life.


SETI@home statistics (Monday)

Total participants: 5,521,708
Rate of signup: a new participant every 96 seconds
Effective number of computers: At any given moment there are the equivalent of >412,000 computers working full time
Results received: 2,200,991,756
Total CPU time: 2,555,681 years
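
Two derived figures help put these statistics in perspective. The short calculation below uses only the numbers on this slide; the 365.25-day year is an assumption, and the derived values are rough illustrations rather than figures reported by SETI@home.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average calendar year

total_cpu_years = 2_555_681
results_received = 2_200_991_756
seconds_between_signups = 96

# Average CPU time spent per returned result.
avg_cpu_hours_per_result = total_cpu_years * HOURS_PER_YEAR / results_received

# Sign-up rate expressed per day instead of per participant.
signups_per_day = 24 * 60 * 60 / seconds_between_signups

print(f"{avg_cpu_hours_per_result:.1f} CPU hours per result")
print(f"{signups_per_day:.0f} new participants per day")
```

Under these assumptions each result represents roughly 10 CPU hours of work, and the project gains about 900 participants per day.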


Why Go Grid?

To speed research
  Parallel execution means higher throughput
To make compute resources commodities
  Analogous to the electrical power grid
To foster efficiency and interaction in the research community
  Use of the Grid spans departments and domains
  Grid resources are typically shared resources



The Lattice Project: Initial Goals

Develop a Grid system for research that:
  Speeds up workflows by “Grid-enabling” various programs
  Is simple and intuitive
  Takes advantage of heterogeneous resources
  Is capable of managing large numbers of jobs (thousands)
  Supports multiple users and lowers the barriers to getting involved
  Is community-driven and supported


Principles of Design

Make use of well supported open source software
  Globus Toolkit
  BOINC
  Condor
Engineered software should be scalable, modular, and robust
Expose programs as well-defined services
  Arbitrary user-supplied code cannot be run


Grid: Development Challenges

Many middleware systems are not compatible
Middleware is cumbersome
Developing a Grid service is often difficult



Terminology

Client: A Grid user interface OR a machine that performs computation

Grid Service: A Grid-enabled program

Scheduler: A program that decides where Grid jobs will run

Resource: Executes Grid jobs


Basic Architecture (1 of 3)


Basic Architecture (2 of 3)


Basic Architecture (3 of 3)



Software Components

Globus Toolkit version 3.2.1
  Backbone of the Grid
  http://www.globus.org/
Condor-G
  Grid-level scheduler / resource broker
  http://www.cs.wisc.edu/condor/
BOINC: Berkeley Open Infrastructure for Network Computing
  SETI@home-style desktop grid
  http://boinc.berkeley.edu/
Custom components
  GSBL, GSG, Globus-BOINC adapter, MDS-matchmaking bridge, user interface(s), administrative scripts, and much more


Globus Toolkit 3

Key components:
  Globus Core
    Grid service hosting environment
  GSI – Grid Security Infrastructure
    Uses public key cryptography
    Secures communication
    Authenticates and authorizes Grid users
  WS GRAM – Job management
  GASS – Point to point file transfer
  MDS2 – Information provider


Condor-G

Condor-G is part of the Condor suite

Resources and jobs send Condor-G descriptions of themselves called ClassAds

Condor-G matches Grid jobs to suitable resources, then submits and manages them

This process is called matchmaking


BOINC

Most novel feature of our Grid
  Public computing model
  Untrusted resources

Potentially our largest resource

We have targeted 3 platforms: Windows / Linux x86 / Mac OS X


Our Current Grid System


User Interface

The “Grid Brick”: a machine used to submit Grid jobs
  Our primary interface for Grid users
  Command line clients mimic normal program execution
Lattice Intranet
  Provides instructions for submitting jobs and managing data input and output
  Provides tools for describing and monitoring jobs
Other possibilities:
  Web portal model of job submission
  A client capable of composing complex workflows using Task Computing and Semantic Web technology developed by collaborators at Fujitsu


Basic Architecture – Client/Service


Grid Client Stack

lattice_submit / lattice_retrieve

Service-specific* submit / retrieve scripts

Client.pm – base Perl module

Service-specific* submit / retrieve classes

GSBL – Grid Service Base Library

Globus API

Diagram layer labels: Command-line Interface, Perl, Java

* Service-specific templates and stubs are created by the Grid Service Generator


Grid Service Stack

Service-specific* Implementation

GSBL – Grid Service Base Library

Globus API

Grid Service Hosting Environment, a.k.a. “the container”

Diagram layer label: Java

* Service-specific templates and stubs are created by the Grid Service Generator


Tools for Writing Grid Services

Grid Service Base Library (GSBL)
  Java API for building Grid services with the Globus Toolkit
  Shields programmers from having to work with the Globus API directly
  Provides a high-level interface for operations such as job submission and file transfer
Grid Service Generator (GSG)
  Simplifies the process of creating Grid Services
  Intended for use with GSBL


GSBL: Design and Features

Classes for:
  Clients and services (base classes)
  Argument description and processing
  File transfers
  Job submission and control
  Security configuration
  Java synchronization and Globus notifications to paper over event-based model

[Diagram: a Client stack and a Service stack, each layered as Application (e.g., BLAST) / GSBL / Globus API]


Grid Service Generator

Deploying a Grid service with Globus is absurdly complicated
  Many files, namespaces: lots of potential typos
GSG takes as input a few parameters (service name, location, an XML argument description, etc.) and generates all requisite configuration files and skeleton Java classes
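
The general idea of a generator of this kind can be sketched in Python. Everything specific here is hypothetical: the XML argument format, the template contents, the output file names, and the example location are invented for illustration, since the slides do not show the real GSG inputs or outputs.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical argument description; the real GSG XML schema is not shown.
ARGUMENTS_XML = """
<arguments>
  <argument name="infile" type="file"/>
  <argument name="generations" type="int"/>
</arguments>
"""

# Hypothetical templates standing in for the generated skeleton and config.
JAVA_SKELETON = Template(
    "public class ${service}Service {\n"
    "    // Arguments: ${args}\n"
    "}\n"
)
CONFIG = Template("service.name=${service}\nservice.location=${location}\n")


def generate(service, location, arguments_xml):
    """Return a mapping of file name to generated file contents."""
    root = ET.fromstring(arguments_xml)
    args = ", ".join(
        f"{arg.get('name')} ({arg.get('type')})" for arg in root.findall("argument")
    )
    return {
        f"{service}Service.java": JAVA_SKELETON.substitute(service=service, args=args),
        f"{service}.properties": CONFIG.substitute(service=service, location=location),
    }


files = generate("MrBayes", "lattice/services/mrbayes", ARGUMENTS_XML)
for name, contents in files.items():
    print(f"--- {name} ---\n{contents}")
```

Deriving every file from one set of parameters means a service name or namespace is typed once, which addresses the "lots of potential typos" problem noted above.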


Grid Services

Application   Condor (Linux/UNIX)   BOINC Linux x86   BOINC Win32   BOINC Mac OS X
BLAST(1)      Yes                   No                No            No
Clustal W     Yes                   Yes               Yes           Yes
CNS           Yes                   Yes               Yes           No
Lamarc        Yes                   Yes               Yes           Yes
MDIV          Yes                   Yes               Yes           Yes
Migrate-N     Yes                   Yes               Yes           Yes
Modeltest     Yes                   Yes               Yes           Yes
MrBayes       Yes                   Yes               Yes           Yes
ms            Yes                   Yes               Yes           Yes
Muscle        Yes                   Yes               Yes           Yes
PAUP*(2)      Yes                   No                No            No
Phyml         Yes                   Yes               Yes           Yes
Pknots        Yes                   Yes               Yes           Yes
Seq-gen       Yes                   Yes               Yes           Yes
Snn           Yes                   Yes               Yes           Yes
ssearch       Yes                   Yes               Yes           Yes
Structure(3)  Yes                   No                No            No


Grid Services

Creating Grid Services requires:
  Knowledge of the application
  Techniques for compiling and porting the application to various platforms
  Knowledge of the infrastructure so it can be effectively tested and deployed
Challenges:
  Maintaining bodies of Grid Service code as the number of applications grows and new versions of applications are released
  Minimizing the number of updates that need to be applied when the framework changes


Basic Architecture - Scheduling


Condor-G: ClassAds

Resources and jobs send Condor-G descriptions of themselves called ClassAds
  Jobs require certain capabilities of resources
  Resources advertise their capabilities
Similar to a dating service: central broker points pairs of compatible jobs/resources at each other
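
The matchmaking idea can be sketched in a few lines of Python. This is not the ClassAd language or Condor's matchmaking algorithm: ads are modeled as plain dictionaries, the attribute names are invented, and the assignment of applications to Resources A, B, and C is an assumption.

```python
def matches(job_ad, resource_ad):
    """A job matches a resource when every requirement is satisfied."""
    return (
        job_ad["application"] in resource_ad["applications"]
        and resource_ad["platform"] in job_ad["platforms"]
    )


def matchmake(job_ads, resource_ads):
    """Pair each job with the first compatible resource, or None."""
    return {
        job["name"]: next(
            (res["name"] for res in resource_ads if matches(job, res)), None
        )
        for job in job_ads
    }


# Resources advertise their capabilities (hypothetical attribute names).
resources = [
    {"name": "resource_a", "platform": "linux", "applications": {"SSEARCH"}},
    {"name": "resource_b", "platform": "linux", "applications": {"PAUP*"}},
    {"name": "resource_c", "platform": "linux", "applications": {"MrBayes"}},
]

# Jobs state what they require of a resource.
jobs = [
    {"name": "job1", "application": "MrBayes", "platforms": {"linux"}},
    {"name": "job2", "application": "BLAST", "platforms": {"linux"}},
]

pairs = matchmake(jobs, resources)
print(pairs)
```

Here `job1` is paired with `resource_c`, while `job2` stays unmatched because no resource advertises BLAST. A real matchmaker would also evaluate requirements that the resource places on the job.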


Condor-G: ClassAds

[Diagram: Resources A, B, and C advertise to the Condor Collector ("I have MrBayes!", "I have SSEARCH!", "I have PAUP*!"), and a Condor user announces "I need MrBayes!". The Condor user and Resource C are then put in contact: "I hear you have MrBayes?" / "Well, let's talk about that..."]


Generating ClassAds

Job ClassAds are generated by the Condor-G job manager
  Job requirements are specified in the Grid service configuration files
Resource ClassAds are generated by extracting information from MDS
  Lattice information providers supply data required for matchmaking


Monitoring and Discovery System (MDS2)

Globus information services component
  LDAP-based (new version XML-based)
Answers questions like:
  What resources are available?
  What capabilities do these resources have?
  What is the load on these resources?
This in turn allows for intelligent decisions to be made in areas such as scheduling and resource accounting
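
As a rough illustration of how answers to these three questions can feed a scheduling decision, the Python sketch below filters and ranks an in-memory snapshot of resource information. The record layout, resource names, and load values are hypothetical; MDS2 itself is queried over LDAP, which is not shown here.

```python
# Hypothetical snapshot of what an information service might report.
resource_info = [
    {"name": "condor_pool", "applications": {"MrBayes", "PAUP*"},
     "load": 0.80, "available": True},
    {"name": "boinc_public", "applications": {"MrBayes"},
     "load": 0.35, "available": True},
    {"name": "boinc_campus", "applications": {"MrBayes"},
     "load": 0.10, "available": False},
]


def available(info):
    """What resources are available?"""
    return [r for r in info if r["available"]]


def capable(info, application):
    """Which of these resources can run the given application?"""
    return [r for r in info if application in r["applications"]]


def least_loaded(info, application):
    """Pick the least-loaded resource that is available and capable."""
    candidates = capable(available(info), application)
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["load"])["name"]


choice = least_loaded(resource_info, "MrBayes")
print(choice)
```

In this snapshot `boinc_campus` has the lowest load but is unavailable, so the sketch selects `boinc_public` for a MrBayes job.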


Basic Architecture - Resources


Current Grid Resources

http://lattice.umiacs.umd.edu/resources/

UMIACS Condor pool
  > 400 processors
BOINC pools
  Clients on campus > 100
  Public (off-campus) clients > 1000


BOINC

Works on the “pull” model, that is:
  One or more servers create workunits
  Clients connect asynchronously, pull down work, and return the results
Clients are relatively lightweight and easy to install and manage
One client can process work for multiple projects
Participants can join teams and are given credit for the work they complete
http://lattice.umiacs.umd.edu/boinc_public


Globus-BOINC Adapter

Consists of a number of components that allow us to run Grid Services on BOINC
  BOINC job manager
  Custom validator and assimilator

Registers BOINC with Globus as a GRAM-addressable resource

BOINC compatibility library eases the process of porting applications to BOINC
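
The slides do not describe how the custom validator and assimilator work. As an illustration of why such components matter when resources are untrusted, the Python sketch below shows a generic redundancy check: the same workunit is sent to several clients and a result is accepted only when enough copies agree. The function names, the quorum of 2, and the sample values are assumptions.

```python
from collections import Counter


def validate(results, quorum=2):
    """Return the agreed result if at least `quorum` copies match, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


def assimilate(workunit_id, canonical, store):
    """Record a validated result so the Grid side can retrieve it."""
    store[workunit_id] = canonical


# Hypothetical results returned by different clients for two workunits.
returned = {
    "wu_1": ["-1234.56", "-1234.56", "-9999.00"],  # two of three agree
    "wu_2": ["-10.0", "-11.0"],                     # no agreement yet
}

store = {}
for workunit_id, results in returned.items():
    canonical = validate(results)
    if canonical is not None:
        assimilate(workunit_id, canonical, store)

print(store)
```

Only `wu_1` is assimilated; `wu_2` would need further copies to be computed before a result could be trusted. Exact string comparison is the simplest possible check, and numerical applications typically need a tolerance-based comparison instead.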


Research Projects Using the Grid

The Laboratory of David Fushman has run protein-protein docking algorithms on Lattice
  CNS is the primary Grid service in this project
Floyd Reed and Holly Mortensen from the Laboratory of Sarah Tishkoff have run a number of population genetics analyses
  MDIV and IM are the primary Grid services
The Laboratory of Molecular Evolution has run statistical phylogenetic analyses
  GSI is the primary Grid service


Results of Grid Usage

IM – 0.13 CPU years (BOINC)
MDIV – 4.93 CPU years (BOINC)
CNS – 12.4 CPU years (BOINC)
GSI – 94.05 CPU years (Condor)
Total: 111.51 CPU years
BOINC participants in 21 countries
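
The total can be checked, and the split between the two kinds of resource made explicit, with a short calculation that uses only the figures listed above. The variable names are illustrative.

```python
# CPU years per Grid service, as listed on the slide.
cpu_years = {"IM": 0.13, "MDIV": 4.93, "CNS": 12.4, "GSI": 94.05}
backend = {"IM": "BOINC", "MDIV": "BOINC", "CNS": "BOINC", "GSI": "Condor"}

total = round(sum(cpu_years.values()), 2)

# Sum CPU years per resource type.
by_backend = {}
for service, years in cpu_years.items():
    name = backend[service]
    by_backend[name] = by_backend.get(name, 0.0) + years
by_backend = {name: round(years, 2) for name, years in by_backend.items()}

print(f"Total: {total} CPU years")
for name, years in by_backend.items():
    print(f"{name}: {years} CPU years ({100 * years / total:.1f}%)")
```

The per-service figures do sum to 111.51 CPU years. Of that, 17.46 CPU years ran on BOINC and 94.05 on Condor, so about 84% of the reported usage came from the Condor pool.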



GT4 Research and Development

We are currently upgrading the Grid system to use Globus Toolkit 4.0
  GT4 adheres strictly to emerging and established Web service standards
  Actively developed and supported
  Many components have been greatly improved
    GridFTP/RFT (will replace GASS)
    WS GRAM
    MDS4 (XML based; replaces MDS2, LDAP based)
Our basic architecture remains the same, and the upgrade has been made easier because of tools we have already developed (GSBL, GSG)


More Information

Lattice Website http://lattice.umiacs.umd.edu/