Page 1

Grid Computing in North Carolina: Past and Present

SURA Cyber-infrastructure Workshop
Georgia State University
January 6, 2005

MCNC Grid Computing and Networking Services
Phil Emer
Chuck Kesler

Page 2

MCNC’s Role in Grid

• MCNC is a service provider
  • Manages production infrastructure
  • For R&E community across NC
• So to us, grid is:
  • Infrastructure
  • An access method
  • A service delivery platform
• MCNC is the experiment support center for the National Lambda Rail

Page 3

NC Research and Education Network

[Figure: NCREN network map]
• OC48 SRP ring: counter-rotating ring, <=50 ms reroute, fully active redundancy
• OC12 SRP ring: counter-rotating ring, <=50 ms reroute, fully active redundancy
• External connections: Qwest Internet, Abilene, Level3 (GbE)
• Points of presence: Raleigh, RTP, Greensboro, Winston-Salem, Charlotte, Wilmington, Fayetteville, Greenville, Asheville
• Connected institutions: Duke (GbE), NCSU (GbE), UNC-CH (GbE), UNC-G, NCAT, ASU, WFU, WSSU, NCSA, UNC-C, ECU, ECSU, CMST, FSU, UNCP, UNCW, UNCA, WCU
• Router label: 7609

Page 4

Toward Grid

• We have many administrative domains
• We have a network of distributed points of presence
• We provide access to shared resources, which are distributed
• So conditions are favorable for attaining a state of grid-ness
• What we need is an exercising application
• NC BioGrid!

Page 5

Grid Computing in North Carolina: Past and Present

Chuck Kesler
[email protected] (mcnc.org)
January 2005

Page 6

The Grid Revolution in NC

[Timeline: 2002, 2003, 2004, 2005; phases: NC BioGrid]

NC BioGrid
• Proving ground for Grid
• Successful prototype apps
• Catalyst for collaboration
• International recognition

Page 7

Why Bio + Grid? (circa 2002)

• Moore’s law has allowed labs to keep ahead of data, but sequence data is now outpacing processing capability
• Biotech and pharma industries are highly competitive and capital intensive
• Getting ahead and staying ahead of the competition will require the creation of new and unique capabilities

[Chart, 1994 to 2002: lab processing, base pairs, sequences, Moore’s law]

It’s about staying ahead of the curve...

Page 8

The NC BioGrid Partnership

• NC Biotech Center
  • Provided the catalyst through the NC Genomics & Bioinformatics Consortium
• MCNC
  • Provided the funding and dedicated staff
• Sun
  • Donated infrastructure hardware
  • Established Sun Center of Excellence in Bioinformatics
• IBM
  • Donated human capital (application developers)
• Triangle Universities
  • Focal point for the collaboration
  • Brought early adopters to the table
  • Created collaborative working groups

Page 9

NC BioGrid Accomplishments

• In the summer of 2002, installed a dedicated testbed for evaluating grid middleware and developing grid applications for bioinformatics
• Testbed spanned multiple administrative domains with systems located at MCNC, NC State, UNC-CH & Duke, and included representative heterogeneity of hardware and OS platforms found at those sites
• Employed “best of breed” approach to grid middleware deployment
• Working groups met up to twice a month during 2002-2003
• Created several pilot applications using the testbed

Page 10

NC BioGrid Middleware: Best-of-Breed Approach

• Job Scheduling
  • Platform LSF
  • Sun Grid Engine
• User Portal
  • CHEF / OGCE
  • MyProxy
• Compute Grid
  • Globus V2 (NMI)
  • Avaki V2
• Data Grid
  • Avaki Data Grid V4
  • GridFTP (Globus)

Page 11

NC BioGrid - Data Grid

• Avaki 4.0 Data Grid
  • Federation of data providers across the WAN
  • Provides a global name space for user home directories, shared project spaces, databases, and applications
  • Ability to have results from canned SQL queries show up as files in the global name space
• Variety of access methods
  • Web-based user interface
  • NFS and CIFS through local “data grid access servers” to provide access at the native OS level
• Simple deployment
  • No kernel mods required
  • Each site can run a “share server” to distribute their local home and project directories to the grid
  • Web-based management interface
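The global name space idea on this slide can be illustrated with a small sketch: each site exports a local directory under a global prefix, and a lookup maps a global path back to the site and local path that hold the data. This is only a toy model under stated assumptions. The export table, the `/grid/...` prefixes, the local directories, and the `resolve` function are hypothetical names for illustration and are not Avaki's interface.

```python
import posixpath

# Hypothetical export table: global prefix -> (site, local directory).
# Neither the prefixes nor the local paths come from the slides.
EXPORTS = {
    "/grid/home/duke": ("Duke", "/export/home"),
    "/grid/home/ncsu": ("NC State", "/export/home"),
    "/grid/projects/blast": ("MCNC", "/data/projects/blast"),
}


def resolve(global_path, exports=EXPORTS):
    """Map a global path to (site, local path) using the longest matching prefix."""
    path = posixpath.normpath(global_path)
    # Try longer prefixes first so the most specific export wins.
    for prefix in sorted(exports, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            site, local_root = exports[prefix]
            rest = path[len(prefix):].lstrip("/")
            local_path = posixpath.join(local_root, rest) if rest else local_root
            return site, local_path
    raise KeyError(f"no share server exports {global_path}")


if __name__ == "__main__":
    print(resolve("/grid/home/duke/alice/run1.out"))
    print(resolve("/grid/projects/blast"))
```

The point of the sketch is that users address one path hierarchy while each site keeps control of its own storage, which matches the "share server" description above.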

Page 12

NC BioGrid - Compute Grid

• Globus Toolkit
  • NSF Middleware Initiative (NMI) V2 (Globus 2.4.3)
  • Provides “gatekeeper” functionality for submitting jobs through to the local cluster manager
  • Provides GridFTP support for file transfer
  • Provides MDS to track grid resource characteristics
• MCNC provides infrastructure services
  • Certificate Authority (initially based on the Globus SimpleCA)
  • GIIS (master resource directory for the grid)
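The role of a master resource directory can be sketched as follows: sites register records describing their resources, and a client queries the directory to find candidates for a job. This is a toy in-memory stand-in, not the MDS or GIIS interface (which is LDAP-based). The `Resource` fields, the `ResourceDirectory` class, and the sample records are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    site: str
    name: str
    os: str
    cpus: int


class ResourceDirectory:
    """Toy stand-in for a master resource directory; not the MDS/GIIS API."""

    def __init__(self):
        self._records = {}

    def register(self, resource):
        # A later registration replaces an earlier one for the same site/name.
        self._records[(resource.site, resource.name)] = resource

    def find(self, min_cpus=1, os=None):
        """Return matching resources, sorted with the most CPUs first."""
        matches = [
            r for r in self._records.values()
            if r.cpus >= min_cpus and (os is None or r.os == os)
        ]
        return sorted(matches, key=lambda r: r.cpus, reverse=True)


if __name__ == "__main__":
    directory = ResourceDirectory()
    # Hypothetical records; the slides do not list per-site hardware.
    directory.register(Resource("MCNC", "cluster-a", "Linux", 64))
    directory.register(Resource("Duke", "server-b", "Solaris", 8))
    directory.register(Resource("UNC-CH", "cluster-c", "Linux", 16))
    for r in directory.find(min_cpus=16, os="Linux"):
        print(r.site, r.name, r.cpus)
```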

Page 13

NC BioGrid - Web Portal

• CHEF/OGCE: a grid portal framework
  • Implements web-based interfaces for managing job submissions, file access, and online meetings
  • Originally developed as a distance learning tool
• MyProxy: security credential repository
  • Provides the portal with a mechanism for accessing and using Globus security credentials

Page 14

Portal Example

Page 15

NC BioGrid Proof of Concept Applications

• Parameter Space Study with BLAST
  • BLAST compares a target gene sequence against a known genome to find similarities
  • Grid BLAST distributed 1,000+ target sequences across the grid for comparison
• IBM Extreme Blue Project
  • Built a grid interface to BioPerl libraries
• UNC-CH/IBM QSAR Application
  • Grid-enabled version of a drug compound screening application
  • Finds compounds that have promising biological activity characteristics that should receive further research
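The Grid BLAST study above is a parameter-space (embarrassingly parallel) workload: each target sequence can be compared independently, so the set can be split into batches and spread over sites. A minimal sketch of that partitioning follows. The site names come from the testbed description; the round-robin policy, the batch size, the sequence identifiers, and the function names are assumptions and do not describe how the actual application scheduled its work.

```python
from itertools import cycle

# Site names are taken from the testbed description; the policy is assumed.
SITES = ["MCNC", "NC State", "UNC-CH", "Duke"]


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def assign_batches(sequence_ids, batch_size, sites=SITES):
    """Assign batches of target sequences to sites in round-robin order."""
    plan = {site: [] for site in sites}
    for site, batch in zip(cycle(sites), chunk(sequence_ids, batch_size)):
        plan[site].append(batch)
    return plan


if __name__ == "__main__":
    targets = [f"seq{n:04d}" for n in range(1000)]  # hypothetical identifiers
    plan = assign_batches(targets, batch_size=50)
    for site, batches in plan.items():
        total = sum(len(b) for b in batches)
        print(site, len(batches), "batches,", total, "sequences")
```

In a real deployment each batch would become one job submitted through a site's gatekeeper to its local scheduler; round-robin is only the simplest placement choice and ignores differences in site capacity.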

Page 16

The Grid Revolution in NC

[Timeline: 2002, 2003, 2004, 2005; phases: NC BioGrid, MCNC Enterprise Grid]

MCNC Enterprise Grid
• Apply NC BioGrid lessons
• Cluster and SMP resources
• Research platform for GTEC
• Core component in NCGrid

Page 17

The MCNC Enterprise Grid

[Figure: Enterprise Grid architecture diagram]
• 32-CPU SGI Altix Linux SMP Server
• 128-CPU IBM Linux Cluster (64 nodes)
• 8-TB Storage
• LSF Master Job Scheduler
• Interactive Nodes / Grid Gatekeeper / GridFTP
• Global Grid Resource DB (GIIS)
• (FIREWALL)
• Users, Campus Grids, Portals

Page 18

The Enterprise Grid and MCNC’s Services Strategy

[Figure: services strategy diagram]
• NCREN
• State-wide Grid Services
• Enterprise Grid Services
• Value-add Information Systems Services
• Self-serve Data Center Services
• DATA CENTER: Hosting & Infrastructure, Grid Computing
• GTEC, NLR, ANR and other Innovation Initiatives
• Information Security Services
• Data Archival Services
• Information Assurance
• DEPLOYMENT

Page 19

The Grid Revolution in NC

[Timeline: 2002, 2003, 2004, 2005; phases: NC BioGrid, MCNC Enterprise Grid, NC Grid Initiative]

NC Grid Initiative
• State-wide partnership
• Leverage lessons learned
• Grid education & training resource
• Enable first mover applications

Page 20

NC Grid: A Grid for Grid Developers (for now, at least)

• Provide a development testbed that spans the state
• Multi-institutional resources
  • MCNC offers the Enterprise Grid as a resource
  • MCNC is also developing a “grid appliance,” which can be easily deployed and remotely supported as a campus or department point of presence on the grid
• Currently the community is working together to determine the “middleware stack”
  • GT4 vs. GT3
  • OGCE vs. GridSphere
  • CA architecture
  • Data grid strategy
  • Platform standards
  • etc...

Page 21

A Sampling of Current Grid Projects in NC

• GridNexus at UNC-W
  • Workflow builder for grid applications
  • www.gridnexus.org
• SCOOP
  • UNC-CH, RENCI, and MCNC
  • Portal and grid infrastructure for running ADCIRC model
• BioPortal
  • RENCI at UNC-CH
• Grid Computing CS Course
  • Offered by WCU to campuses across the state via NCREN video service
  • First offered in Fall 2004 (~30 students), to be offered again in Fall 2005