1
Grid Computing in North Carolina: Past and Present
SURA Cyber-infrastructure Workshop
Georgia State University
January 6, 2005
MCNC Grid Computing and Networking Services
Phil Emer
Chuck Kesler
2
MCNC’s Role in Grid
• MCNC is a service provider
  • Manages production infrastructure
  • For R&E community across NC
• So to us, grid is:
  • Infrastructure
  • An access method
  • A service delivery platform
• MCNC is the experiment support center for the National Lambda Rail
3
NC Research and Education Network
[Network map. Labels as they appear on the diagram:]
• Carriers and external networks: Qwest Internet, Abilene, Level3 (GbE), Level3, Qwest
• OC48 SRP ring: counter-rotating ring, <=50 ms reroute, fully active redundancy
• OC12 SRP ring: counter-rotating ring, <=50 ms reroute, fully active redundancy
• Cities: Raleigh, RTP, Greensboro, Winston-Salem, Charlotte, Wilmington, Fayetteville, Greenville, Asheville
• Campuses: Duke (GbE), NCSU (GbE), UNC-CH (GbE), UNC-G, NCAT, ASU, WFU, WSSU, NCSA, UNC-C, ECU, ECSU, CMST, FSU, UNCP, UNCW, UNCA, WCU
• Other label: 7609
4
Toward Grid
• We have many administrative domains
• We have a network of distributed points of presence
• We provide access to shared resources, which are distributed
• So conditions are favorable for attaining a state of grid-ness
• What we need is an exercising application
• NC BioGrid!
5
Grid Computing in North Carolina: Past and Present
Chuck Kesler
[email protected]@mcnc.org
January 2005
6
The Grid Revolution in NC
2002 2003 2004 2005
NC BioGrid
• Proving ground for Grid
• Successful prototype apps
• Catalyst for collaboration
• International recognition
7
Why Bio + Grid? (circa 2002)
• Moore’s law has allowed labs to keep ahead of data, but sequence data is now outpacing processing capability
• Biotech and pharma industries are highly competitive and capital intensive
• Getting ahead and staying ahead of the competition will require the creation of new and unique capabilities
[Chart, 1994 to 2002. Series: lab processing, base pairs, sequences, Moore's law]
It’s about staying ahead of the curve...
8
The NC BioGrid Partnership
• NC Biotech Center
  • Provided the catalyst through the NC Genomics & Bioinformatics Consortium
• MCNC
  • Provided the funding and dedicated staff
• Sun
  • Donated infrastructure hardware
  • Established Sun Center of Excellence in Bioinformatics
• IBM
  • Donated human capital (application developers)
• Triangle Universities
  • Focal point for the collaboration
  • Brought early adopters to the table
  • Created collaborative working groups
9
NC BioGrid Accomplishments
• In the summer of 2002, installed a dedicated testbed for evaluating grid middleware and developing grid applications for bioinformatics
• Testbed spanned multiple administrative domains with systems located at MCNC, NC State, UNC-CH & Duke, and included the representative heterogeneity of hardware and OS platforms found at those sites
• Employed a “best of breed” approach to grid middleware deployment
• Working groups met up to twice a month during 2002-2003
• Created several pilot applications using the testbed
10
NC BioGrid Middleware: Best-of-Breed Approach
• Job Scheduling
  • Platform LSF
  • Sun Grid Engine
• User Portal
  • CHEF / OGCE
  • MyProxy
• Compute Grid
  • Globus V2 (NMI)
  • Avaki V2
• Data Grid
  • Avaki Data Grid V4
  • GridFTP (Globus)
11
NC BioGrid - Data Grid
• Avaki 4.0 Data Grid
  • Federation of data providers across the WAN
    • Provides a global name space for user home directories, shared project spaces, databases, and applications
    • Ability to have results from canned SQL queries show up as files in the global name space
  • Variety of access methods
    • Web-based user interface
    • NFS and CIFS through local “data grid access servers” to provide access at the native OS level
  • Simple deployment
    • No kernel mods required
    • Each site can run a “share server” to distribute their local home and project directories to the grid
    • Web-based management interface
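One way to picture the global name space is as a table of grid-wide directories, each backed by a directory on some site's share server. The Python sketch below is illustrative only and is not Avaki's API: the class `GlobalNamespace`, the `/biogrid/...` paths, and the host names are all hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class GlobalNamespace:
    """Toy model of a data grid name space (hypothetical, not Avaki's API)."""

    # Maps a grid-wide directory to (share server host, local directory).
    shares: dict = field(default_factory=dict)

    def export(self, grid_dir, host, local_dir):
        """Register a site's local directory under a grid-wide path."""
        self.shares[PurePosixPath(grid_dir)] = (host, PurePosixPath(local_dir))

    def resolve(self, grid_path):
        """Return (host, local path) for a grid path; the deepest export wins."""
        path = PurePosixPath(grid_path)
        # Check the path itself, then each parent directory in turn.
        for candidate in (path, *path.parents):
            if candidate in self.shares:
                host, local_dir = self.shares[candidate]
                return host, str(local_dir / path.relative_to(candidate))
        raise KeyError(f"no share server exports {grid_path}")


# Hypothetical exports from two sites.
ns = GlobalNamespace()
ns.export("/biogrid/home/ncsu", "share.ncsu.example", "/export/home")
ns.export("/biogrid/projects/blast", "share.duke.example", "/data/blast")
```

With these exports, `ns.resolve("/biogrid/home/ncsu/alice/run1.out")` maps the grid path to `/export/home/alice/run1.out` on the hypothetical NCSU share server, so a user sees one directory tree regardless of which site holds the file.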
12
NC BioGrid - Compute Grid
• Globus Toolkit
  • NSF Middleware Initiative (NMI) V2 (Globus 2.4.3)
  • Provides “gatekeeper” functionality for submitting jobs through to the local cluster manager
  • Provides GridFTP support for file transfer
  • Provides MDS to track grid resource characteristics
• MCNC provides infrastructure services
  • Certificate Authority (initially based on the Globus SimpleCA)
  • GIIS (master resource directory for the grid)
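As a rough illustration of the gatekeeper and GridFTP roles, the Python sketch below builds (but does not run) command lines for two Globus Toolkit 2 client tools, `globus-job-run` and `globus-url-copy`. The host names and file paths are hypothetical, and `jobmanager-lsf` assumes the site exposes its LSF scheduler under that job manager name. A valid proxy credential (for example from `grid-proxy-init`) would be needed before actually running either command.

```python
def gatekeeper_contact(host, jobmanager="jobmanager-lsf"):
    """Build a GT2-style contact string: <host>/<job manager service>."""
    return f"{host}/{jobmanager}"


def job_command(host, executable, *args):
    """Return the argv for running an executable through a site's gatekeeper."""
    return ["globus-job-run", gatekeeper_contact(host), executable, *args]


def stage_in_command(local_path, host, remote_path):
    """Return the argv for copying a local file to a site's GridFTP server."""
    return [
        "globus-url-copy",
        f"file://{local_path}",            # local_path must be absolute
        f"gsiftp://{host}{remote_path}",   # remote_path must be absolute
    ]


# Hypothetical host; prints the command a user would type at a shell.
print(" ".join(job_command("gatekeeper.example.org", "/bin/hostname")))
```

The gatekeeper authenticates the request and passes the job to the named local job manager, which is why the scheduler appears in the contact string rather than in the job itself.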
13
NC BioGrid - Web Portal
• CHEF/OGCE: a grid portal framework
  • Implements web-based interfaces for managing job submissions, file access, and online meetings
  • Originally developed as a distance learning tool
• MyProxy: security credential repository
  • Provides the portal with a mechanism for accessing and using Globus security credentials
14
Portal Example
15
NC BioGrid Proof of Concept Applications
• Parameter Space Study with BLAST
  • BLAST compares a target gene sequence against a known genome to find similarities
  • Grid BLAST distributed 1,000+ target sequences across the grid for comparison
• IBM Extreme Blue Project
  • Built a grid interface to BioPerl libraries
• UNC-CH/IBM QSAR Application
  • Grid-enabled version of a drug compound screening application
  • Finds compounds that have promising biological activity characteristics that should receive further research
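The fan-out behind a study like Grid BLAST can be sketched in a few lines of Python. This is a simplified illustration, not the NC BioGrid implementation: the round-robin policy, the sequence IDs, and the function `partition_sequences` are assumptions, and a real run would hand each batch to a scheduler or gatekeeper rather than just returning it.

```python
def partition_sequences(sequence_ids, sites):
    """Assign target sequences to grid sites in round-robin order."""
    if not sites:
        raise ValueError("at least one site is required")
    batches = {site: [] for site in sites}
    for index, seq_id in enumerate(sequence_ids):
        # Cycle through the sites so each batch grows at the same rate.
        batches[sites[index % len(sites)]].append(seq_id)
    return batches


# Hypothetical inputs: 1,000 made-up sequence IDs and the four testbed sites.
targets = [f"seq{n:04d}" for n in range(1000)]
sites = ["MCNC", "NC State", "UNC-CH", "Duke"]
batches = partition_sequences(targets, sites)
```

Because each comparison is independent of the others, the batches can run at different sites with no communication between them, which is what makes this kind of parameter study a natural first grid application.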
16
The Grid Revolution in NC
2002 2003 2004 2005
NC BioGrid
MCNC Enterprise Grid
• Apply NC BioGrid lessons
• Cluster and SMP resources
• Research platform for GTEC
• Core component in NCGrid
17
The MCNC Enterprise Grid
[Architecture diagram. Labels as they appear on the diagram:]
• Users
• Portals
• Campus Grids
• (FIREWALL)
• Interactive Nodes / Grid Gatekeeper / GridFTP
• LSF Master Job Scheduler
• Global Grid Resource DB (GIIS)
• 32-CPU SGI Altix Linux SMP Server
• 128-CPU IBM Linux Cluster (64 nodes)
• 8-TB Storage
18
The Enterprise Grid and MCNC’s Services Strategy
[Diagram. Labels as they appear on the diagram:]
• NCREN
• State-wide Grid Services
• Enterprise Grid Services
• Value-add Information Systems Services
• Self-serve Data Center Services
• DATA CENTER
• Hosting & Infrastructure Grid Computing
• GTEC, NLR, ANR and other Innovation Initiatives
• Information Security Services
• Data Archival Services
• Information Assurance
• DEPLOYMENT
19
The Grid Revolution in NC
2002 2003 2004 2005
NC BioGrid
MCNC Enterprise Grid
NC Grid Initiative
• State-wide partnership
• Leverage lessons learned
• Grid education & training resource
• Enable first mover applications
20
NC Grid: A Grid for Grid Developers (for now, at least)
• Provide a development testbed that spans the state
• Multi-institutional resources
  • MCNC offers the Enterprise Grid as a resource
  • MCNC is also developing a “grid appliance,” which can be easily deployed and remotely supported as a campus or department point of presence on the grid
• Currently the community is working together to determine the “middleware stack”
  • GT4 vs. GT3
  • OGCE vs. GridSphere
  • CA architecture
  • Data grid strategy
  • Platform standards
  • etc...
21
A Sampling of Current Grid Projects in NC
• GridNexus at UNC-W
  • Workflow builder for grid applications
  • www.gridnexus.org
• SCOOP
  • UNC-CH, RENCI, and MCNC
  • Portal and grid infrastructure for running ADCIRC model
• BioPortal
  • RENCI at UNC-CH
• Grid Computing CS Course
  • Offered by WCU to campuses across the state via NCREN video service
  • First offered in Fall 2004 (~30 students), to be offered again in Fall 2005