Cindy Zheng, Pragma Grid, 5/30/2006 The PRAGMA Testbed Building a Multi-Application International Grid Cindy Zheng Pacific Rim Application and Grid Middleware Assembly University of California, San Diego San Diego Supercomputer Center http://www.pragma-grid.net



The PRAGMA Testbed: Building a Multi-Application International Grid

Cindy Zheng

Pacific Rim Application and Grid Middleware Assembly
University of California, San Diego
San Diego Supercomputer Center

http://www.pragma-grid.net


Overview

• PRAGMA

• PRAGMA Grid testbed

• Routine-basis experiments
  – Applications
  – Grid middleware
  – Grid infrastructure software

• Grid interoperation

• Lessons Learned


PRAGMA

• PRAGMA (2002 - )
  – Open international organization
  – Grid applications, practical issues
  – Build international scientific collaborations
• Characteristics
  – No central funding, but mutual interests
  – Friendship, trust, help among people
  – Doers
• Working groups
  – Bio, telescience, data, Geon, resources

• Meetings


Resources working group

• Improve
  – Middleware interoperability
  – Global grid usability and productivity
  – Grid interoperability
• How to make a global grid easy to use?
  – For applications. Let applications drive
  – More organized testbed operation
  – Full-scale and integrated testing/research
  – Long daily application runs
  – Find problems, develop/research/test solutions


Routine-basis Experiments http://goc.pragma-grid.net

• Run applications while building testbed
  – Started 2004
  – Grass-roots, PRAGMA membership not necessary
  – Voluntary contribution of resources/work
  – Long term, persistent
  – General grid
• Coordinator
• Site supporters
• Application drivers
• Developers


How We Operate

• Heterogeneity
  – funding, policies, environments
• Motivation
  – learn, develop, test, interoperate
• Communication
  – email, VTC, Skype, workshops, time zones, languages
• Create operation procedures
  – joining testbed
  – running applications
• http://goc.pragma-grid.net
  – resources, contacts, requirements, instructions, monitoring, status, tools, etc.


How We Operate
http://goc.pragma-grid.net/pragma-grid-status/work.htm


PRAGMA Grid Testbed

Sites shown on the map:
• AIST, Japan
• CNIC, China
• KISTI, Korea
• ASCC, Taiwan
• NCHC, Taiwan
• UoHyd, India
• MU, Australia
• BII, Singapore
• KU, Thailand
• USM, Malaysia
• NCSA, USA
• SDSC, USA
• CICESE, Mexico
• UNAM, Mexico
• UChile, Chile
• TITECH, Japan
• QUT, Australia
• UZurich, Switzerland
• JLU, China
• NGO, Singapore
• MIMOS, Malaysia
• OSAKAU, Japan
• IOIT-HCM, Vietnam


PRAGMA Grid resources
http://goc.pragma-grid.net/pragma-doc/resources.html


Software Layers

• Globus 2, 3, 4
• GT4 pre-WS, 9 sites
• GT4 WS, 1 site
• Moving requirements


Trust

• Trust all site CAs
  – tarball
• Experimental -> production
• Setup PRAGMA CA
  – GAMA/Naregi-CA

• APGRID PMA, IGTF (5 accr.)


Applications http://goc.pragma-grid.net

• Real science, multiple applications
  – Resource sharing
    • Mpich-g2
    • Reservation and meta-scheduling
  – TDDFT: quantum chemistry, AIST, Japan
  – Savannah: climate model, MU, Australia
  – QM-MD: quantum mechanics, AIST, Japan
  – iGAP: bioinformatics, UCSD, USA
  – Gamess-APBS: organic chemistry, UZurich, Switzerland
  – Siesta: molecular simulation, UZurich, Switzerland
  – Amber: molecular simulation, USM, Malaysia
  – FMO: quantum mechanics, AIST, Japan
  – HPM: genomics, IOIT-HCM, Vietnam
  – (GEON, Sensor, … <data, sensor>)


Middleware

• Application middleware – enable applications to run in grid
  – Ninf-G
    • AIST, Japan
    • TDDFT, QM/MD, FMO
  – Nimrod/G
    • MU, Australia
    • Savannah, Siesta, Gamess
  – Mpich-Gx
    • KISTI, Korea
    • MM5, CICESE, Mexico
• Infrastructure middleware – provide grid services
  – Gfarm
    • AIST, Japan
    • iGAP, testbed, 6 sites
  – SCMSWeb
    • KU, Thailand
    • Testbed, 20 sites
  – MOGAS
    • NTU, Singapore
    • Testbed, 14 sites


GridRPC: A Programming Model based on RPC

• GridRPC API is a proposed recommendation at the GGF
• Three components
  – Information Manager - manages and provides interface info
  – Client Component - manages remote executables via function handles
  – Remote Executables - dynamically generated on remote servers
• Built on top of Globus Toolkit (MDS, GRAM, GSI)
• Simple and easy-to-use programming interface
  – Hiding complicated mechanism of the grid
  – Providing RPC semantics

[Figure: a client computer (client component with function handles), an information manager, and servers running remote executables]
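The function-handle pattern on this slide can be illustrated with a small local simulation. This is only a sketch in Python, not the GridRPC C API or Ninf-G itself: the names `FunctionHandle`, `Client`, `REMOTE_EXECUTABLES`, and `run_demo` are hypothetical, and a thread pool stands in for remote servers.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical registry standing in for remote executables on grid servers.
REMOTE_EXECUTABLES: Dict[str, Callable[..., float]] = {
    "square": lambda x: x * x,
}


class FunctionHandle:
    """Client-side handle that binds a server name to a remote function name."""

    def __init__(self, server: str, func_name: str) -> None:
        if func_name not in REMOTE_EXECUTABLES:
            raise KeyError(f"no remote executable named {func_name!r}")
        self.server = server
        self.func_name = func_name


class Client:
    """Toy client component: issues asynchronous calls through handles."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def call_async(self, handle: FunctionHandle, *args: float) -> Future:
        # A real system would ship the call to handle.server; here it runs locally.
        return self._pool.submit(REMOTE_EXECUTABLES[handle.func_name], *args)

    def wait_all(self, futures: List[Future]) -> List[float]:
        # Block until every outstanding call has returned, preserving order.
        return [f.result() for f in futures]

    def close(self) -> None:
        self._pool.shutdown()


def run_demo() -> List[float]:
    client = Client()
    handles = [FunctionHandle(f"server{i}", "square") for i in range(3)]
    futures = [client.call_async(h, float(i + 1)) for i, h in enumerate(handles)]
    results = client.wait_all(futures)
    client.close()
    return results
```

The point of the pattern is that the caller only sees handles and ordinary call/wait semantics, while placement and transport stay hidden behind the client component.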


Nimrod Development Cycle

Prepare Jobs using Portal

Jobs Scheduled and Executed Dynamically

Results displayed & interpreted

Sent to available machines


Application Middleware

• Ninf-G <http://ninf.apgrid.org>
  – Supports GridRPC model which will be a GGF standard
  – Integrated into NMI release 8 (first non-US software in NMI)
  – Ninf roll for Rocks 4.x is also available
  – On PRAGMA testbed, TDDFT and QM/MD applications achieved long executions (runs of 1 week to 50 days)
• Nimrod <http://www.csse.monash.edu.au/~davida/nimrod>
  – Supports large scale parameter sweeps on Grid infrastructure
    • Study the behaviour of some of the output variables against a range of different input scenarios
    • Compute parameters that optimize model output
    • Computations are uncoupled (file transfer)
    • Allows robust analysis and more realistic simulations
    • Very wide range of applications from quantum chemistry to public health policy
  – Climate experiment ran some 90 different scenarios of 6 weeks each
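A parameter sweep of the kind described above can be sketched as follows. This is a minimal illustration, not Nimrod's plan-file syntax or API: the parameter names `co2` and `albedo` and the `toy_model` function are hypothetical, and each scenario is an independent (uncoupled) job.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence

Scenario = Dict[str, float]


def expand_sweep(parameters: Dict[str, Sequence[float]]) -> List[Scenario]:
    """Return one scenario per combination of parameter values."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def best_scenario(
    parameters: Dict[str, Sequence[float]],
    model: Callable[[Scenario], float],
) -> Scenario:
    """Evaluate every scenario independently and return the one minimising the model output."""
    return min(expand_sweep(parameters), key=model)


def toy_model(scenario: Scenario) -> float:
    # Hypothetical stand-in for a long-running simulation.
    return (scenario["co2"] - 400.0) ** 2 + (scenario["albedo"] - 0.3) ** 2


sweep = {"co2": [350.0, 400.0, 450.0], "albedo": [0.2, 0.3]}
scenarios = expand_sweep(sweep)
best = best_scenario(sweep, toy_model)
```

Because the scenarios share no state, a scheduler is free to send each one to whichever machine is available.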


Fault-Tolerance Enhanced

• Ninf-G monitors each RPC call
  – Return error code for failures
    • Explicit faults: server down, disconnection of network
    • Implicit faults: jobs not activated, unknown faults
    • Timeout - grpc_wait*()
  – Retry/restart
• Nimrod/G monitors remote services and restarts failed jobs
  – Long jobs are split into many sequentially dependent jobs which can be restarted
    • using sequential parameters called seqameters
• Improvement in the routine-basis experiment
  – developers test code on heterogeneous global grid
  – results guide developers to improve detection and handle faults
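The retry/restart idea above can be sketched as a chain of sequentially dependent segments, each retried on failure so that only the failed segment is repeated. This is an illustrative sketch, not Ninf-G or Nimrod/G code: `SegmentFailed`, `run_with_retry`, `run_sequence`, and `make_flaky_segment` are hypothetical names, and timeouts are not modelled.

```python
from typing import Callable, Dict, List

Segment = Callable[[float], float]


class SegmentFailed(Exception):
    """Raised when one segment of a long job fails."""


def run_with_retry(task: Callable[[], float], max_attempts: int) -> float:
    """Run a task, retrying up to max_attempts times on SegmentFailed."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task()
        except SegmentFailed as exc:
            last_error = exc
    raise SegmentFailed(f"gave up after {max_attempts} attempts") from last_error


def run_sequence(segments: List[Segment], initial_state: float, max_attempts: int = 3) -> float:
    """Run segments in order; each consumes the state produced by the previous one."""
    state = initial_state
    for segment in segments:
        # Only the failed segment is rerun; completed segments keep their result.
        state = run_with_retry(lambda: segment(state), max_attempts)
    return state


def make_flaky_segment(increment: float, failures_before_success: int) -> Segment:
    """Build a test segment that fails a fixed number of times, then succeeds."""
    remaining: Dict[str, int] = {"n": failures_before_success}

    def segment(state: float) -> float:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise SegmentFailed("simulated fault")
        return state + increment

    return segment
```

Splitting a multi-week run this way bounds the work lost to any single fault to one segment.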


Application Setup and Resource Management

• Heterogeneous platforms
  – Manual build, deploy applications, manage resources
    • Labor intensive, time consuming, tedious
• Middleware solutions
  – For deployment
    • Automatic distribution of executables using staging functions
  – For resource management
    • Ninf-G client configuration allows description of server attributes
      – Port number of the Globus gatekeeper
      – Local scheduler type
      – Queue name for submitting jobs
      – Protocol for data transfer
      – Library path for dynamic linking
    • Nimrod/G portal allows a user to generate a testbed and helps maintain information about resources, including use of different certificates.
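The per-server attributes listed above map naturally onto a small record type. The sketch below is not the Ninf-G client configuration syntax: the `ServerAttributes` class, its field names, the `validate` helper, and the example host and values are all hypothetical and only show how such a description might be held and checked on the client side.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServerAttributes:
    hostname: str            # hypothetical example host
    gatekeeper_port: int     # port number of the Globus gatekeeper
    scheduler: str           # local scheduler type
    queue: str               # queue name for submitting jobs
    transfer_protocol: str   # protocol for data transfer
    library_path: str        # library path for dynamic linking


def validate(server: ServerAttributes) -> List[str]:
    """Return a list of human-readable problems; empty means the record looks usable."""
    problems: List[str] = []
    if not 0 < server.gatekeeper_port < 65536:
        problems.append(f"{server.hostname}: port {server.gatekeeper_port} out of range")
    if not server.queue:
        problems.append(f"{server.hostname}: queue name is empty")
    if not server.library_path.startswith("/"):
        problems.append(f"{server.hostname}: library path must be absolute")
    return problems


good = ServerAttributes("cluster.example.org", 2119, "pbs", "batch", "gridftp", "/usr/local/lib")
bad = ServerAttributes("other.example.org", 70000, "sge", "", "gridftp", "lib")
```

Keeping these attributes in one place per site is what lets a single client drive many differently configured clusters.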


Gfarm – Grid Virtual File System
http://datafarm.apgrid.org/

- High transfer rate (parallel transfer, localization)
- Scalable
- File replication – user/application setup, fault tolerance
- Support Linux, Solaris; also scp, gridftp, SMB
- POSIX compliant
- Gfarm-FUSE
- 6 sites, 3786 GBytes, 1527 MB/sec (70 I/O nodes)


Application Benefit

• No modification required
  – Existing legacy applications can access files in Gfarm file system without any modification
• Easy application deployment
  – Install application in Gfarm file system, run everywhere
    • It supports binary execution and shared library loading
    • Different kinds of binaries can be stored at the same pathname, which will be automatically selected depending on client architecture
• Fault tolerance
  – Automatic selection of file replicas at access time tolerates disk and network failure
• File sharing – Community Software Area
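Replica selection at access time can be illustrated with a simple failover loop. This is a sketch of the general idea only, not Gfarm's implementation or API: `ReplicaUnavailable`, `read_with_failover`, the host names, and the `fake_fetch` helper are hypothetical.

```python
from typing import Callable, Dict, List


class ReplicaUnavailable(Exception):
    """Raised when a replica cannot be read (disk or network failure)."""


def read_with_failover(replicas: List[str], fetch: Callable[[str], bytes]) -> bytes:
    """Try each replica host in order and return the first successful read."""
    errors: List[str] = []
    for host in replicas:
        try:
            return fetch(host)
        except ReplicaUnavailable as exc:
            errors.append(f"{host}: {exc}")
    raise ReplicaUnavailable("all replicas failed: " + "; ".join(errors))


# Hypothetical replica state used to exercise the loop.
_STORE: Dict[str, bytes] = {"node-b": b"payload", "node-c": b"payload"}


def fake_fetch(host: str) -> bytes:
    if host not in _STORE:
        raise ReplicaUnavailable("host down")
    return _STORE[host]


data = read_with_failover(["node-a", "node-b", "node-c"], fake_fetch)
```

The application sees one successful read; the failed replica only shows up in the error list if every copy is unreachable.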


Performance Enhancements

Performance for small files
  – Improve meta-cache management
  – Add meta-cache server

Directory listing of 16,393 files:

Original    Improved metadata management    W/ metadata cache server
44.0        3.54                            1.69
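The relative improvement implied by these three figures can be worked out directly. The slide does not state the unit, so the sketch below only computes ratios, which do not depend on it; the dictionary keys and the `speedups` helper are names chosen here for illustration.

```python
from typing import Dict

# Figures as given on the slide for a directory listing of 16,393 files.
timings: Dict[str, float] = {
    "original": 44.0,
    "improved metadata management": 3.54,
    "with metadata cache server": 1.69,
}


def speedups(values: Dict[str, float], baseline: str = "original") -> Dict[str, float]:
    """Return how many times faster each configuration is than the baseline."""
    base = values[baseline]
    return {name: base / value for name, value in values.items()}


ratios = speedups(timings)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

Rounded to one decimal place, improved metadata management is about 12.4 times faster than the original, and adding the metadata cache server brings that to about 26.0 times.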


SCMSWeb
http://www.opensce.org/components/SCMSWeb

• Web-based monitoring system for clusters and grid
  – System usage
  – Performance metrics
• Reliability
  – Grid service monitoring
  – Spot problems at a glance


PRAGMA-Driven Development

• Heterogeneity
  – Add platform support
    • Solaris (CICESE, Mexico)
    • IA64 (CNIC, China)
• Software deployment
  – NPACI Rocks Roll
    • Support ROCKS 3.3.0 – 4.1
  – Native Linux RPM for various Linux platforms
• Enhancement
  – Hierarchical monitoring on large-scale Grid
  – Compress data exchange between Grid sites
    • For some sites with slow networks
  – Better and cleaner graphical user interfaces
• Standardize & more collaboration
  – GRMAP (Grid Resource Management & Account Project)
  – Collaboration between NTU and TNGC
  – GIN (Grid Interoperation Now) Monitoring – standardize data exchange between monitoring software


Multi-organisation Grid Accounting System
http://ntu-cg.ntu.edu.sg/pragma


MOGAS Web information

Information for grid resource managers/administrators:
  – Resource usage based on organization
  – Daily, weekly, monthly, yearly records
  – Resource usage based on project/individual/organisation
  – Individual log of jobs
  – Metering and charging tool, can decide a pricing system, e.g. Price = f(hardware specifications, software license, usage measurement)
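One possible shape for the pricing function Price = f(hardware specifications, software license, usage measurement) is sketched below. The slide does not define f, so this linear form is purely an assumption for illustration: `JobRecord`, its fields, the rates, and `price` are hypothetical and are not taken from MOGAS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    cpu_hours: float     # usage measurement
    cpu_rate: float      # assumed price per CPU-hour, standing in for hardware specification
    license_fee: float   # assumed flat software license charge per job


def price(job: JobRecord) -> float:
    """Assumed linear pricing: usage times hardware rate, plus license fee."""
    if job.cpu_hours < 0:
        raise ValueError("cpu_hours must be non-negative")
    return job.cpu_hours * job.cpu_rate + job.license_fee


example_charge = price(JobRecord(cpu_hours=10.0, cpu_rate=0.5, license_fee=2.0))
```

A site could substitute any other function of the same three inputs; the accounting records only need to carry enough detail to evaluate it.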


PRAGMA MOGAS status (27/3/2006)

Sites shown on the map:
• AIST, Japan
• CNIC, China
• KISTI, Korea
• ASCC, Taiwan
• NCHC, Taiwan
• UoHyd, India
• MU, Australia
• BII, Singapore
• KU, Thailand
• USM, Malaysia
• NCSA, USA
• SDSC, USA
• CICESE, Mexico
• UNAM, Mexico
• UChile, Chile
• TITECH, Japan
• MIMOS
• IOIT-HCM
• NGO, Singapore
• QUT

Legend: GT4, GT2

Cindy Zheng, GGF13, 3/14/05 modified by A/Prof. Bu-Sung Lee


Integrations and Collaborations

• Naregi-CA (AIST, Japan) and GAMA (SDSC, USA) integration
• Rocks (SDSC, USA) and SCE (KU, Thailand), Ninf-G (AIST), Gfarm (AIST), KISTI etc.
• PRAGMA and NLANR
• PRAGMA and GEON
  – PRAGMA grid testbed
  – UMC, SDSC (USA)
  – GSCAS, CNIC (China)
  – UoHyd (India)
  – AIST (Japan)
• PRAGMA and sensor networks
  – PRAGMA grid testbed
  – NCHC, Taiwan
  – Binghamton University, NY, USA



Grid Interoperation Now (GIN)

• GIN testbed (started Feb. 2006)
  – PRAGMA
  – TeraGrid
  – EGEE
• First application: TDDFT/Ninf-G
  – Lead: Yoshio Tanaka, Yusuke Tanimura (AIST)
  – Deployed and run
    • PRAGMA - AIST, NCSA, SDSC
    • TeraGrid – ANL
  – Working on deployment to EGEE – LCG
• Middleware interoperability problem
  – Assumptions by middleware about local architecture
  – Standard protocol


Lessons Learned, Issues and Work (1)

• Authentication
  – Users obtain initial access
    • Process documented by Cindy Zheng, http://pragma-goc.rocksclusters.org/gin/gin-egee.htm
    • Not easy, not simple
    • Need documentation to guide users
    • Develop software to simplify the process
  – DN incompatibility
    • Summarized by Oscar Koeroo, http://goc.pragma-grid.net/gin/Cert-probs-GIN.pdf
    • Commented by Charles Bacon (Globus), http://goc.pragma-grid.net/gin/DN_Charles-Bacon.htm
    • Need both standard and flexibility
    • VOMS server is modified to handle both styles of DN strings
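The slide does not say which two DN string styles were involved. As an illustration of accepting two styles, the sketch below assumes they are the slash-separated form (`/C=US/O=.../CN=...`) and the comma-separated form with reversed component order (`CN=...,O=...,C=US`). The functions `dn_to_components` and `same_subject` are hypothetical, are not part of VOMS, and do not handle escaped separators inside values.

```python
from typing import List, Tuple

Component = Tuple[str, str]


def dn_to_components(dn: str) -> List[Component]:
    """Parse a DN in either assumed style into (attribute, value) pairs, most general first."""
    dn = dn.strip()
    if dn.startswith("/"):
        # Slash style: components already run from most general to most specific.
        parts = [p for p in dn.split("/") if p]
    else:
        # Comma style: most specific component comes first, so reverse it.
        parts = [p.strip() for p in reversed(dn.split(","))]
    components: List[Component] = []
    for part in parts:
        key, _, value = part.partition("=")
        components.append((key.strip().upper(), value.strip()))
    return components


def same_subject(dn_a: str, dn_b: str) -> bool:
    """Return True when two DN strings name the same subject regardless of style."""
    return dn_to_components(dn_a) == dn_to_components(dn_b)


match = same_subject("/C=US/O=Example/CN=Jane Doe", "CN=Jane Doe, O=Example, C=US")
```

Comparing parsed components rather than raw strings is one way to get both a standard internal form and flexibility at the input.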


Lessons Learned, Issues and Work (2)

• Software stack and Community Software Area (CSA)
  – Software stack is different among grids. Problems with conflicting requirements.
    • CSA as a solution for users to deploy their sub-stack and share installed software
  – Near term - work on CSA within each grid
    • Gfarm-FUSE
  – Need focused discussion on solution for GIN


Lessons Learned, Issues and Work (3)

• Cross-grid monitoring
  – Summary by Somsak Sriprayoonsakul, http://goc.pragma-grid.net/gin/gin-monitor.htm
    • Get some monitoring software together, develop a common schema
  – Wiki - http://wiki.pragma-grid.net/index.php?title=GIN_%28Grid_Inter-operation_Now%29_Monitoring


Lessons Summary http://goc.pragma-grid.net/applications/tddft/Lessons.htm

• Problems and solutions
  – Information sharing (pragma-goc)
  – Trust and access (Naregi-CA, GAMA, myproxy)
  – Resource requirements (INCA)
  – User/application environment (Gfarm)
  – Job submission (Portal/service/middleware)
  – System/job monitoring (SCMSWeb, +)
  – Network monitoring (APAN, NLANR)
  – Resource/job accounting (SCMSWeb, NTU)
  – Fault tolerance (Ninf-G, Nimrod)
• Publications
  – Infrastructure, applications, software integration, organization


Thank You

Pointers

• PRAGMA: http://www.pragma-grid.net
• PRAGMA Testbed: http://goc.pragma-grid.net
• “PRAGMA: Example of Grass-Roots Grid Promoting Collaborative e-science Teams”, CTWatch, Vol 2, No. 1, Feb 2006
• “The PRAGMA testbed – Building a Multi-application International Grid”, CCGrid2006
• “Deploying Scientific Applications to the PRAGMA Grid Testbed: Strategies and Lessons”, CCGrid2006
• MOGAS: “Analysis of Job in a Multi-Organizational Grid Test-bed”, CCGrid2006