Introduction to CNGrid GOS 3.0
OMII-Euro & CNGrid Joint Training Material
刘杰 (Liu Jie)
Jan. 11, 2008
2
Outline
CNGrid snapshot
Motivation
Architecture
Components
– Core layer
– HPCG
Summary
3
CNGrid snapshot
Project Background
– CNGrid (China National Grid)
– CNGrid GOS 2.0
  • Sponsored by China Ministry of Science and Technology (2002~2005), the tenth five-year plan
– CNGrid GOS 3.0
  • Sponsored by China Ministry of Science and Technology (2006~2009), the eleventh five-year plan
  • ICT CAS, Tsinghua U, Beihang U, etc.
4
CNGrid snapshot
5
CNGrid snapshot
International cooperation
– OMII_EU/OMII_UK
  • Provide software suite
  • Integrated into OMII software stack
  • Use OMII leading technology in CNGrid
– XtreemOS
  • Building and Promoting a Linux-based Operating System to Support Virtual Organizations for Next Generation Grids
  • WP2.1 Virtual Organization support in Linux
  • WP3.5 Security in Virtual Organizations
6
Motivation
Why CNGrid GOS?
– Need for Internet-based grid system software
  • Manage large-scale distributed resources effectively
  • Provide a uniform approach to accessing the heterogeneous resources in the grid
  • Enable Internet-based resource sharing and collaboration
– Need for an easy-to-use grid
  • Low cost: hide the internal details of grid application development, deployment, management and use
  • Multiple access modes:
    – Client/Server, Browser/Server and other modes
    – Batch mode and interactive mode
7
Motivation
Goals
– Develop a virtualized resource sharing mechanism and framework for computing, data, software and combined resources
– Provide secure, unified and friendly interfaces for accessing scientific computing and information services
– Support multiple domain-specific applications running on top of the above
8
CNGrid GOS 3.0 Architecture
[Figure: CNGrid GOS 3.0 architecture diagram. Layer labels: Hosting Environment, Core, System, Tool/App. The component labels recovered from the diagram are grouped by topic below; their exact placement within the layers is not recoverable from the transcript.]
– Hosting and platform: Tomcat(Apache)+Axis, GT4, gLite, OMII; Java J2SE; other 3rd-party software & tools
– Services: Dynamic Deploy Service, CA Service, Message Service, Naming, Security, Res AC & Sharing, Resource Space, System Mgmt Portal
– Agora: User Mgmt, Res Mgmt, Agora Mgmt
– Grip: Grip Runtime, Grip Instance Mgmt
– RController: ServiceController, Other RController
– HPCG: BatchJob mgmt, MetaSchedule, Account mgmt, File mgmt, metainfo mgmt, HPCG Portal, Batch mgmt portal
– Programming interfaces: GOS System Call (Resource mgmt, Agora mgmt, User mgmt, Grip mgmt, etc.), GOS Library (Batch, Message, File, etc.)
– Tools and environments: Gsh & cmd tools, VegaSSH, GSML Browser, GSML Composer, Workflow IDE, IDE, Compiler, Debugger, Programming Env., DataGrid Using Env., Workflow Using Env., Running & Mgmt Center
– Applications: GridWorkflow, DataGrid, Science Data Grid, Railway Info Process Grid, other applications
– Work package labels: WP2, WP6, Other WPs
[Figure: software stack on a grid server, top to bottom.]
– Grid Portal, Gsh, GSML Workshop and Grid Apps
– Core, System and App Level Services
– Axis Handlers for Message Level Security
– Tomcat (5.0.28) + Axis (1.2 rc2)
– J2SE (1.4.2_07, 1.5.0_07)
– OS (Linux/Unix/Windows)
– PC Server (Grid Server)
9
Components overview
Components
– Core layer
– HPCG (High Performance Computing Gateway)
  • Deployment
  • Management
  • Usage: Job, File & Accounting Mgmt
  • Application Development
10
Components: System software
Core layer
– Agora service (aka VO)
  • Organize and manage related users and resources locally
  • Serve as a trusted third party for resource providers and consumers to negotiate sharing policies
  • Provide user mgmt, resource mgmt and agora mgmt functions based on the underlying Naming layer
    – A resilient, decentralized registry for various kinds of global objects
    – Provide low-latency object location by object GUID
    – Provide high-success-rate search by matching multiple attributes
    – Provide a stable object view based on linked naming services to enable the effective-virtual-physical address space
  • Use RController to provide a uniform resource provision and management interface
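The two lookup styles named for the Naming layer (locating an object by GUID, and searching by several attributes at once) can be illustrated with a small in-memory registry. This is only a sketch: `NamingRegistry`, `GlobalObject` and the attribute names are hypothetical and are not the GOS Naming API, and a real decentralized registry would spread this data over several linked naming services.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class GlobalObject:
    """A registered object; `attributes` holds free-form key/value metadata."""
    guid: str
    attributes: dict = field(default_factory=dict)


class NamingRegistry:
    """Toy in-memory registry (hypothetical, not the GOS Naming API)."""

    def __init__(self):
        self._by_guid = {}

    def register(self, attributes):
        """Store an object under a fresh GUID and return that GUID."""
        guid = str(uuid.uuid4())
        self._by_guid[guid] = GlobalObject(guid, dict(attributes))
        return guid

    def locate(self, guid):
        """Dictionary lookup by GUID; returns None when the GUID is unknown."""
        return self._by_guid.get(guid)

    def search(self, **wanted):
        """Return every object whose attributes match all given pairs."""
        return [
            obj for obj in self._by_guid.values()
            if all(obj.attributes.get(k) == v for k, v in wanted.items())
        ]


registry = NamingRegistry()
cluster_guid = registry.register({"kind": "cluster", "site": "siteA", "scheduler": "PBS"})
registry.register({"kind": "cluster", "site": "siteB", "scheduler": "LSF"})
registry.register({"kind": "user", "site": "siteA"})

# Multi-attribute search: only objects matching every pair are returned.
pbs_clusters = registry.search(kind="cluster", scheduler="PBS")
```

The GUID lookup is a single dictionary access, while the attribute search scans all entries; a production registry would presumably index attributes instead.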
11
Components: System software
Core layer
– Grip
  • Runtime abstraction: a grip is a single run of an application
  • Create grips to run applications in a managed way, interact with an existing grip, kill a grip and release the resources it consumes automatically
12
Components: HPCG
HPCG motivation
– Aim to provide a high-performance business computing environment for enterprise users
– Features
  • Easy to install, configure and use
  • Provide the functions users really need
  • High reliability
  • Professional interface
  • Based on GOS, but easy to port to other grid middleware
  • Standards compliant
    – JSDL (Job Submission Description Language)
    – BES (OGSA Basic Execution Service)
    – SAGA (A Simple API for Grid Applications)
    – SOA and plain Web services (WS-related standards)
    – RUS: Resource Usage Service, based on WS-I Basic Profile 1.0
13
HPCG Components
[Figure: HPCG component diagram. Recovered labels: Portal, Mgmt Portal, CML tools, HPCG Client, HPCG Server, Metainfo Mgmt, Static metainfo mgmt, Dynamic metainfo mgmt, File mgmt, Message, Environment abstraction, User, Exception, Security, Account mgmt, Batch job mgmt, Meta schedule, Database.]
14
Scenarios of HPCG
[Figure: HPCG deployment scenario. Recovered labels: Enterprise user, Enterprise Intranet, Internet, HPC gateway server, Cluster, GOS, Grid Site, Grid Site (Grid Operation & Mgmt Center), Message Subscribe/Notification.]
Requirements for the high performance computation gateway
– Uniform Web UI for HPC users and resource providers
– Many enterprise users share one HPC account
– Transparent job submission to different HPC systems
– Efficient job status acquisition
– File transport without relay
– Computation resource accounting
15
HPCG - Deploy
Several deploy styles
– Front-end and back-end
– All vs. split
– Relationship with clusters
  • Deploy in clusters
  • Deploy on a machine outside the clusters
16
HPCG - Deploy
Prerequisites
– Software
  • JDK 1.5
  • Ant 1.6.5 or above
  • MySQL 1.4.12 or above
  • Standard FTP server
  • OpenPBS (PBSPro or Torque), LSF, etc.
– Hardware
  • CPU: P4 2.4G
  • Memory: 4GB (at least 2GB)
  • Disk space: 160GB (at least 80GB)
– Network
  • Two network cards
  • FTP port: 21
  • SSH port: 22
  • HTTP ports: 8080, 18080
  • Message port: 61616
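Before installing, it can help to confirm that the listed ports are reachable on the gateway host. The sketch below is not part of HPCG; `check_ports` and the port labels are made up for illustration, and only the port numbers come from the prerequisite list. It tries a plain TCP connection to each port and reports which ones answer.

```python
import socket

# Port numbers copied from the prerequisite list; the labels are only descriptive.
REQUIRED_PORTS = {"ftp": 21, "ssh": 22, "http": 8080, "http-alt": 18080, "message": 61616}


def check_ports(host, ports=REQUIRED_PORTS, timeout=2.0, connect=socket.create_connection):
    """Return {label: bool} telling whether a TCP connection to each port succeeds.

    `connect` is injectable so the function can be exercised without a network.
    """
    results = {}
    for label, port in ports.items():
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Covers refused connections, timeouts and unreachable hosts.
            results[label] = False
        else:
            conn.close()
            results[label] = True
    return results
```

A call such as `check_ports("gateway.example.org")` (a placeholder host name) would return a dictionary with one entry per port. A successful connection only shows that something is listening; it does not verify that the expected service is behind the port.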
17
HPCG - Portal
HPCG Management portal
– Manage all meta-info, such as cluster info, job queue info, user mapping, software type, software instance, etc.
HPCG Application portal
– Lets end users submit and manage jobs, manage temp files and output files, query historical accounting info, etc.
18
HPCG Management
Several kinds of static meta-info
– Mapping of grid users to local cluster users
– Cluster meta-info
– Software type info
– Software instance info
– Job queue info
Dynamic meta-info
– The pending job length of each job queue
– The available license count
Support scheduling
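The slide does not say how the dynamic meta-info feeds into scheduling, so the following is only one plausible policy, written under that assumption: among the queues that still have a free software license, choose the one with the fewest pending jobs. `QueueInfo`, `pick_queue` and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueInfo:
    """Snapshot of one job queue (field names are illustrative)."""
    cluster: str
    queue: str
    pending_jobs: int    # dynamic meta-info: pending job length
    free_licenses: int   # dynamic meta-info: available license count


def pick_queue(queues):
    """Pick the queue with the fewest pending jobs among those with a free license.

    Returns None when no queue has a license available.
    """
    candidates = [q for q in queues if q.free_licenses > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q.pending_jobs)


queues = [
    QueueInfo("clusterA", "short", pending_jobs=12, free_licenses=3),
    QueueInfo("clusterB", "short", pending_jobs=2, free_licenses=0),
    QueueInfo("clusterC", "long", pending_jobs=5, free_licenses=1),
]
chosen = pick_queue(queues)
```

Here `clusterB` has the shortest queue but no license, so the choice falls on `clusterC`.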
19
HPCG - Management
20
HPCG - Application portal
Batch job management
– Submit job
– Manage job
File management
Accounting management
21
HPCG - Batch Job mgmt
Submit jobs to the grid and schedule them among multiple HPC sites
Monitor detailed job status
Cancel or rerun jobs
Query historical job information
Subscribe to and get notified of job status changes
Support both the JSDL and BES standards
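To give a feel for what a JSDL job description looks like, the sketch below builds a minimal one with Python's standard `xml.etree.ElementTree`. The element names and namespaces follow the JSDL 1.0 specification and its POSIX application extension; the helper `build_jsdl`, the job name and the executable are illustrative, and nothing here shows how HPCG itself generates or consumes JSDL.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by JSDL 1.0 and its POSIX application extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def build_jsdl(job_name, executable, arguments=(), output=None):
    """Return a minimal JSDL document (as a string) for one POSIX job."""
    root = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(root, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = arg
    if output is not None:
        ET.SubElement(posix, f"{{{POSIX}}}Output").text = output
    return ET.tostring(root, encoding="unicode")


# Example values are placeholders.
doc = build_jsdl("demo-job", "/bin/echo", ["hello"], output="stdout.txt")
```

A real submission would normally add resource requirements and data staging elements, which are omitted here to keep the example short.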
22
Batch Job management: Job status transform diagram
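[Figure: job status transition diagram.]
States, in the order the labels suggest: Submitted, Staging In, Staged In, Active:Queuing, Active:Running, Executed, Staging Out, Staged Out, Done
Other states and transition labels: Active:Suspended (Suspend), Failed (fail), Terminated (terminate)
Re-run appears three times in the diagram; the states it starts from are not recoverable from the transcript.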
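The states named in the diagram can be modelled as a small state machine. The arrows did not survive in the transcript, so the transition table below is an assumption: a linear main path from Submitted to Done, suspension only from the running state, failure or termination from any non-final state, and Re-run leading from Done, Failed or Terminated back to Submitted. The `Job` class is hypothetical.

```python
MAIN_PATH = [
    "Submitted", "Staging In", "Staged In", "Active:Queuing",
    "Active:Running", "Executed", "Staging Out", "Staged Out", "Done",
]
FINAL_STATES = {"Done", "Failed", "Terminated"}

# Assumed transition table: each state maps to the states it may move to.
TRANSITIONS = {a: {b} for a, b in zip(MAIN_PATH, MAIN_PATH[1:])}
TRANSITIONS["Active:Running"].add("Active:Suspended")
TRANSITIONS["Active:Suspended"] = {"Active:Running"}
for state in list(TRANSITIONS):
    TRANSITIONS[state] |= {"Failed", "Terminated"}  # fail / terminate
for state in FINAL_STATES:
    TRANSITIONS[state] = {"Submitted"}              # Re-run (assumed target)


class Job:
    """Tracks the current state and rejects transitions not in the table."""

    def __init__(self):
        self.state = "Submitted"
        self.history = ["Submitted"]

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        self.history.append(new_state)


job = Job()
for next_state in MAIN_PATH[1:]:
    job.move_to(next_state)   # walk the main path to Done
job.move_to("Submitted")      # Re-run
```

Keeping the table as data makes it easy to correct once the real arrows of the diagram are known.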
23
HPCG - Batch job mgmt
24
HPCG - Batch job mgmt
25
HPCG - File mgmt
View, create and delete working directories on the computation node
Zip and tar support for multiple output files
Reliable transfer of big files (about 2GB) between the gateway server and the working directory
View text files (<0.5MB) and pictures in the working directory with a web browser
Support for multiple FTP servers (wuftp, vsftp), with IPv6 support
Pause and resume of file transfers
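Pause and resume of a transfer usually comes down to remembering how many bytes have already arrived and continuing from that offset. The sketch below shows the idea on local files only; `resume_copy` is a hypothetical helper, not HPCG code. Over FTP the same role is played by the protocol's restart offset, which Python's `ftplib` exposes as the `rest` argument of `FTP.retrbinary`.

```python
import os
import tempfile


def resume_copy(src_path, dst_path, chunk_size=1 << 20, max_chunks=None):
    """Copy src to dst, continuing from however many bytes dst already holds.

    `max_chunks` limits the work done in one call, standing in for a user
    pressing "pause". Returns True once dst is as long as src.
    """
    offset = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)  # skip what was already transferred
        while max_chunks is None or copied < max_chunks:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += 1
    return os.path.getsize(dst_path) == os.path.getsize(src_path)


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "result.dat")
    dst = os.path.join(workdir, "result.copy")
    with open(src, "wb") as f:
        f.write(b"0123456789")
    first_pass_done = resume_copy(src, dst, chunk_size=4, max_chunks=1)  # "paused" after 4 bytes
    second_pass_done = resume_copy(src, dst, chunk_size=4)               # resumed to the end
    with open(dst, "rb") as f:
        copied_bytes = f.read()
```

The sketch trusts the partial file as it is; a reliable transfer of multi-gigabyte files would also want a checksum comparison at the end.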
26
HPCG - File mgmt
27
HPCG - File mgmt
28
HPCG - Accounting mgmt
Accounting info covers jobs from both grid users and local users
Standard Usage Record format
Services to query, add, remove, update and aggregate both local and global accounting info, with ACL
Global accounting statistics
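Accounting statistics of this kind are aggregations over usage records. The sketch below sums CPU time per user and per site over a few made-up records. The dictionary fields are hypothetical and much simpler than a real Usage Record, and `total_cpu_by` is not an HPCG or RUS function.

```python
from collections import defaultdict

# Hypothetical, simplified records; a real Usage Record carries more fields.
records = [
    {"user": "alice", "site": "siteA", "source": "grid", "cpu_seconds": 1200},
    {"user": "alice", "site": "siteB", "source": "grid", "cpu_seconds": 300},
    {"user": "bob", "site": "siteA", "source": "local", "cpu_seconds": 450},
]


def total_cpu_by(records, key):
    """Sum cpu_seconds over all records, grouped by the given field."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["cpu_seconds"]
    return dict(totals)


per_user = total_cpu_by(records, "user")  # global view across sites
per_site = total_cpu_by(records, "site")  # per-site view
```

Grouping by `"source"` instead would separate grid-submitted jobs from locally submitted ones.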
29
HPCG - Account mgmt
30
HPCG - Development
HPCG Template
– Function
  • Describe the common logic for submitting jobs
  • Independent of the grid site
  • Every software package should have at least one Template
– Form
  • XML file
31
HPCG - Development
Schema of HPCG Template
32
HPCG - Development
Benefits of the HPCG Template
– Easy to develop (no need to know the GOS APIs)
– Easy to share the Template
– Shields the heterogeneity of the resources
– Global job scheduling
– Sharing of software licenses
33
Summary
Summary of CNGrid GOS 3.0
– A software suite that supports multiple domain applications and enables resource sharing among HPC sites
– Major components: System software, HPCG
– Other components: Programming & using environment, Grid workflow and Data Grid
Time schedule
– 2008.1 release of CNGrid GOS 3.0
– 2008.2 deployed on CNGrid
34
Thanks!