WGISS Workshop, September 12, 2006
Cyberinfrastructure for Research and Education and its challenges
Dr. Sebastien Goasguen
Dr. Carol Song
Rosen Center for Advanced Computing
Purdue University
[email protected]
http://www.rcac.purdue.edu
(Work presented here is supported by OCI-0438246 (NMI nanoHUB) and OCI-0503992 (TeraGrid RP).)
Highlights
• Infrastructure building
– Community clusters, cycle harvesting across campus, high-speed network links, storage capacity
– Data collections
• System interoperability
– Integrate computing infrastructures
– Integrate services
• Enabling multidisciplinary research and education
– Most design decisions are guided by this principle
Outline
• TeraGrid
– HPC through community resources and interoperability
• NanoHUB Science Gateway
– Online Simulations & Education
• Multidisciplinary Data Management
– Data source aggregation
– Data Management
– Workflow
• Seamless integration of grids and services
– Integration with Education (Sakai, podcast, Merlot)
– NanoHUB & TeraGrid, OSG
– TG & campus infrastructures
TeraGrid
• Grid Infrastructure Group (U Chicago)
– TG integration, planning, management and coordination
• Resource Partners
– 9 partners
– Provide system resources, user support
– Provide access to resources through policies, software and other mechanisms
• Individual PIs access TG high performance computing resources through unified user support, coordinated software and services, and extensive documentation and training.
TG Internals
Globus enabled resources
GT2 or GT4 WSRF
ssh using std unix practices
Globus submit
Condor-g submit
Wrapper scripts to get work done (compute, store, move data)
PBS, LSF, etc.: local batch systems and schedulers
Condor talks to Globus, which talks to the scheduler
CTSS software Stack: HPC + Grid
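The submission chain on this slide (Condor-G hands a job to Globus, which hands it to the local batch scheduler) can be sketched as a small Python helper that writes a Condor-G submit description. This is only a sketch under stated assumptions: the helper name, the gatekeeper host `tg-login.example.org`, and the script `run_sim.sh` are hypothetical, and the `universe = grid` and `grid_resource = gt2 <host>/jobmanager-<scheduler>` lines follow the usual Condor grid-universe form for pre-WS (GT2) resources.

```python
def condor_g_submit_description(gatekeeper, scheduler, executable, arguments=""):
    """Return the text of a Condor-G submit description (hypothetical helper)."""
    lines = [
        "universe = grid",
        # gt2 selects pre-WS Globus GRAM; the jobmanager suffix names the
        # local batch system (for example pbs or lsf) behind the gatekeeper.
        f"grid_resource = gt2 {gatekeeper}/jobmanager-{scheduler}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host and script names, for illustration only.
    print(condor_g_submit_description("tg-login.example.org", "pbs", "run_sim.sh"))
```

The generated text would normally be saved to a file and passed to `condor_submit`; the wrapper scripts mentioned on the slide presumably hide this step from the user.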
“easy” access
Remote access to simulators and compute power
[Diagram labels: Browser (VNC); internet; nanoHUB.org Web site; nanoHUB infrastructure; Physical Machine; Virtual Machine; NMI Cluster; Condor-G; Globus; Cluster; TeraGrid]
nanoHUB – Sakai Integration for Assessment Services
• Assessment of learning impact is a key metric
• Sakai – Service-oriented Assessment Service Integration
SAKAI Integration Architecture
WS Client
sakai_mambo.php
SAF Kernel
SAF Common Services
Application Services
Tool Code
Tool Layout
Presentation
Service Interface (i.e. API)
Axis
WS End Point
Framework
Application
Web Svcs
SAKAI
nanoHUB
Learning Module (three instances)
Session based launch
SakaiLogin.jws
SakaiSite.jws
nanoHUB Internals
Delegated trust
Local Virtual Machines
Migratable
Isolated from Local infrastructure
VIOLIN Virtual Cluster
Virtual Infrastructure over WAN
Purdue TG Data Management System
TeraGrid network
- Provides HPC, storage resources
Multidisciplinary scientific data
- Remote sensing, weather, modeling data
SRB middleware system developed at SDSC
- Provides distributed data management
- Logical and System Attributes
Server-side data processing tools
- OPeNDAP/THREDDS data server
Web Services interface
- File query, File listing, Metadata query, File download
Purdue TG data portal
- JSR-168 compliant portlets, based on Gridsphere
- Uses SRB Jargon API for data access
LARS Dataset (Laboratory for Applications of Remote Sensing)
• Multispectral and Hyperspectral remote sensing images for Indiana
• ERDAS LAN, Leica Geosystems Imagine, GeoTIFF, and HDF formats
• 1972 to 2004
• IndianaView Glovis web access
– Part of the AmericaView initiative
– Funded through USGS
– Graphical Interface for viewing and downloading remote sensing image data
– http://indianaview.envision.purdue.edu/glovis/index.htm
PTO Satellite Data (Purdue Terrestrial Observatory)
• GOES-GVAR sensor (L band), 3.7 m fixed antenna, Feb. 2005.
• Terra-MODIS, Aqua-MODIS, NOAA-AVHRR and FY1-MVISR sensors (L- and X-band), 4.27 m tracking antenna, April 2006.
• 10-node cluster data processing and visualization server; more than 25 different products.
National Weather Service Data
• Next Generation Radar (NEXRAD) Level II data
• 159 Weather Surveillance Radar-1988 Doppler (WSR-88D) sites
• Real-time streaming, high-resolution data from the national network
• Reflectivity, mean radial velocity, and spectrum width
• One of the four top-level distributors
• THREDDS/OPeNDAP data servers
CCSM Climate Simulation Data
• Community Climate System Model (CCSM) to simulate climate change on Earth
• Ocean, Land, and Atmospheric models
• NetCDF format
• OPeNDAP server provides post-processing functionalities
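One server-side operation an OPeNDAP server commonly offers is subsetting: the client appends a constraint expression to the dataset URL and receives only the requested slab of a variable. The sketch below builds such a DAP2 request URL. The host `data.example.org`, the dataset path, and the variable name `TS` are hypothetical, and the helper is illustrative rather than part of the Purdue system.

```python
def opendap_subset_url(dataset_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL for a hyperslab of one variable.

    slices is a list of (start, stride, stop) tuples, one per dimension.
    In DAP2 constraint expressions the stop index is inclusive.
    """
    if response not in {"dds", "das", "dods", "ascii"}:
        raise ValueError(f"unsupported response type: {response}")
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


if __name__ == "__main__":
    # Hypothetical dataset: 12 time steps of a small lat/lon window.
    url = opendap_subset_url(
        "http://data.example.org/opendap/ccsm/run01.nc",
        "TS",
        [(0, 1, 11), (10, 1, 20), (30, 1, 40)],
    )
    print(url)
    # http://data.example.org/opendap/ccsm/run01.nc.ascii?TS[0:1:11][10:1:20][30:1:40]
```

Because the slab is cut on the server, only the selected values cross the network, which matters for large climate model output.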
Architecture
1. Data Capture
– Commercial vendor HW, SW
– Data drivers to
• Harvest and register metadata
• Ingest data to SRB server
• Normalize application data to standards
2. SRB (Storage Resource Broker) - SDSC
– Stores data in logical collections, associated with metadata
– Stores raw and processed data for access
– Metadata catalog (MCAT) at SDSC, data servers at Purdue
3. Application layer: Integrates applications for enhanced data access
– THREDDS (Thematic Real-time Environmental Distributed Data Services) for Doppler radar data
– OPeNDAP (Open-source Project for a Network Data Access Protocol) for climate modeling data
4. Presentation layer
• Gridsphere based portlets: browse, search, download data.
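The first two layers above (data drivers that register metadata and ingest data, and logical collections that pair data with metadata) can be illustrated with a toy in-memory model. This is not the SRB or MCAT API: the class, its methods, and the object names are all hypothetical, and the sketch only shows the idea of querying a collection by metadata rather than by physical location.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalCollection:
    """In-memory stand-in for a logical collection plus its metadata catalog."""

    name: str
    objects: dict = field(default_factory=dict)   # logical name -> data bytes
    metadata: dict = field(default_factory=dict)  # logical name -> attributes

    def ingest(self, logical_name, payload, **attributes):
        """Store the data and register its metadata in one step."""
        self.objects[logical_name] = payload
        self.metadata[logical_name] = dict(attributes)

    def query(self, **criteria):
        """Return the logical names whose metadata matches every criterion."""
        return sorted(
            logical_name
            for logical_name, attrs in self.metadata.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        )


if __name__ == "__main__":
    # Illustrative object names and attributes only.
    radar = LogicalCollection("nexrad")
    radar.ingest("KIND_20060912_0000", b"...", site="KIND", product="reflectivity")
    radar.ingest("KLOT_20060912_0000", b"...", site="KLOT", product="reflectivity")
    print(radar.query(site="KIND"))  # ['KIND_20060912_0000']
```

In the architecture on the slide, the catalog and the data live on different servers (MCAT at SDSC, data servers at Purdue); the toy model keeps both in one object for brevity.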
Data Access
• Command line (SRB S-commands)
– Sinit, Sls, Sget, Sexit
• Web Interface: MySRB
• Windows GUI Client: inQ
• OPeNDAP/THREDDS clients
• Purdue Environmental Data Portal
• Web Services
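A typical command-line session with the four S-commands listed above opens a session, lists a collection, fetches an object, and closes the session. The sketch below only assembles that command sequence and prints it; nothing is executed. The collection path and file names are hypothetical, and the argument order for `Sget` (source object, then local target) is assumed from common usage.

```python
import shlex


def srb_download_commands(collection, object_name, local_path):
    """Return the S-command sequence for one download (nothing is executed)."""
    return [
        ["Sinit"],                                             # start an SRB session
        ["Sls", collection],                                   # list the collection
        ["Sget", f"{collection}/{object_name}", local_path],   # copy to local disk
        ["Sexit"],                                             # end the session
    ]


if __name__ == "__main__":
    # Hypothetical collection and file names.
    for argv in srb_download_commands(
        "/home/demo.purdue/lars", "scene.tif", "./scene.tif"
    ):
        print(shlex.join(argv))
```

On a machine with the SRB client installed, each list could be passed to `subprocess.run` in turn; keeping the commands as argument lists avoids shell quoting problems with unusual file names.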
Security Challenges of Interoperable Grid Infrastructure
[Diagram labels: Services (three instances); SOA; Certificate Delegation; Trust level]
SOA with Authorization
[Diagram labels: Services (two instances); Certificate Delegation; Policies; Trust level; Attribute Server; Authorization Policy]
Shibboleth IdP
Exec(Condor_Submit)
Mambo
Apache Web server
Username, Password
Back end
PHP scripting
LDAP
SAML authentication assertion
Front end
nanoShib
Attribute request SAML assertion
(6) Attributes request
Globus Gatekeeper
<SAML> grid_proxy_init
nanoHUB Community Credential
Username + Shibboleth IdP Id
Policy Information Point
SAML-enabled attribute handlers for GT4
- extract SAML assertion from proxy
- query Shib AA based on SAML assertion from proxy
- render access control decision based on attributes from Shib AA
TG RP
AA
Virtual Machine
(1)
(7) SAML authorization assertion
(5) Globus request
(2) (3)
(4) Attribute-based policies
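The last handler step on this slide, rendering an access control decision from attributes returned by the Shibboleth attribute authority (AA), can be sketched as a plain attribute-matching function. This is a minimal illustration, not the GT4 handler code: the function name, the attribute names (`affiliation`, `gateway`), and the policy format are all hypothetical.

```python
def access_decision(attributes, policy):
    """Permit only if every attribute the policy names has an allowed value.

    attributes: dict of attribute name -> set of values returned by the AA
    policy:     dict of attribute name -> set of acceptable values
    """
    for name, allowed in policy.items():
        # A missing attribute is treated as an empty set, so it can never match.
        if not (attributes.get(name, set()) & allowed):
            return "Deny"
    return "Permit"


if __name__ == "__main__":
    # Hypothetical policy at the resource provider and attributes for one user.
    policy = {"affiliation": {"faculty", "student"}, "gateway": {"nanoHUB"}}
    user = {"affiliation": {"student"}, "gateway": {"nanoHUB"}}
    print(access_decision(user, policy))  # Permit
```

The point of the design on the slide is that the resource provider decides on attributes rather than on individual identities, so a community credential can be shared while policy still distinguishes classes of users.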