
23rd. June 2003 JJB, GAE Workshop 1

GAE (Grid Analysis Environment)

Overview of Caltech effort

Slides for the Caltech GAE Workshop June 2003

Overview

• GAE crucial for LHC experiments
  – Utility of Grids proven for production
  – Their use for Analysis will be the Acid Test of Grids
  – Large, Diverse, Distributed community of users
  – Support for hundreds/thousands of analysis tasks
  – Widely varying requirements
  – Need for Priority Schemes, robust authentication and security
  – Operation in a severely resource-limited and constrained global system

• GAE is where the physics gets done
  – Where physicists learn to collaborate on analysis at a distance


Scope

• Diagram shows “snapshot” in time of analysis activities

• Groups of individuals, geographically separated, work on specific analysis topics (e.g. Supersymmetry)

• Resources in the Grid system are shared between the groups

• Boundaries enclosing the groups move and change shape as the composition or requirements of the groups change

Architecture

• Several candidate computing system architectures have been proposed to support GAE

• At Caltech we have defined the “CAIGEE” Architecture, in collaboration with UCSD, UCR, FNAL and UCD

• Our work is focussed on developing critical missing components of the CAIGEE architecture, creating demonstration-grade applications to determine its validity, and working with other groups on integration of existing software into the CAIGEE scheme


CAIGEE Architecture

CAIGEE (continued)

• Based on the use of Web Services or Portals to provide heterogeneous clients access to analysis tools and data
  – An important feature is support for even semi-infinitely thin clients, such as PDAs with very limited CPU/Memory

• Grid Authentication and transport built in – mediates client/service (portal) traffic

Web Services

• Data/Processing services offered via the Web

• Widely adopted in the commercial world
  – Good tools, de facto standard protocols, support etc.

• We have been confirming their usefulness for scientific data and services
  – Access to RDBMS-resident Tags and nTuples (Oracle, SQLServer, PostgreSQL)
  – Access to ROOT files
  – Access to Objectivity object collections

• To do this, we have updated existing tools to “talk” with Web Services:
  – ROOT
  – COJAC (3D event viewer)
  – Others
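As a rough illustration of what "talking" to a Web Service involves on the client side, the sketch below builds a SOAP 1.1 request envelope using only the Python standard library. The operation name `getTagRows`, its parameters and the `urn:example:tags` namespace are hypothetical; the slides do not describe the actual service interfaces.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace for an imaginary tag-access service.
SERVICE_NS = "urn:example:tags"


def build_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical call: ask for up to 100 tag rows from a dataset named "run_b".
request = build_request("getTagRows", {"dataset": "run_b", "maxRows": 100})
print(request.decode("utf-8"))
```

A real client would POST these bytes over http/https to the service endpoint and parse the response envelope in the same way.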

Web Services - Principles

• Publish makes the service description publicly available.
  – WSDL (Web Services Description Language) is the language used to create the service description.

• Find discovers the web service.
  – UDDI (Universal Description, Discovery and Integration) is the directory technology used by service registries. The registries contain descriptions of web services, and support lookup.

• Bind allows the service to be used by the client.
  – SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.

[Diagram: Service Provider, Service Registry and Service Requestor, with numbered steps 1 Publish, 2 Find, 3 Bind]
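The publish/find/bind cycle can be sketched in a few lines of Python. This is a toy in-memory model, not real WSDL, UDDI or SOAP: `ServiceDescription` stands in for a WSDL document, `ServiceRegistry` for a UDDI registry, and a direct function call for the SOAP exchange. All class names, the `count_tag_rows` operation and its sample data are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Stand-in for a WSDL document: a name, keywords, and a callable endpoint."""
    name: str
    keywords: List[str]
    endpoint: Callable[..., object]


class ServiceRegistry:
    """Stand-in for a UDDI registry: stores descriptions and supports lookup."""

    def __init__(self) -> None:
        self._services: Dict[str, ServiceDescription] = {}

    def publish(self, description: ServiceDescription) -> None:
        # Step 1 (Publish): the provider makes its description available.
        self._services[description.name] = description

    def find(self, keyword: str) -> List[ServiceDescription]:
        # Step 2 (Find): the requestor looks up services by keyword.
        return [d for d in self._services.values() if keyword in d.keywords]


def count_tag_rows(dataset: str) -> int:
    """Hypothetical provider operation; the row counts are made-up sample data."""
    sample_counts = {"run_a": 3, "run_b": 5}
    return sample_counts.get(dataset, 0)


registry = ServiceRegistry()
registry.publish(ServiceDescription("TagCounter", ["tags", "rdbms"], count_tag_rows))

# Step 3 (Bind): the requestor invokes the endpoint named in the description.
matches = registry.find("tags")
result = matches[0].endpoint("run_b")
print(result)
```

The point of the separation is that the requestor never needs to know the provider in advance: it only needs the registry and a keyword.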

Web Services: Experimental Setup

[Diagram; recoverable labels listed below]

• Databases: two Oracle9i servers and one MS-SQL server, each holding data and metadata; a server with the Master Database; a server with the Materialized View Database; data replication through SSL

• Components: Java XML API to connect with the database server; Proxy Server; HTTP Server; SOAP Processor; WSDL file; UDDI Registry Node; UDDI SOAP request and response; client Web application to connect with the database, which binds with the provided service

• Grid layers:
  – Service Provider: available on the Fabric layer of the Grid
  – Service Requestor: available at the Connectivity and Resource layer of the Grid
  – Service Registry: provided at the authentication and security layer of the Grid


Example Web Services

GAE Tools (1) Clarens

• Our emphasis is on accommodating existing analysis tools in our CAIGEE architecture

• To facilitate this, we use the “Clarens Dataserver”

• Clarens is server software that makes datasets and services available to clients in a suitable lingua franca

• Clients initially Grid-authenticate with a Clarens server, and then are able to make use of a wide set of data and analysis services on offer

GAE Tools (2) Clarens

• Clarens uses an interpreted Python framework running inside Apache

• PKI security for CA certificates

• Commodity protocols (http/https) used to talk with clients

• Authorization of Web Service requests using hierarchical ACLs for Virtual Organisations
  – Distributed administration of VO/ACLs

• Creating new Clarens services is straightforward: this was one of the design goals.
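The slide does not describe how Clarens stores or evaluates its ACLs, so the following is only one plausible reading of "hierarchical ACLs": service methods have dotted names, an ACL attached to a parent level covers everything beneath it, and a more specific ACL overrides it. The group names, distinguished names and method names are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical VO membership: group name -> certificate distinguished names.
VO_GROUPS: Dict[str, Set[str]] = {
    "cms.analysis": {"/O=Example/CN=Alice"},
    "cms.admins": {"/O=Example/CN=Bob"},
}

# Hypothetical ACLs keyed by dotted service path -> groups allowed to call it.
ACLS: Dict[str, Set[str]] = {
    "file": {"cms.analysis", "cms.admins"},
    "file.delete": {"cms.admins"},
}


def allowed_groups(method: str) -> Optional[Set[str]]:
    """Return the ACL of the most specific path that covers `method`."""
    parts = method.split(".")
    while parts:
        acl = ACLS.get(".".join(parts))
        if acl is not None:
            return acl
        parts.pop()  # fall back to the parent level of the hierarchy
    return None


def is_authorized(dn: str, method: str) -> bool:
    """Allow the call only if the DN belongs to a group named in the covering ACL."""
    groups = allowed_groups(method)
    if groups is None:
        return False  # no ACL anywhere up the hierarchy: deny by default
    return any(dn in VO_GROUPS.get(group, set()) for group in groups)


print(is_authorized("/O=Example/CN=Alice", "file.read"))
print(is_authorized("/O=Example/CN=Alice", "file.delete"))
```

In this model, "distributed administration" would amount to letting different administrators own different subtrees of `ACLS` and `VO_GROUPS`.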

GAE Tools (4) Clarens

• Services include:
  – Access to SOCATS (next slide)
  – Storage Resource Broker interface
  – Application execution (submit jobs to cluster schedulers)
  – Proxy escrow
  – File access to files in server filesystem or SRB files

GAE Tools (5) SOCATS

• “STL Optimized Caching and Transport System”

• SOCATS is a general-purpose tool we have developed that is able to deliver large object collections (result sets) in response to an SQL query on an RDBMS

• Targeted at C++ clients who wish to send an SQL query to a remote RDBMS (using the Clarens dataserver) and receive back the database rows/result set as a collection of C++ objects

• Data delivered in binary format (avoids the heavy overhead of explicit XML encoding)

• Large result sets are streamed efficiently to the client, allowing client processing to begin as soon as the first data are available
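The idea of streaming a result set in binary form can be sketched as follows. This is not the SOCATS implementation or wire format, which the slides do not specify, and it is written in Python rather than C++ purely for brevity. An in-memory SQLite database stands in for the remote RDBMS, and the `tracks` table, the `Track` class and the fixed row layout are assumptions.

```python
import sqlite3
import struct
from dataclasses import dataclass
from typing import Iterable, Iterator

# Assumed row layout: little-endian 4-byte int followed by an 8-byte double.
ROW_FORMAT = "<id"


@dataclass
class Track:
    event_id: int
    pt: float


def stream_rows(conn: sqlite3.Connection, query: str, batch_rows: int = 2) -> Iterator[bytes]:
    """Server side: yield the result set as packed binary chunks, batch by batch."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_rows)
        if not rows:
            break
        yield b"".join(struct.pack(ROW_FORMAT, event_id, pt) for event_id, pt in rows)


def decode(chunks: Iterable[bytes]) -> Iterator[Track]:
    """Client side: turn each chunk into objects as soon as it arrives."""
    for chunk in chunks:
        for event_id, pt in struct.iter_unpack(ROW_FORMAT, chunk):
            yield Track(event_id, pt)


# Made-up sample data standing in for an RDBMS-resident table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (event_id INTEGER, pt REAL)")
conn.executemany("INSERT INTO tracks VALUES (?, ?)", [(i, i * 10.5) for i in range(1, 6)])

query = "SELECT event_id, pt FROM tracks ORDER BY event_id"
tracks = list(decode(stream_rows(conn, query)))
print(len(tracks), tracks[0])
```

Because both functions are generators, the client can start processing the first `Track` objects before the server has read the last rows, which is the behaviour the final bullet describes.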

GAE Tools (6) GroupMan

• Developed in response to need for user-friendly administration of LDAP based “Virtual Organisations”

• Import to the LDAP server of certificates from CA

• User-friendly GUI allows ad hoc creation of user groups and VOs

• VO data stored to allow easy extraction by standard Grid-based tools
  – E.g. creation of Globus gridmap files

• Part of the DPE distribution
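To make the gridmap example concrete: a Globus grid-mapfile maps a quoted certificate distinguished name to a local account, one entry per line. The sketch below generates such lines from VO membership held in plain dictionaries. It does not query LDAP and is not GroupMan code; the group names, DNs and account names are hypothetical.

```python
from typing import Dict, List


def gridmap_lines(vo_members: Dict[str, List[str]], local_accounts: Dict[str, str]) -> List[str]:
    """Emit one '"<DN>" <account>' line per VO member, grouped by VO group."""
    lines = []
    for group in sorted(vo_members):
        account = local_accounts[group]  # local account shared by the group
        for dn in sorted(vo_members[group]):
            lines.append(f'"{dn}" {account}')
    return lines


# Hypothetical VO data, as it might look after extraction from a directory.
vo_members = {
    "cms-analysis": ["/O=Example/CN=Carol", "/O=Example/CN=Alice"],
    "cms-admins": ["/O=Example/CN=Bob"],
}
local_accounts = {"cms-analysis": "cmsuser", "cms-admins": "cmsadmin"}

lines = gridmap_lines(vo_members, local_accounts)
print("\n".join(lines))
```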

GAE Tools (5) PDA Client

• A handheld GAE client: fruits of collaboration between NUST and Caltech

• Software is Java Analysis Studio (JAS) ported to the Pocket PC 2002 OS

• Hardware is any Pocket PC 2002 device

• This tool is still under development and currently lacks authentication/security components


GAE Tools (6) Collaboration Desktop

• Four-screen desktop analysis setup

• Driven by a single server and single graphics card

• Four flat panel monitors

• Allows simultaneous work on:
  – Traditional analysis tools (e.g. ROOT)
  – Software development (e.g. VS.NET)
  – Event displays (e.g. IGUANA)
  – MonALISA monitoring displays
  – Persistent collaboration (e.g. VRVS)
  – Online event or detector monitoring
  – Web browsing, email
  – Chat windows, instant messaging
  – Shared whiteboards etc.