USGS GRID Exploratory Status Review Stuart Doescher Mike Neiers USGS/EDC May 10 2004


Page 1

USGS GRID Exploratory
Status Review

Stuart Doescher, Mike Neiers
USGS/EDC
May 10, 2004

Page 2

GRID Exploratory

Preliminary investigation to explore the feasibility of using GRID technologies to improve a variety of business needs within the USGS and its communications with external users.

• The first focus area is the delivery and reception of data, which will primarily employ the GridFTP and certificate authority services.

• The second focus area will explore the utility of the GRID for scaling data sharing.

Page 3

Current Data Delivery – FTP Based

• Pull – semi-anonymous FTP
  – Product ready
  – Email sent to user with instructions and password
  – User connects via FTP as “anonymous” with the provided password
  – FTP daemon positions user in the appropriate directory
  – User pulls data

• Push – routine data flows to high-volume users
  – Account provided on remote system
  – When data is available, it is pushed to the remote system
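The pull steps on this slide can be sketched with Python's standard ftplib. This is only an illustration of the flow, not the EDC implementation: the host name, password, and file name are placeholders, and the ftp_factory parameter is a hypothetical hook added so the function can be exercised without a live server.

```python
from ftplib import FTP
from pathlib import Path


def pull_product(host, password, filename, dest_dir=".", ftp_factory=FTP):
    """Download one product file using the semi-anonymous pull flow.

    The user logs in as "anonymous" with the password from the notification
    email. No directory change is issued because, per the slide, the FTP
    daemon positions the user in the appropriate directory.
    """
    dest = Path(dest_dir) / filename
    ftp = ftp_factory(host)  # connect to the delivery server
    try:
        ftp.login("anonymous", password)
        with open(dest, "wb") as out:
            # RETR streams the file; each received block is written to disk.
            ftp.retrbinary(f"RETR {filename}", out.write)
    finally:
        ftp.quit()
    return dest


if __name__ == "__main__":
    # Placeholder values; a real order email would supply these.
    print(pull_product("ftp.example.org", "password-from-email", "scene.tar"))
```

The push case is the mirror image: the data provider logs in to an account on the customer's system and stores the file there.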

Page 4

Potential Future Data Delivery – GridFTP Based

• For routine multiple-usage customers
  – Establish “certificate process” with customer
    • Self-signed certificate authority
    • Customer generates private/public key pair
    • Generate user certificate with public key
    • Add user certificate to list of trusted users
  – Customer must install GridFTP client
    • Globus Toolkit data management client bundle
    • Gsincftp
    • Java Commodity Grid Kit for Windows
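The slide does not say how the "list of trusted users" is stored. In Globus Toolkit deployments this role is commonly played by a grid-mapfile, in which each line pairs a quoted certificate subject with a local account name. Assuming that format, a minimal sketch of reading such a list and checking a certificate subject against it might look like this (the subjects and account names are invented for illustration):

```python
import shlex


def parse_gridmap(text):
    """Return {certificate subject: local account} from grid-mapfile-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def is_trusted(subject, mapping):
    """True if the certificate subject appears in the trusted-user list."""
    return subject in mapping


if __name__ == "__main__":
    sample = '"/O=Grid/OU=Example/CN=Jane Doe" jdoe\n'
    users = parse_gridmap(sample)
    print(is_trusted("/O=Grid/OU=Example/CN=Jane Doe", users))
```

Adding a customer then amounts to appending one line with the subject of the newly generated user certificate.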

Page 5

Potential Future Data Delivery – GridFTP Based

• For routine multiple-usage customers
  – Pull
    • Product ready
    • Email notifies user that data is ready
    • User connects with GridFTP, is authenticated by the user certificate, and pulls the data
  – Push
    • Account provided on remote system with host certificate and our user certificate
    • These GRID certificates establish a Virtual Organization between the two parties
    • When data is available, GridFTP is used to push it to the remote system
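With the Globus Toolkit client installed, both the pull and the push cases reduce to a globus-url-copy call between a gsiftp:// URL and a file:// URL, run after the user has created a proxy credential (for example with grid-proxy-init). The sketch below only builds the command line; the host names and paths are placeholders, and the optional -p (parallel streams) setting is an illustrative choice rather than something the slides specify.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"gsiftp", "file"}


def gridftp_copy_command(src_url, dst_url, parallel_streams=None):
    """Build the argument list for a globus-url-copy transfer."""
    for url in (src_url, dst_url):
        if urlparse(url).scheme not in ALLOWED_SCHEMES:
            raise ValueError(f"unsupported URL scheme in {url!r}")
    cmd = ["globus-url-copy"]
    if parallel_streams:
        cmd += ["-p", str(parallel_streams)]  # options precede the two URLs
    cmd += [src_url, dst_url]
    return cmd


if __name__ == "__main__":
    # Pull: customer fetches a ready product from the data provider.
    print(gridftp_copy_command(
        "gsiftp://data.example.org/products/scene.tar",
        "file:///tmp/scene.tar"))
    # Push: provider sends a new product to the customer's system.
    print(gridftp_copy_command(
        "file:///products/scene.tar",
        "gsiftp://customer.example.org/incoming/scene.tar",
        parallel_streams=4))
```

The resulting list can be handed to subprocess.run on a machine where the client and a valid proxy credential are available.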

Page 6

Potential Future Data Delivery – GridFTP Based

For single-usage customers, the process below currently seems too complex (not worth the effort):

• Establish “certificate process” with customer
• Customer must install GridFTP client

Would like to have a simplified method, such as:

• Emailing a one-time-use “user certificate”
• A GridFTP client built into, or integrated with, the browser

Page 7

Exploratory #2

• A project on Calibration and Validation (Cal/Val) is currently under way.

• The current approach is to establish a Web-based mechanism to promote and ease the sharing of data between the Cal/Val collaborators.

• Phase 1 of the project has been completed and can be found at http://edcsgs16.cr.usgs.gov/wgiss/ (user code = calval99 password = wgiss03). The project is beginning to move into Phase 2.

• The manual mechanics of building the web site and coordinating the data and data storage sites raise the question of scalability as the project moves toward an operational state.

Page 8

Page 9

Page 10
Page 11

Cal/Val WTF Strategy

                       Phase 1   Phase 2   Operational
Data Coverage Sites    5         5         75+
Data Type              4         6         30+
User Sites             3         5         10+
Data Provider Sites    1         2         10+
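To make the scaling concern concrete, the growth from Phase 1 to the operational state can be worked out directly from the strategy figures. The sketch below treats the "+" entries as lower bounds, so each factor is a minimum; the dictionary layout and names are only an illustrative choice.

```python
# (Phase 1, Phase 2, Operational lower bound) for each row of the strategy table
STRATEGY = {
    "Data coverage sites": (5, 5, 75),
    "Data types": (4, 6, 30),
    "User sites": (3, 5, 10),
    "Data provider sites": (1, 2, 10),
}


def growth_factors(table):
    """Minimum growth from Phase 1 to Operational, rounded to one decimal."""
    return {
        name: round(operational / phase1, 1)
        for name, (phase1, _phase2, operational) in table.items()
    }


if __name__ == "__main__":
    for name, factor in growth_factors(STRATEGY).items():
        print(f"{name}: at least {factor}x Phase 1")
```

Data coverage sites grow by at least a factor of 15 and data provider sites by at least 10, which is the scale at which manually maintained web pages become hard to sustain.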

Page 12

GRID Opportunities

• Explore GRID services to identify opportunities that will improve the ability to scale Cal/Val.

• In parallel with the transition to Phase 2, explore and evaluate catalog service methods.

• The catalog manager services to be examined are the Metadata Catalog Service (MCS) and the Storage Resource Broker (SRB) Metadata Catalog (MCAT).

• Evaluate results and propose a scenario for GRID services to support future phases of Cal/Val.

Page 13

Opportunities with the GRID

• The catalog manager services to be examined are the Metadata Catalog Service (MCS) and the Storage Resource Broker (SRB) Metadata Catalog (MCAT).

• Chose MCS from the standpoint of “simplicity”

Page 14

Concept

Page 15

MCS Example

Stand-alone Java

Page 16

Stand-alone Java

Page 17

Web-based Java Server Pages (JSP)

Page 18

Web-based Java Server Pages (JSP)

Page 19

Observations for MCS 2

• Installation easier than the Globus Toolkit
• Required installation of other packages
  – MySQL
  – Java Virtual Machine
  – Tomcat (Java web server with Apache support)
  – Apache Axis (web services container)
• Need to write code to load metadata

Page 20

Observations for MCS 2 (cont.)

• No tools to coordinate metadata from multiple sites
• Easy-to-use Java API
  – Writing a simple query application was quick
  – Both stand-alone Java and web-based (JSP)
• Authentication not incorporated
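The two coding tasks noted in these observations, loading metadata and writing a simple query application, can be illustrated with a small stand-in catalog. This sketch does not use the MCS Java API, which is not shown in the slides. It uses Python's built-in sqlite3 in place of the MySQL back end, and the table layout, function names, and attribute names are all hypothetical.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalog holding (logical name, attribute, value) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE attributes (logical_name TEXT, attr TEXT, value TEXT)"
    )
    return conn


def load_metadata(conn, logical_name, attrs):
    """Load one data item's attributes into the catalog."""
    conn.executemany(
        "INSERT INTO attributes VALUES (?, ?, ?)",
        [(logical_name, key, str(val)) for key, val in attrs.items()],
    )
    conn.commit()


def query(conn, attr, value):
    """Return logical names of items whose attribute equals the given value."""
    rows = conn.execute(
        "SELECT logical_name FROM attributes "
        "WHERE attr = ? AND value = ? ORDER BY logical_name",
        (attr, str(value)),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    catalog = open_catalog()
    load_metadata(catalog, "scene_001", {"site": "Site A", "sensor": "Sensor X"})
    load_metadata(catalog, "scene_002", {"site": "Site B", "sensor": "Sensor X"})
    print(query(catalog, "sensor", "Sensor X"))
```

A catalog of this kind answers "which data items match these attributes"; coordinating entries across multiple sites and authenticating callers, both noted as gaps on this slide, are outside the sketch.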

Page 21

Next Steps

• Move to version 3 of the Globus Toolkit
  – Includes backward compatibility with version 2
  – Web services may reduce firewall issues
• Explore additional possible GRID opportunities
  – RLS – Replica Location Service
  – MDS – Monitoring and Discovery Service
  – Retry MCS version 3