GridSolve: A Network Enabled Solver

Presented by GridSolve: A Network Enabled Solver Asim YarKhan and Jack Dongarra University of Tennessee


Post on 14-Jan-2016


TRANSCRIPT

Page 1: GridSolve: A Network Enabled Solver

Presented by

GridSolve: A Network Enabled Solver

Asim YarKhan and Jack Dongarra

University of Tennessee

Page 2: GridSolve: A Network Enabled Solver

2 2 YarKhan_GridSolve_0611

GridSolve

Grid based software-hardware-data server

Based on a Remote Procedure Call model but with …

resource discovery, dynamic problem solving capabilities, load balancing, fault tolerance, asynchronous calls, security, …

Ease of use is paramount

It’s about providing transparent access to resources.

Make it easy to wrap legacy codes into services

Evolution of the successful NetSolve project

Page 3: GridSolve: A Network Enabled Solver


GridSolve Architecture

[Figure: GridSolve architecture diagram. Labels: Client, Agent, request, server list, data, result; servers shown as Cluster, Cluster, Single processor, Batch queue.]

[x,y,z,info] = gridsolve('dgesv', A, B)

GridSolve clients: Matlab, C, Fortran

[NetSolve clients: Java, Mathematica, Excel, IDL, Octave]

Resource discovery, scheduling, load balancing, fault tolerance

Page 4: GridSolve: A Network Enabled Solver


GridSolve Client

Dynamic service bindings
  Client does not need to have stubs for the services that it wishes to use

Opaque networking interactions

API provides a variety of methods
  Blocking, non-blocking, task farms, …

Intuitive and easy to use

Matlab: Solve using dgesv
  [x,y,z,info]=gs_call('dgesv',m,1,a,m,b,m)

C: Call dgesv using GridRPC
  grpc_initialize()
  grpc_function_handle_default(&handle, "dgesv")
  status = grpc_call(&handle, n, nrhs, a, lda, ipiv, b, ldb, &info);


Page 5: GridSolve: A Network Enabled Solver


GridRPC – Grid Remote Procedure Call

GGF proposed standard
  Global Grid Forum Research Group on Programming Models
  Implementations: Ninf-G (AIST), GridSolve/NetSolve (UTK), DIET (INRIA, ENS)

GridRPC API
  grpc_initialize, grpc_finalize
  Function handle create, initialize, destroy, get
  grpc_call blocking, grpc_call_async non-blocking
  grpc_probe, cancel, wait, wait_and/or/any

GridSolve uses GridRPC as primary API
  Older NetSolve API available as wrapper
  Added calls based on GridRPC API to support fault tolerance, dynamic scheduling, …
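
The blocking and non-blocking call styles listed for the GridRPC API can be illustrated without a GridSolve installation. The following is a minimal Python sketch that imitates the pattern with a local thread pool. It is an analogy, not the GridRPC API: `fake_remote_dgesv`, `call_blocking` and `call_async` are hypothetical names, a future plays the role of a session ID, `done()` the role of a probe, and `result()` the role of a wait.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_remote_dgesv(a, b):
    """Hypothetical stand-in for a remote dgesv service; runs locally.

    Solves a*x = b for a single right-hand side using Gaussian
    elimination with partial pivoting.
    """
    n = len(a)
    # Build an augmented copy so the caller's data is left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def call_blocking(a, b):
    """Blocking style: returns only when the result is available."""
    return fake_remote_dgesv(a, b)


def call_async(pool, a, b):
    """Non-blocking style: returns a future immediately."""
    return pool.submit(fake_remote_dgesv, a, b)


a = [[2.0, 1.0], [1.0, 3.0]]
x_blocking = call_blocking(a, [3.0, 5.0])

# Task farm: submit several independent solves, then wait for all of them.
right_hand_sides = [[3.0, 5.0], [1.0, 0.0]]
with ThreadPoolExecutor(max_workers=2) as pool:
    pending = [call_async(pool, a, rhs) for rhs in right_hand_sides]
    probes = [f.done() for f in pending]      # non-blocking status check
    farm_results = [f.result() for f in pending]  # blocks until finished
```

In this sketch the caller is free to do other work between submitting the requests and collecting the results, which is the point of the non-blocking style.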

Page 6: GridSolve: A Network Enabled Solver


GridSolve Agent

Agent acts as name server and information service
  Client users and administrators can query the hardware and software services available

Interactions mediated by agent
  Scheduling, tracking, server fault tolerance, etc.

Resource scheduler
  Maintains both static and dynamic information regarding server components
  Can use execution history to build performance models for services
  Can simulate multi-service executions to predict best server
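
How an agent might combine static and dynamic server information into a ranked server list can be sketched as follows. This is a toy model under stated assumptions, not GridSolve's scheduler: the `ServerInfo` fields, the load-scaling rule and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    name: str
    peak_mflops: float    # static information (hypothetical field)
    load: float           # dynamic information: fraction of capacity in use
    services: frozenset   # names of services installed on this server


def predicted_seconds(server, flops):
    """Assumed model: usable speed shrinks linearly with current load."""
    effective = server.peak_mflops * 1e6 * (1.0 - server.load)
    if effective <= 0.0:
        return float("inf")
    return flops / effective


def rank_servers(servers, service, flops):
    """Return servers offering `service`, fastest predicted time first."""
    candidates = [s for s in servers if service in s.services]
    return sorted(candidates, key=lambda s: predicted_seconds(s, flops))


servers = [
    ServerInfo("cluster_a", 4000.0, 0.5, frozenset({"dgesv"})),
    ServerInfo("node_b", 1500.0, 0.0, frozenset({"dgesv"})),
    ServerInfo("node_c", 9000.0, 0.0, frozenset({"fft"})),
]
server_list = rank_servers(servers, "dgesv", flops=3.0e9)
ranked_names = [s.name for s in server_list]
```

Returning an ordered list rather than a single server leaves the client a fallback if the first choice fails, which fits the fault-tolerance role the slides assign to the agent.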

Page 7: GridSolve: A Network Enabled Solver


Adding Services to GridSolve Server

[Figure: a GSIDL Parser/Compiler and a Server hosting several Services; a New Service is added. Caption: "New Service Added!"]

Fortran ROUTINE dgesv(IN int N, IN int NRHS, INOUT double A[LDA][N], IN int LDA,
    OUT int IPIV[N], INOUT double B[LDB][NRHS], IN int LDB, OUT int INFO)
"Solves a general system of linear equations AX = B"
LIBS = "/usr/local/lib/liblapack.a /usr/local/lib/libf77blas.a /usr/local/lib/libatlas.a"
LANGUAGE = "FORTRAN"
LIBS = "$(LAPACK_LIBS) $(BLAS_LIBS)"
COMPLEXITY = "2.0*pow(N,3.0)*(double)NRHS"
MAJOR="COLUMN"
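
The COMPLEXITY entry in the service description gives an operation count as a function of the call's arguments. A minimal Python sketch of how such a formula could be turned into a run-time estimate is shown here. The formula is copied from the slide; the function names and the server speed used in the example are assumptions for illustration.

```python
def dgesv_complexity(n, nrhs):
    """Mirror of COMPLEXITY = "2.0*pow(N,3.0)*(double)NRHS" from the slide."""
    return 2.0 * pow(n, 3.0) * float(nrhs)


def estimated_seconds(n, nrhs, flops_per_second):
    """Estimated run time at an assumed sustained floating-point rate."""
    return dgesv_complexity(n, nrhs) / flops_per_second


ops_small = dgesv_complexity(1000, 1)
ops_large = dgesv_complexity(2000, 1)
# Hypothetical server sustaining 1e9 floating-point operations per second.
seconds_small = estimated_seconds(1000, 1, 1.0e9)
```

Because the count grows with the cube of N, doubling the matrix order multiplies the estimate by eight, so the choice of server matters much more for large problems.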

Page 8: GridSolve: A Network Enabled Solver


GridSolve Backends

Scripts encapsulate service management for PBS, MS Compute Cluster (job submit, probe, cancel)

[Figure: GridSolve system diagram. Labels: GridSolve clients, Agent, Servers; server backends labeled MS Compute Cluster, PBS [Condor, ScaLAPACK, LFC, etc.]]

User may be unaware of parallel processing

Page 9: GridSolve: A Network Enabled Solver


Distributed Storage Infrastructure

Client optionally pushes argument data to DSI

[Figure: a client, DSI data caches, servers, and a server cluster.]

DSI API currently instantiated using IBP (Internet Backplane Protocol)
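
The idea of pushing argument data to a cache once and then passing only a reference can be sketched with a toy in-memory store. This is a hedged illustration in Python, not the DSI or IBP API: `DataCache`, its methods, the handle format and the two example services are hypothetical.

```python
import itertools


class DataCache:
    """Toy in-memory stand-in for a DSI data cache (not the IBP API)."""

    def __init__(self):
        self._store = {}
        self._ids = itertools.count(1)
        self.pushes = 0   # how many times a client uploaded data

    def push(self, data):
        """Client side: store data once and get back a small handle."""
        handle = f"dsi-{next(self._ids)}"
        self._store[handle] = data
        self.pushes += 1
        return handle

    def fetch(self, handle):
        """Server side: resolve a handle to the cached data."""
        return self._store[handle]


def serve_row_sums(cache, handle):
    """Hypothetical service that reads its argument from the cache."""
    return [sum(row) for row in cache.fetch(handle)]


def serve_trace(cache, handle):
    """Second hypothetical service reusing the same cached argument."""
    m = cache.fetch(handle)
    return sum(m[i][i] for i in range(len(m)))


cache = DataCache()
matrix_handle = cache.push([[1.0, 2.0], [3.0, 4.0]])
row_sums = serve_row_sums(cache, matrix_handle)
trace = serve_trace(cache, matrix_handle)
```

Both services work from the single upload, so the client sends the matrix once no matter how many calls use it.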

Page 10: GridSolve: A Network Enabled Solver


GridSolve: Benefits

Domain scientists use scientific computing environments (SCEs)
  GridSolve lets SCEs easily access and use grid resources [Ease of use!]

Libraries
  GridSolve can provide easy access to high performance libraries, so that end users do not have to install them

Scheduling
  GridSolve can choose the software/hardware resource appropriate for the problem

Resource Aggregation
  GridSolve agent provides a single access point for multiple resources/clusters

Page 11: GridSolve: A Network Enabled Solver


GridSolve In-Progress

Scheduling work
  Work with Emmanuel Jeannot, INRIA
  History-based performance estimation
    Adds more accurate server/service performance model based on prior history
  Communication cost estimates
    Client estimates communication costs for a subset of servers via a simple probe
  Perturbation model for scheduling
    The agent uses a model of the currently executing jobs on the servers to schedule jobs (includes estimated completion times)

Client interfaces
  IDL – Interactive Data Language
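
A history-based performance estimate combined with a communication cost term could look roughly like the Python sketch below. It is an illustration under simplifying assumptions, not the model developed in the GridSolve work: the class and function names are hypothetical, the compute rate is a plain ratio of past operations to past run time, the transfer cost is latency plus size over bandwidth, and the perturbation model is not covered.

```python
from collections import defaultdict


class HistoryModel:
    """Keeps past (operations, seconds) pairs per (server, service)."""

    def __init__(self):
        self._runs = defaultdict(list)

    def record(self, server, service, flops, seconds):
        self._runs[(server, service)].append((flops, seconds))

    def rate(self, server, service):
        """Observed operations per second, or None without history."""
        runs = self._runs.get((server, service))
        if not runs:
            return None
        total_flops = sum(f for f, _ in runs)
        total_seconds = sum(s for _, s in runs)
        return total_flops / total_seconds

    def compute_seconds(self, server, service, flops):
        r = self.rate(server, service)
        return None if r is None else flops / r


def transfer_seconds(nbytes, latency_s, bandwidth_bytes_per_s):
    """Assumed cost model; inputs would come from a probe of the link."""
    return latency_s + nbytes / bandwidth_bytes_per_s


model = HistoryModel()
model.record("s1", "dgesv", 2.0e9, 1.0)
model.record("s1", "dgesv", 4.0e9, 3.0)

compute = model.compute_seconds("s1", "dgesv", 3.0e9)
transfer = transfer_seconds(8.0e6, latency_s=0.01, bandwidth_bytes_per_s=1.0e8)
total = compute + transfer
```

Keeping the compute and transfer terms separate makes it possible for a fast but distant server to lose to a slower nearby one when the arguments are large.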

Page 12: GridSolve: A Network Enabled Solver


GridSolve Status

Version 0.15 (Sept 2006)

http://icl.cs.utk.edu/netsolve

Supported platforms

Linux, Solaris, BSDs, MacOS X

Should work in most POSIX environments

Windows native client (MSVC)

Windows Compute Cluster backend

PBS backend

Page 13: GridSolve: A Network Enabled Solver


Contacts

Asim YarKhan and Jack Dongarra

University of Tennessee