RESTful Web Services for Scientific Computing Joshua Boverhof, LBNL Shreyas Cholia, NERSC/LBNL OSCON 2011 July 28 2011, Portland OR


Page 1: RESTful Web Services for Scientific Computing

RESTful Web Services for Scientific Computing
Joshua Boverhof, LBNL
Shreyas Cholia, NERSC/LBNL
OSCON 2011
July 28 2011, Portland OR

Page 2: RESTful Web Services for Scientific Computing

NERSC

•National Energy Research Scientific Computing Center

•DOE Office of Science HPC User Facility at Lawrence Berkeley Lab

•Provides high performance compute, data, network and information services to scientists across the world

Page 3: RESTful Web Services for Scientific Computing

NERSC HPC Clusters

Page 4: RESTful Web Services for Scientific Computing
Page 5: RESTful Web Services for Scientific Computing

Web Gateways

•Old way - SSH + command line + batch system

•People now expect web interfaces for everything

•Usability - scientific computing should be as easy as online banking

•Users don’t want generic options or tools that are not applicable to their science

•Users don’t want to deal with the backend, middleware, UNIX CLI, etc.

Page 6: RESTful Web Services for Scientific Computing

NERSC Scientific Gateways

•DeepSky - Astronomical Image Database, 11 million images (70TB)

•The Gauge Connection - QCD Lattice Gauge Data

•CXIDB - X-Ray Image Data Bank

•20th Century Reanalysis - Reanalysis of 20th Century Climate Data

•Dayabay - Dayabay Neutrino Detector Gateway

•ESG - Earth System Grid Climate Gateway and Data-node

Page 7: RESTful Web Services for Scientific Computing

Motives for developing NERSC Web Toolkit (NEWT)

•Make it very easy for science teams to build web gateways to their data and computation

•We have already built several science-specific gateways and want to encapsulate the common patterns

•Provide Web APIs for access to backend resources for portal and web front-end developers.

Page 8: RESTful Web Services for Scientific Computing

NEWT Web Stack

•Web Service

   •Built with Django Web Framework

   •Exposes NERSC Resources as HTTP URLs

   •Generally uses REST conventions

   •Access HPC Resources over the web using HTTP + JSON

•Frontend Development

   •JavaScript library “newt.js”

   •AJAX

Page 9: RESTful Web Services for Scientific Computing

Things you can do ...

•Authenticate using NERSC credentials

•Check machine status

•Upload and download files

•Submit a compute job

•Monitor a job

•Get user account information

•Store app data

•Issue UNIX commands

Page 10: RESTful Web Services for Scientific Computing

Architecture

[Architecture diagram]

•Client: Web Application - HTML 5/AJAX (sends HTTP requests, receives JSON data)

•NEWT (Django)

•Internal DB: session, cred, user information

•Authentication: MyProxy CA

•System Resources (via Globus): Files, Batch Jobs, Shell Commands, Status

•Persistent Store (NoSQL DB): CouchDB

•Accounting Information: NIM

Page 11: RESTful Web Services for Scientific Computing

RESTful Conventions

•Resources represented as a set of URLs

• HTTP verbs

• GET: Idempotent operation, retrieve resource representation

• PUT: Idempotent operation, set resource representation

• DELETE: Idempotent operation, delete resource

• POST: Not idempotent. Avoid overloading it as an RPC mechanism; typically used on a factory resource to create new resources.
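The idempotency distinction above can be illustrated with a minimal in-memory sketch in Python. The `ResourceStore` class and its method names are hypothetical and are not part of NEWT; they only mirror the HTTP verbs so the difference between PUT/DELETE and POST is visible.

```python
import itertools


class ResourceStore:
    """Toy in-memory collection whose methods mirror the HTTP verbs (hypothetical)."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count(1)

    def get(self, key):
        # GET: read-only, so repeating it never changes state.
        return self._items.get(key)

    def put(self, key, representation):
        # PUT: sets the representation at a known URL; repeating it changes nothing.
        self._items[key] = representation

    def delete(self, key):
        # DELETE: deleting an already-deleted resource leaves the same state.
        self._items.pop(key, None)

    def post(self, representation):
        # POST to a factory resource: every call creates a NEW resource.
        key = str(next(self._ids))
        self._items[key] = representation
        return key

    def snapshot(self):
        return dict(self._items)


store = ResourceStore()
store.put("a", {"v": 1})
once = store.snapshot()
store.put("a", {"v": 1})
put_is_idempotent = store.snapshot() == once

first = store.post({"job": "x"})
second = store.post({"job": "x"})
post_creates_new = first != second
```

Repeating the same PUT leaves the store unchanged, while two identical POSTs yield two distinct resources, which is why POST suits job-submission factories.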

Page 12: RESTful Web Services for Scientific Computing

NEWT Resources

Resource                   Description
/login/                    Login information
/file/[machine]/[path]/    List, Upload, Download file
/status/[machine]          Machine status, uptime, queue stats
/job/jobs/[id]             The user’s jobs across all resources
/job/[machine]/fork/       Fork factory resource
/job/[machine]/batch/      Batch factory resource
/queue/[machine]/          Batch queue factory resource
/account/[NIM resource]    Account information, cpu hours, accounts

Page 13: RESTful Web Services for Scientific Computing

Login Resource: Authenticate

$.newt_ajax({
    url: "/auth/",
    type: "POST",
    data: {'username': username, 'password': password},
    success: function(res, textStatus, jqXHR) {}
});
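For non-browser clients, the same authentication call can be sketched in Python with the standard library. This is a minimal sketch, not part of NEWT: `build_auth_request` is a hypothetical helper, the base URL is the one used in the curl examples in this deck, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the curl examples in this deck.
NEWT_BASE = "https://portal-auth.nersc.gov/newt"


def build_auth_request(username, password, base=NEWT_BASE):
    """Build (but do not send) the form-encoded POST to the auth resource."""
    body = urlencode({"username": username, "password": password}).encode("ascii")
    return Request(
        base + "/auth/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_auth_request("joe", "not-a-real-password")
```

To actually send it and keep the session cookie, one option is an opener built with `urllib.request.build_opener(urllib.request.HTTPCookieProcessor(...))`.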

Page 14: RESTful Web Services for Scientific Computing

Login Resource

$.newt_ajax({
    url: "/login/",
    type: "GET",
    success: function(data){}
});

•200 OK

•{"username": "joe", "session_lifetime": 14384, "auth": true}

Page 15: RESTful Web Services for Scientific Computing

Queue Resource: PBS job submission

$.newt_ajax({
    url: "/queue/hopper/",
    type: "POST",
    data: {"jobfile": filename},
    success: function(data){
        $("#output").append(data.jobid);
    }
});

This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like

{"status": "OK", "error": "", "jobid" : "hop1234.id" }

Page 16: RESTful Web Services for Scientific Computing

Queue Resource: PBS job submission

$.newt_ajax({
    url: "/queue/franklin/",
    type: "POST",
    data: {"jobscript": "#PBS -l mppwidth=8\n mpirun -n 8 /bin/hostname"},
    success: function(data){
        $("#output").append(data.jobid);
    }
});

This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like

{"status": "OK", "error": "", "jobid" : "7259874 " }
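A client will usually want to check the submission reply before using the job id. The sketch below parses the JSON shape shown on these slides. `parse_submission` and `JobSubmissionError` are hypothetical names, and treating a non-"OK" status or a non-empty "error" field as failure is an assumption, since the slides only show the success case.

```python
import json


class JobSubmissionError(Exception):
    """Raised when the reply does not look like a successful submission (hypothetical)."""


def parse_submission(response_text):
    """Return the job id from a queue-resource reply, or raise on failure."""
    reply = json.loads(response_text)
    # Assumption: failures carry a status other than "OK" or a non-empty "error".
    if reply.get("status") != "OK" or reply.get("error"):
        raise JobSubmissionError(reply.get("error") or "unknown error")
    # Strip stray whitespace around the id, as seen in one of the sample replies.
    return reply["jobid"].strip()


jobid = parse_submission('{"status": "OK", "error": "", "jobid": "hop1234.id"}')
```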

Page 17: RESTful Web Services for Scientific Computing

Command Resource: Fork job submission

$.newt_ajax({
    url: "/command/franklin",
    type: "POST",
    data: {"executable": "/bin/date"},
    success: function(data){
        // The reply carries "output" (not "jobid"), so display that
        $("#output").append(data.output);
    }
});

This is a jQuery JavaScript function that calls the NEWT API. NEWT returns a JSON object that looks like

{"output": "Wed Jul 20 22:51:58 PDT 2011", "error": ""}

Page 18: RESTful Web Services for Scientific Computing

Simple Usage: curl

$ curl -k -c cookies.txt -X POST -d "username=boverhof&password=$PASS" https://portal-auth.nersc.gov/newt/auth

{"username": "boverhof", "session_lifetime": 14397, "auth": true}

$ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/status/franklin

{"status": "up", "system": "franklin"}

$ curl -k -b cookies.txt -d "executable=/bin/date" https://portal-auth.nersc.gov/newt/job/franklin/fork/

{"status": null, "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": null, "output": null, "id": 47789}

$ curl -k -b cookies.txt -X GET https://portal-auth.nersc.gov/newt/job/jobs/47789

{"status": "DONE", "executable": "/bin/date", "user_id": 18, "url": "https://franklingrid.nersc.gov:60886/81661/1311833735/", "jobmanager": "fork", "submitted": "2011-07-20T06:15:35", "machine": "franklin", "finished": "2011-07-20T06:15:36", "output": "Wed Jul 20 23:15:36 PDT 2011\n", "id": 47789}
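The submit-then-check pattern in the last two calls is typically wrapped in a polling loop. The sketch below shows one way to do that in Python. `wait_for_job` is a hypothetical helper, the HTTP call is injected as a callable so the example runs offline, and "DONE" is assumed to be the only terminal status because it is the only one shown in these slides.

```python
import json
import time


def wait_for_job(fetch_status, job_id, interval=1.0, max_polls=60, sleep=time.sleep):
    """Poll a job resource until its "status" is "DONE"; return the final document.

    fetch_status is any callable that takes a job id and returns the response
    body as text (for example, a wrapper around GET /job/jobs/[id]).
    """
    for _ in range(max_polls):
        job = json.loads(fetch_status(job_id))
        if job.get("status") == "DONE":  # assumption: the only terminal status
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} not DONE after {max_polls} polls")


# Offline demonstration with canned replies shaped like the ones above.
replies = iter([
    '{"status": null, "id": 47789, "output": null}',
    '{"status": "DONE", "id": 47789, "output": "Wed Jul 20 23:15:36 PDT 2011\\n"}',
])
result = wait_for_job(lambda job_id: next(replies), 47789, sleep=lambda s: None)
```

A production client would also need to handle failure states and HTTP errors, which the slides do not describe.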

Page 19: RESTful Web Services for Scientific Computing

Django settings: Pluggable Authentication

•Authenticate using NERSC credentials to a myproxy-server

•Add AuthenticationMiddleware

django.contrib.auth.middleware.AuthenticationMiddleware

•Configure authentication backend ( settings.py )

AUTHENTICATION_BACKENDS = ( 'newt.authnz.myproxy_backend.MyProxyBackend', )

•Implement authentication backend

class MyProxyBackend:
    def authenticate(self, username=None, password=None):
        # MyProxy logon goes here; return a User on success, None on failure
        pass
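The backend contract can be sketched without Django installed: `authenticate` returns a user object or `None`, and `get_user` looks a user up by id. Everything below is a hypothetical stand-in, not NEWT's code: `myproxy_logon` is a placeholder for the real MyProxy call, and `SketchUser` stands in for Django's User model. Recent Django versions also pass `request` as the first argument to `authenticate`, which the sketch accepts.

```python
import itertools


def myproxy_logon(username, password):
    """Placeholder for the real MyProxy logon (hypothetical; not implemented here)."""
    raise NotImplementedError("wire this to a MyProxy client")


class SketchUser:
    """Stand-in for a Django User object."""

    def __init__(self, pk, username):
        self.pk = pk
        self.username = username


class MyProxyBackendSketch:
    def __init__(self, logon=None):
        # Default to the placeholder so the class can be created without arguments.
        self._logon = logon or myproxy_logon
        self._users_by_name = {}
        self._ids = itertools.count(1)

    def authenticate(self, request=None, username=None, password=None):
        if not username or not password:
            return None
        if not self._logon(username, password):
            return None  # credentials rejected
        # First successful logon creates the local user record.
        if username not in self._users_by_name:
            self._users_by_name[username] = SketchUser(next(self._ids), username)
        return self._users_by_name[username]

    def get_user(self, user_id):
        for user in self._users_by_name.values():
            if user.pk == user_id:
                return user
        return None


# Demonstration with a fake logon function instead of a MyProxy server.
backend = MyProxyBackendSketch(logon=lambda u, p: p == "correct-horse")
user = backend.authenticate(username="joe", password="correct-horse")
rejected = backend.authenticate(username="joe", password="wrong")
```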

Page 20: RESTful Web Services for Scientific Computing

Django settings: File Upload

•File Upload: Upload to portal, store in temporary file, then transfer to remote file system.

•Configure file upload handler ( settings.py )

FILE_UPLOAD_HANDLERS = ( 'newt.file.uploadhandler.RemoteCopyTemporaryFileUploadHandler', )

•Implement file upload handler

from django.core.files.uploadhandler import TemporaryFileUploadHandler as _TemporaryFileUploadHandler

class RemoteCopyTemporaryFileUploadHandler(_TemporaryFileUploadHandler):
    def upload_complete(self):
        # Transfer the temporary file to the remote filesystem here
        pass
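The "temporary file, then transfer" flow can be sketched with the standard library alone. This is an illustration of the pattern, not NEWT's handler: `TemporaryUploadSketch` is hypothetical, its method names only echo Django's upload handler hooks, and the remote transfer is replaced by an injected callable (here a local copy).

```python
import os
import shutil
import tempfile


class TemporaryUploadSketch:
    """Buffers an upload in a temporary file, then hands it to a transfer callable."""

    def __init__(self, transfer):
        self._transfer = transfer  # hypothetical callable(local_path)
        self._tmp = tempfile.NamedTemporaryFile(delete=False)

    def receive_data_chunk(self, raw_data):
        # Called once per chunk as the upload streams in.
        self._tmp.write(raw_data)

    def upload_complete(self):
        # Flush and close the temporary file, transfer it, then clean up.
        self._tmp.close()
        try:
            self._transfer(self._tmp.name)
        finally:
            os.unlink(self._tmp.name)


# Demonstration: the "remote filesystem" is just another local directory.
dest_dir = tempfile.mkdtemp()
dest_path = os.path.join(dest_dir, "input.dat")
handler = TemporaryUploadSketch(lambda local: shutil.copy(local, dest_path))
handler.receive_data_chunk(b"hello ")
handler.receive_data_chunk(b"world")
handler.upload_complete()
with open(dest_path, "rb") as fh:
    uploaded = fh.read()
```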

Page 21: RESTful Web Services for Scientific Computing

Implementation Details ( Hacks )

•Django v1.[1,2,3?]: support for HTTP verbs is lacking

•PUT: request data is not loaded by Django; we used the “coerce_put_post” code from django-piston

•Looking at using Tastypie, a web service API framework for Django that provides a convenient yet customizable abstraction for creating REST-style interfaces
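The PUT limitation noted above comes down to the form-encoded body never being parsed into a dictionary. The sketch below shows the effect a workaround needs to achieve, using only the standard library. It is not the django-piston code; `parse_put_body` is a hypothetical helper, and the field names in the example are made up for illustration.

```python
from urllib.parse import parse_qs

FORM_TYPE = "application/x-www-form-urlencoded"


def parse_put_body(body, content_type=FORM_TYPE, encoding="utf-8"):
    """Parse a form-encoded PUT body into a dict, as is normally done for POST."""
    # Ignore parameters such as "; charset=utf-8" when checking the media type.
    if content_type.split(";")[0].strip().lower() != FORM_TYPE:
        raise ValueError("unsupported content type: " + content_type)
    fields = parse_qs(body.decode(encoding), keep_blank_values=True)
    # Collapse single-valued fields to plain strings; keep repeats as lists.
    return {k: (v[0] if len(v) == 1 else v) for k, v in fields.items()}


put_data = parse_put_body(b"jobfile=%2Ftmp%2Frun.pbs&queue=debug")
```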

Page 22: RESTful Web Services for Scientific Computing

NOVA: VASP portal

Page 23: RESTful Web Services for Scientific Computing

NOVA: VASP portal

Page 24: RESTful Web Services for Scientific Computing

NOVA: VASP portal

Page 25: RESTful Web Services for Scientific Computing

NOVA: VASP portal

https://newt.nersc.gov

https://portal-auth.nersc.gov/nova/