CS5547 e-Science & Grid Computing - introduction


Upload: blanche-davis

Post on 12-Jan-2016


Page 1: CS5547  1 e-Science & Grid Computing - introduction - What is e-Science? What is the Grid? Grid middleware

http://www.csd.abdn.ac.uk/teaching/levelfive/CS5547 1

CS5547 e-Science & Grid Computing - introduction -

• What is e-Science? What is the Grid?
• Grid middleware
• Virtual Organisations - some issues
• Data access & integration
• Metadata
• MSc in e-Science Technology at-a-glance


CS5547 Some definitions

e-Science
“The large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet. Typically, a feature of such collaborative scientific enterprises is that they will require access to very large data collections, very large scale computing resources and high performance visualisation back to the individual user scientists.”

[nesc.ac.uk]

Grid
“An infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources.”

[Foster & Kesselman, globus.org]


CS5547 The Global Grid

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]


CS5547 UK SuperJANET 4/5

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]

(Links up to 2.5Gbit/s)


CS5547 Scale, distribution, complexity

Multiscale modelling of the heart

Cell

Person

Multiscale modelling of cancer

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]


CS5547 Large Hadron Collider (LHC)

http://gridportal.hep.ph.ic.ac.uk/rtm/

[http://www.nesc.ac.uk/events/ahm2004/presentations/BobJones.ppt]


CS5547 e-Science & engineering

Engine flight data

Airline office

Maintenance Centre

European data center

London Airport

New York Airport

American data center

Grid

Diagnostics Centre

“A Significant factor in the success of the Rolls-Royce campaign to power the Boeing 7E7 with the Trent 1000 was the emphasis on the new aftermarket support service for the engines provided via DS&S. Boeing personnel were shown DAME as an example of the new ways of gathering and processing the large amounts of data that could be retrieved from an advanced aircraft such as the 7E7, and they were very impressed”, DS&S 2004

XTO

Engine Model

Case Based Reasoning

Signal Data Explorer

Companies: Rolls-Royce, DS&S, Cybula

Universities: York, Leeds, Sheffield, Oxford

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]


CS5547 e-Science workflows

A: Identification of overlapping sequence
B: Characterisation of nucleotide sequence
C: Characterisation of protein sequence

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]


CS5547 Grid middleware: Globus toolkit (GT)

The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I. Foster, C. Kesselman, S. Tuecke. International J. Supercomputer Applications, 15(3), 2001.

The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration. I. Foster, C. Kesselman, J. Nick, S. Tuecke, Open Grid Service Infrastructure WG, Global Grid Forum, 2002.

[http://www.globus.org]


CS5547 Grid & Web Services convergence

The definition of WSRF means that the Grid and Web services communities can move forward on a common base.

[http://www.globus.org]


CS5547 Web & Grid Services

‘WS-I+’ profile

WS-I

Standards that have broad industry support and multiple interoperable implementations

Specifications that are emerging from standardisation process and are recognised as being ‘useful’

Specifications that have/will enter a standardisation process but are not stable and are still experimental

[http://www.globus.org]


CS5547 UK National Grid Service

Projects
• e-Minerals
• e-Materials
• Orbital Dynamics of Galaxies
• Bioinformatics (using BLAST)
• GEODISE project
• UKQCD Singlet meson project
• Census data analysis
• MIAKT project
• e-HTPX project
• RealityGrid (chemistry)

Users
• Leeds
• Oxford
• UCL
• Cardiff
• Southampton
• Imperial
• Liverpool
• Sheffield
• Cambridge
• Edinburgh
• QUB
• BBSRC
• CCLRC

Interfaces
• OGSI::Lite

[http://www.nesc.ac.uk/events/ahm2004/presentations/TonyHey.ppt]


CS5547 Grid Virtual Organisations - some issues

Forming a VO dynamically
• partner identification
• Service Level Agreements (SLAs)
• QoS, trust, reputation

Operating a VO
• monitoring QoS
• perturbation: coping with failures - and new opportunities!
• policing: what went wrong? who’s to blame?

www.conoise.org


CS5547 Grid Data Service

[Figure: Grid Data Service. Labels: Data Resource Implementation; Role Mapper; The Engine; Query Activity, Transform Activity, Delivery Activity; perform document and response document (made up of elements); credentials, role, connection; query, data.]
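One way to read the Grid Data Service diagram is as a small request pipeline: credentials are mapped to a role, and the engine then runs the requested activities in order and returns a response. The sketch below is only an illustration of that reading, not OGSA-DAI code. The names `ROLE_MAP`, `TABLE`, `map_role` and `engine`, the credential string and the sample rows are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical mapping from grid credentials to database roles (assumption).
ROLE_MAP: Dict[str, str] = {"CN=alice": "reader"}

# Hypothetical stand-in for the data resource (assumption).
TABLE: List[Dict[str, Any]] = [
    {"id": 10, "name": "a"},
    {"id": 11, "name": "b"},
]


def map_role(credentials: str) -> str:
    """Role Mapper: translate credentials into a role, or refuse."""
    if credentials not in ROLE_MAP:
        raise PermissionError(f"no role for {credentials!r}")
    return ROLE_MAP[credentials]


def query_activity(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Query step: keep only the rows with id 10."""
    return [row for row in rows if row["id"] == 10]


def transform_activity(rows: List[Dict[str, Any]]) -> List[str]:
    """Transform step: render each row as a comma-separated line."""
    return [f"{row['id']},{row['name']}" for row in rows]


def delivery_activity(lines: List[str]) -> str:
    """Delivery step: join the lines into one payload."""
    return "\n".join(lines)


def engine(credentials: str, activities: List[Callable], data=TABLE) -> Dict[str, Any]:
    """Engine: check the role, then feed each activity's output to the next."""
    role = map_role(credentials)
    result = data
    for activity in activities:
        result = activity(result)
    return {"role": role, "response": result}


response = engine("CN=alice", [query_activity, transform_activity, delivery_activity])
print(response)
```

Chaining plain functions keeps the sketch short. A real service would describe the activities in a perform document rather than pass callables.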

[http://www.ogsadai.org.uk/]


CS5547 GDS - pipeline example

Pipeline: SqlQueryStatement -> DeliverToURL

<sqlQueryStatement name="statement">
  <expression>
    select * from myTable where id=10
  </expression>
  <resultSetStream name="MyOutput"/>
</sqlQueryStatement>

<deliverToURL name="deliverOutput">
  <fromLocal from="MyOutput"/>
  <toURL>
    ftp://anon:[email protected]/home
  </toURL>
</deliverToURL>
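In the pipeline example above, the two activities are connected by name: the query writes to the stream "MyOutput" and the delivery activity reads from it. The following is a minimal sketch of how that wiring could be checked with Python's standard `xml.etree.ElementTree`. The `<perform>` wrapper element, the placeholder host `example.invalid` and the function name `pipeline_links` are assumptions for illustration and are not taken from OGSA-DAI.

```python
import xml.etree.ElementTree as ET

# The <perform> wrapper and the placeholder FTP host are assumptions;
# the two activities mirror the fragment shown on the slide.
PERFORM_FRAGMENT = """
<perform>
  <sqlQueryStatement name="statement">
    <expression>select * from myTable where id=10</expression>
    <resultSetStream name="MyOutput"/>
  </sqlQueryStatement>
  <deliverToURL name="deliverOutput">
    <fromLocal from="MyOutput"/>
    <toURL>ftp://example.invalid/home</toURL>
  </deliverToURL>
</perform>
"""


def pipeline_links(xml_text):
    """Return (producer activity, stream name, consumer activity) tuples."""
    root = ET.fromstring(xml_text)

    # First pass: record which activity produces each named output stream.
    producers = {}
    for activity in root:
        for child in activity:
            if child.tag == "resultSetStream":
                producers[child.get("name")] = activity.get("name")

    # Second pass: resolve each input reference against the known streams.
    links = []
    for activity in root:
        for child in activity:
            if child.tag == "fromLocal":
                source = child.get("from")
                if source not in producers:
                    raise ValueError(f"unknown stream: {source}")
                links.append((producers[source], source, activity.get("name")))
    return links


links = pipeline_links(PERFORM_FRAGMENT)
print(links)
```

A dangling reference, for example `from="Missing"`, raises `ValueError` instead of silently yielding an empty pipeline.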

[http://www.ogsadai.org.uk/]


CS5547 Grid data access & integration

Solutions in place to handle
• heterogeneous data storage
• pipelines / dataflows
• access control
• … within the Grid service architecture

Not specific to e-Science! e.g. see FirstDIG project

Major issues remain, including
• provenance - where did it come from, who did what to it?
• data quality - living with variable-quality data (www.qurator.org)

[http://www.ogsadai.org.uk/]


CS5547 Metadata in e-Science

Publications
• formal/reviewed
• “grey”
• associated artefacts

People
• expert directories
• communities of practice

Projects
• formal/funded
• working groups

Experiment datasets
• formally curated
• raw/pre-processed
• in vivo / in vitro / in silico

Scientific method
• experiment workflow
• knowledge roles: hypotheses, observations, predictions, deductions, …
• discourse & natural arguments: proof, refutation, agreement, …


CS5547 Managing scientific metadata

e-Science metadata management platform

[Figure: graph of scientific metadata. Node types: Hypothesis, Publication, Experiment. Link labels: Agrees With Hypothesis, Disagrees With Hypothesis, Described In, Evidence.]
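The node and link labels in the figure above suggest a graph of statements about hypotheses, publications and experiments. As a rough sketch only, such metadata could be held as subject-predicate-object triples and queried per hypothesis. The identifiers (`publication:P1`, `hypothesis:H1`, and so on), the predicate spellings and the direction of each link are assumptions for illustration; the slide does not specify a schema.

```python
# Hypothetical triples: (subject, predicate, object). All identifiers and
# the direction of each relation are assumptions, not taken from the slide.
TRIPLES = [
    ("publication:P1", "agreesWith", "hypothesis:H1"),
    ("publication:P2", "disagreesWith", "hypothesis:H1"),
    ("experiment:E1", "describedIn", "publication:P1"),
    ("experiment:E1", "evidenceFor", "hypothesis:H1"),
]


def subjects(predicate, obj, triples=TRIPLES):
    """Return the sorted subjects linked to obj by the given predicate."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


def summary(hypothesis):
    """Collect what agrees with, disagrees with, or supports a hypothesis."""
    return {
        "agree": subjects("agreesWith", hypothesis),
        "disagree": subjects("disagreesWith", hypothesis),
        "evidence": subjects("evidenceFor", hypothesis),
    }


print(summary("hypothesis:H1"))
```

A flat list of triples is enough to show the idea. An ontology-backed store would add typed nodes and validation.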


CS5547 Fearlus-G pilot project

desktop client

metadata schema (ontology)

metadata client

Globus client


CS5547 MSc e-Science Technologies: next…

CS5547 e-Science & Grid Computing
• Grid middleware, e-Science workflow, metadata

CS5553 Intelligent Architectures
• technologies for Virtual Organisations

CS5545 Data Interpretation & Communication
• technologies at the data/user-scientist interface

CS5544 E-Technology Workshop
• group project, with an e-Science application

CS5945 MSc Project in E-Technology
• potential to do a project with user-scientists