tony hey director of uk e-science programme [email protected] e-science, databases and the grid

17
Tony Hey Director of UK e-Science Programme [email protected] e-Science, Databases and the Grid

Upload: madelyn-harne

Post on 15-Dec-2015

216 views

Category:

Documents


0 download

TRANSCRIPT

Tony HeyDirector of UK

e-Science [email protected]

e-Science, Databases and the Grid

e-Science and the Grid ‘e-Science is about global collaboration in

key areas of science, and the next generation of infrastructure that will enable it.’

‘e-Science will change the dynamic of the way science is undertaken.’

John Taylor Director General of Research Councils Office of Science and Technology

NASA’s IPG• The vision for the Information Power Grid

is to promote a revolution in how NASA addresses large-scale science and engineering problems by providing persistent infrastructure for– “highly capable” computing and data

management services that, on-demand, will locate and co-schedule the multi-Center resources needed to address large-scale and/or widely distributed problems

– the ancillary services that are needed to support the workflow management frameworks that coordinate the processes of distributed science and engineering problems

IPG Baseline System

ARC

SDSC

LaRC

GSFC

MSFC

KSCJSC

NCSA

Boeing

JPL

NGIX

EDC

NRENCMU

GRC

300 node Condor pool

NTON-II/SuperNet

MCAT/SRB

O2000

DMF MDSCA

O2000

O2000

cluster

clusterO2000

MDS

MDS

•Lift Capabilities•Drag Capabilities•Responsiveness

•Deflection capabilities•Responsiveness

•Thrust performance•Reverse Thrust performance•Responsiveness•Fuel Consumption

•Braking performance•Steering capabilities•Traction•Dampening capabilities

Crew Capabilities- accuracy- perception- stamina- re-action times- SOP’s Engine Models

Airframe Models

Wing Models

Landing Gear Models

Stabilizer Models

Human Models

Multi-disciplinary Simulations

Whole system simulations are produced by couplingall of the sub-system simulations

The Grid as an Enabler for Virtual Organisations Ian Foster and Carl Kesselman – ‘Take 2’• The Grid is a software infrastructure that

enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources- includes computational systems and data storage resources and specialized facilities

• Enabling infrastructure for transient ‘Virtual Organisations’

Globus Grid Middleware• Single Sign-On

– Proxy credentials, GRAM

• Mapping to local security mechanisms– Kerberos, Unix, GSI

• Delegation– Restricted proxies

• Community authorization and policy– Group membership, trust

• File-based– GridFTP gives high performance FTP

integrated with GSI

US Grid Projects• NASA Information Power Grid• DOE Science Grid• NSF National Virtual Observatory• NSF GriPhyN• DOE Particle Physics Data Grid• NSF Distributed Terascale Facility• DOE ASCI Grid• DOE Earth Systems Grid• DARPA CoABS Grid• NEESGrid• DOH BIRN• NSF iVDGL

EU GridProjects• DataGrid (CERN, ..)• EuroGrid (Unicore)• DataTag (TTT…)• Astrophysical Virtual Observatory• GRIP (Globus/Unicore)• GRIA (Industrial applications)• GridLab (Cactus Toolkit)• CrossGrid (Infrastructure Components)• EGSO (Solar Physics)

National Grid Projects• UK e-Science Grid• Japan – Grid Data Farm, ITBL• Netherlands – VLAM, PolderGrid• Germany – UNICORE, Grid proposal• France – Grid funding approved• Italy – INFN Grid• Eire – Grid proposals• Switzerland - Grid proposal• Hungary – DemoGrid, Grid proposal• ApGrid• ……

UK e-Science Initiative

• £120M Programme over 3 years• £75M is for Grid Applications in all

areas of science and engineering• £10M for Supercomputer upgrade• £35M for development of ‘industrial

strength’ Grid middleware Require £20M additional ‘matching’

funds from industry

Cambridge

Newcastle

Edinburgh

Oxford

Glasgow

Manchester

Cardiff

Southampton

London

Belfast

DL

RAL Hinxton

UK e-Science Grid

IBM Grid Press Release: 2/8/01

Interview with Irving Wladawsky-Berger:

• ‘Grid computing is a set of research management services that sit on top of the OS to link different systems together’

• ‘We will work with the Globus community to build this layer of software to help share resources’

• ‘All of our systems will be enabled to work with the grid, and all of our middleware will integrate with the software’

Grid Database Requirements (1)

• Scalability– Store Petabytes of data at TB/hr– Low response time for complex queries to

retrieve data for more processing– Large number of clients needing high access

throughput

• Grid Standards for Security, Accounting, ..– GSI with digital certificates– Data from multiple DBMS– Co-schedule database and compute servers

Grid Database Requirements (2)

• Handling Unpredictable Usage– Most existing DB applications have reasonably

predictable access patterns and usage ond DB resources can be restricted

– Typical commercial applications generate large numbers of small transactions from large number of users

– Grid applications can have small number of large transactions needing more ad hoc access to DBMS resources

much greater variations in time and resource usage

Grid Database Requirements (3)

• Metadata-driven access– Expect need 2-step access to dataStep 1: Metadata search to locate required

data on one or more DBMSStep 2: Data accessed, sent to compute

server for further analysis– Application writer does not know which

specific DBMS accessed in Step 2 Need standard API for Grid-enabled DBMS

• Multiple Database Integration- Support distributed queries and

transactions- Scalability requirements

• Application projects use Clusters, Supercomputers, Data Repositories

• Emphasis on support for data federation and annotation as much as computation

• Metadata and ontologies key to higher level Grid services

• For commercial success Grid needs to have interface to DBMS

Summary