an overview of high performance computing infrastructures ... · –slower adoption: prefer...

56
An overview of high performance computing infrastructures, technologies and services Dr. Fabrizio Gagliardi EMEA Director External Research Microsoft Research ESPOL Ciencia Ecuador, November 13-15 th

Upload: others

Post on 18-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

An overview of high performance computing

infrastructures, technologies and services

Dr. Fabrizio GagliardiEMEA Director

External Research

Microsoft Research

ESPOL Ciencia

Ecuador, November 13-15th

Page 2: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Outline

• Yesterday and today:

Achievements in the area of e-Infrastructures and Grid computing

• Examples beyond e-Science

• Issues : Complexity, Cost, Security, Standards

• The future:

• Cloud Computing, Virtualisation, Data Centers, Software as a Service, Multi-core architectures, Green IT

Conclusions

Page 3: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

The European Commission strategy for e-Science

12/15/2008 Ecuador 3

Mario Campolargo, EC, DG INFSOM, Director of Directorate F: Emerging Technologies and Infrastructures

http://cordis.europa.eu/fp7/ict/programme/events-20070524_en.html

Page 4: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

e-Infrastructure achievements: Research Networks

12/15/2008 Ecuador 4

Page 5: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Ecuador 5

RedCLARA + int’l links

(until March 2008)

• RedCLARA (ALICE project):

– Latin American backbone

– to Europe (GÉANT2)622 Mbps

• Financed by EU (in part)

– Mexico – USA (Pacific Wave): 1 Gbps

• Financed by NSF

• WHREN/LILA

– to USA (Atlantic Wave): 2.5 Gbps (with ANSP, RedCLARA)

• Financed by NSF + FAPESP

Page 6: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Kasetsart University, Thailand 6

Steady growth / Improved reliability~80.000 cores, ~400PBs, 261 sites, >270 VOsStats by GStat, 09/09/2008

The EGEE Production Grid Infrastructure

0

50

100

150

200

250

300

Apr 04

Jul 04

Okt 04

Jan 05

Apr 05

Jul 05

Okt 05

Jan 06

Apr 06

Jul 06

Okt 06

Jan 07

Apr 07

Jul 07

Okt 07

Jan 08

Apr 08

No. Sites

01000020000300004000050000600007000080000

Ap

r 04

Jul 0

4

Okt

04

Jan

05

Ap

r 05

Jul 0

5

Okt

05

Jan

06

Ap

r 06

Jul 0

6

Okt

06

Jan

07

Ap

r 07

Jul 0

7

Okt

07

Jan

08

Ap

r 08

No. Cores

Courtesy of Erwin Laure19/9/2008

Page 7: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

7 Ecuador

The Globalisation of Grid Infrastructures

12/15/2008

Page 8: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• E-science grid facility for Europe and Latin America

– EELA1: www.eu-eela.org/first-phase.php

• A Grid Infrastructure of 16 Resource Centres (RCs) summing up to over 730 CPU cores and 60 TB of storage

– EELA2: Consolidate and expand the current EELA e-Infrastructure and work on its sustainability

EELA and EELA-2

12/15/2008 Ecuador 8

Page 9: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

e-Infrastructure HPC next steps: EGI and PRACE

12/15/2008 Ecuador 9

European

Ecosystem

tier 0

Page 10: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• EC is expected to further invest on e-Infrastructures and Grids in particular*– Call 4 - GEANT3 and Data interoperability call closed earlier in September

(113M)– Call 5 – Small call (~10M) on Scientific publications repositories, ERA-NET

and international cooperation due March 2009– Call 6 – ICT based e-Infrastructures (EGI plus other Grid infrastructures and

Virtual Communities) due Autumn 2009 – estimated 75M € for around 2 years (50 M € EGI plus other Grids and 25 M € Virtual Communities)

– Call 7 – ICT based e-Infrastructures (Scientific Data infrastructures) due Spring 2010

– Call 8 - estimated for Spring 2012

• EC consultation on 2010 Work Programme on Grids, Data and Virtual Communities taking place in November (by invitation only)

• Next e-IRG workshop (21-22 October, Paris) is expected to promote the use of e-Infrastructures by the ESFRI facilities (ESFRI preparatory projects invited) – www.e-irg.eu

* Main source: EC Work Programme 2008 (WP2009 will be updating the above and be released in October)

EC continues to invest in Grid-based e-Infrastructures

19/9/2008 Kasetsart University, Thailand 10

Page 11: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• Grids for e-Science: mainly a success story!

– Several maturing Grid Middleware stacks

– Many HPC applications using the Grid• Some (HEP, Bio) in production use

• Some still in testing phase: more effort

required to make the Grid their day-to-day workhorse

– e-Health applications also part of the Grid

– Some industrial applications: • CGG Energy

The experience of Grid / HPC 1/3

19/9/2008 11Kasetsart University, Thailand

Page 12: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• Grids beyond e-Science?

– Slower adoption: prefer different environments, tools and have different TCOs

• Intra grids, internal dedicated clusters , cloud computing

– e-Business applications• Finance, ERP, SMEs and Banking!

– Industrial applications• Energy, Automotive, Aerospace, Pharmaceutical industry,

Telecom

– e-Government applications• Earth Observation, Civil protection:

• e.g. The Cyclops project

The experience of Grid / HPC 2/3

19/9/2008 12Kasetsart University, Thailand

Page 13: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• IT Industry demonstrated interest in becoming an HPC infrastructure provider and/or user (intra-grids):– On-demand infrastructures:

• Cloud and Elastic computing, pay as you go…• Data centers: Data getting more and more attention

– Service hosting: outsourced integrated services• Software as a Service (SaaS)

– Virtualisation being exploited in Cloud and Elastic computing (e.g. Amazon EC2 virtual instances)

• “Pre-commercial procurement”– Research-industry collaboration in Europe to achieve new

leading-edge products• Example: PRACE building a PetaFlop Supercomputing Centre in Europe

The experience of Grid / HPC 3/3

19/9/2008 13Kasetsart University, Thailand

Page 14: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

14

Examples beyond e-Science

CitiGroup (Citigroup Inc., operating as Citi, is a major American financial

services company based in New York City) adopted Grid computing

http://www.americanbanker.com/usb_article.html?id=20080825IXTFW8BS

19/9/2008 Kasetsart University, Thailand

- Citi chose Platform Computing's Symphony

grid product to consolidate its computing

assets into a single resource pool with increased

utilization

- At Citi, since the grid was implemented,

individual business units are charged for the

processing power they use, creating a shared

services environment

-Citi is now using near 20,000 CPUs and there

are periods of the day where the utilization rate

is 100 percent

-Citi is planning of using the cloud in cases their

data centers do not suffice (overflow model or

cooperative data centers)

Page 15: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

15

Examples beyond e-Science

EU BEinGRID: Computational Fluid Dynamics

19/9/2008 Kasetsart University, Thailand

Page 16: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

16

Examples beyond e-Science

CYCLOPS: Forest Fire propagation

19/9/2008 Kasetsart University, Thailand

Page 17: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

17

Examples beyond e-Science

EGEODE VO : Seismic processing based on Geocluster

application by CGG company (France)

19/9/2008 Kasetsart University, Thailand

Page 18: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

18

In summary…

• Grid computing has delivered an affordable high

performance computing infrastructure to scientists all

over the world to solve intense computing and storage

problems within constrained research budget

• This has also been effectively used by industry to

increase the usage of their HPC infrastructure and

reduce Total Cost of Ownership (TCO)

• Grid is not only aggregating computing resources but

also leveraging international research networks to

deliver an effective and irreplaceable channel for

international collaboration

19/9/2008 Kasetsart University, Thailand

Page 19: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

19

The flip side…

• Major issues with wide adoption of Grid computing in e-

Science, e-Business, industry etc. have to do with:

• Cost of operations and complexity of management

• Not a solution for all problems (latency, fine grain

parallelism are difficult)

• Difficult to use for the average single scientist

• Security and reliability in general

• Power consumption and heat dissipation are becoming

a limiting factor to consumer based distributed systems

• We are observing the limits of Moore‟s law

19/9/2008 Kasetsart University, Thailand

Page 20: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Kasetsart University, Thailand

Switching Gears:

“To Distribute or Not To Distribute”

• Prof. Satoshi Matsuoka, TITech

• Keynote at Mardi Gras Conference, Baton Rouge, Jan.31, 2008

• In the late 90s, petaflops were considered very hard and

at least 20 years off …

• while grids were supposed to happen right way

• After 10 years (around now) petaflos are “real close” but

there‟s still no “global grid”

• What happened: It was easier to put together massive clusters than to get people to agree about how to share their resources For tightly coupled HPC applications, tightly coupled machines are still necessary Grids are inherently suited for loosely coupled apps or enabling access to machines and/or data

• With Gilder's Law*, bandwidth to the compute resources will promote thin client approach

* “Bandwidth grows at least three times faster than computer power." This means that if computer power doubles every eighteen months (per Moore's Law), then communications power doubles every six months

• Example: Tsubame machine in Tokyo

19/9/2008 20

Page 21: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Supercomputing Reached the Petaflop

IBM RoadRunner at

Los Alamos National Lab

19/9/2008 21Kasetsart University, Thailand

Page 22: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

22

Emerging trends: Green IT, pay per CPU/GB

virtualisation and/or HPC in every lab? (1/4)

• The Green Grid (www.thegreengrid.org) consortium (AMD,

APC, Dell, HP, IBM, Intel, Microsoft, Rackable Systems,

Sun Microsystems and Vmware) , IBM Project Big Green (a

$1 billion investment to dramatically increase the

efficiency of IBM products) and other IT industry initiatives try

to address current HPC limits in energy and environmental

impact requirements

• Computer and data centers in energy and environmental

favorable locations are becoming important

• Elastic computing, Computing on the Cloud, Data Centers and

Service Hosting - Software as a Service, are becoming the new

emerging solutions for HPC applications

• Many-multi-core and CPU accelerators are promising potential

breakthroughs

19/9/2008 Kasetsart University, Thailand

Page 23: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

23

Emerging trends: Cloud computing and

storage on demand (2/4)

•Cloud Computing: http://en.wikipedia.org/wiki/Cloud_computing

•Amazon, IBM, Google, Microsoft, Sun, Yahoo, major „Cloud Platform‟

potential providers

•Operating compute and storage facilities around the world

•Have developed middleware technologies for resource sharing and

software services

•First services already operational - Examples:

•Amazon Elastic Computing Cloud (EC2) -Simple Storage Service (S3)

•Google Apps www.google.com/a

•Sun Network.com www.network.com (1$/CPU hour, no contract cost)

•IBM Grid solutions (www.ibm.com/grid)

•GoGrid – a division of ServePath company (www.gogrid.com) Beta

released (pay-as-you-go and pre-paid plans, server manageability)

19/9/2008 Kasetsart University, Thailand

Page 25: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

25

Today and the future: Cloud computing

and storage on demand

•Cloud Computing: http://en.wikipedia.org/wiki/Cloud_computing

•Amazon, IBM, Google, Microsoft, Sun, Yahoo, major „Cloud Platform‟

potential providers

•Operating compute and storage facilities around the world

•Have developed middleware technologies for resource sharing and

software services

•First services already operational - Examples:

•Amazon Elastic Computing Cloud (EC2) -Simple Storage Service (S3)

•Google Apps www.google.com/a

•Sun Network.com www.network.com (1$/CPU hour, no contract cost)

•IBM Grid solutions (www.ibm.com/grid)

•GoGrid – a division of ServePath company (www.gogrid.com) Beta

released (pay-as-you-go and pre-paid plans, server manageability)

12/15/2008 Ecuador

Page 26: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

12/15/2008 Ecuador 26

The Microsoft Cloud: http://www.microsoft.com/azure

Page 27: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Emerging trends: Virtualisation (4/4)

Virtual PresentationPresentation layer separate from process

Virtual StorageStorage and backup over the network

Virtual NetworkLocalizing dispersed resources

Virtual MachineOS can be assigned to any desktop or server

Virtual ApplicationsAny application on any computer on-demand

Virtualization is the isolation of one computing

resource from the others (not only Virtual Machine):

19/9/2008

Slide Courtesy of Neil Sanderson,

UK Product Manager,

Virtualisation and Management,

Microsoft Ltd.27Kasetsart University, Thailand

Page 28: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Virtualisation market: The game is only starting Only 5% of servers are virtualised

93.00%

4.90%1.75% 0.35%

World WideVirtualization Adoption

Non-virtualized servers

VMware

• Computerworld

– “Although virtualization has been the buzz among technology providers, only 6% of enterprises have actually deployed virtualization on their networks, said Levine, citing a TWP Research report. That makes the other 94% a wide-open market.”

– “We calculate that roughly 6% of new servers sold last year were virtualized and project that 7% of those sold this year will be virtualized and believe that less than 4% of the X86 server installed base has been virtualized to date.

• Pat Gelsinger, Intel VP Sept. 2007

– “Only 5% of servers are virtualized.”

Slide Courtesy of Neil Sanderson19/9/2008 28Kasetsart University, Thailand

Page 29: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

29

EGEE cost estimation (1/2)

Capital Expenditures (CAPEX):

a. Hardware costs: 80.000 CPUs ~ in the order of 120M Euros

(80-160M)

Depreciating the infrastructure in 4 years:30Meuros per year

(20M to 40M)

b. Cooling and power installations (supposing existing housing

facilities available)

25% of H/W costs: 30M, depreciated over 5 years: 6M Euros

Total: ~ 36M Euros / year (26M-46M)

Slide Courtesy of Fotis Karayannis19/9/2008 Kasetsart University, Thailand

Page 30: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

30

EGEE cost estimation (2/2)

Operational Expenditures (OPEX):

a. 20 MEuros per year for all EGEE costs (including site

administration, operations, middleware etc.)

b. Electricity ~10% of h/w costs: 12M Euros per year (other

calculations lead to similar results)

c. Internet connectivity: Supposing no connectivity costs

(existing over-provisioned NREN connectivity)

*If other model is used (to construct the service from scratch), then

network costs should be taken into account

Total 32M / year

CAPEX+OPEX= 68M per year (58-78M)

Slide Courtesy of Fotis Karayannis19/9/2008 Kasetsart University, Thailand

Page 31: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

31

EGEE if performed with Amazon EC2 and S3

In the order of ~50M Euros, probably more cost effective of EGEE

actual cost, depending on the promotion of the EC2/S3 service

Slide Courtesy of Bob Jones19/9/2008 Kasetsart University, Thailand

Page 32: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

32

Cloud mature enough for big sciences?

http://www.symmetrymagazine.org/breaking/2008/05/23/are-commercial-computing-clouds-

ready-for-high-energy-physics/

http://www.csee.usf.edu/~anda/papers/dadc108-palankar.pdf

19/9/2008Kasetsart University, Thailand

Probably not yet, as not designed for them; Does not support complex

Scenarios: “S3 lacks in terms of flexible access control and support for

delegation and auditing, and it makes implicit trust assumptions”

Page 33: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Challenge: High Productivity Computing

“Make high-end computing easier and

more productive to use.

Emphasis should be placed on time to

solution, the major metric of value to

high-end computing users…

A common software environment for

scientific computation encompassing

desktop to high-end systems will

enhance productivity gains by promoting

ease of use and manageability of

systems.”

2004 High-End Computing

Revitalization Task Force

Office of Science and

Technology Policy,

Executive Office of the

President

Page 34: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

The target 1

Data Gathering

Discovery and

Browsing

Science

Exploration

Domain specific

analyses Scientific Output

“Raw” data includes

sensor output, data

downloaded from

agency or collaboration

web sites, papers

(especially for ancillary

data)

“Raw” data browsing for

discovery (do I have

enough data in the right

places?), cleaning (does

the data look obviously

wrong?), and light weight

science via browsing

“Science variables” and

data summaries for early

science exploration and

hypothesis testing.

Similar to discovery and

browsing, but with

science variables

computed via gap filling,

units conversions, or

simple equation.

“Science variables”

combined with models,

other specialized code,

or statistics for deep

science understanding.

Scientific results via

packages such as

MatLab or R2. Special

rendering package such

as ArcGIS.

Paper preparation.

Courtesy Catherine Van Ingen, MSR

The data pipeline

19/9/2008 34Kasetsart University, Thailand

Page 36: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

External Research engagements in EMEA

37

Page 37: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

The Swiss Experiment

• Climate change effects on the regional hydrologic cycle will have profound implications for the Alps and therefore Europe

• Goals: Environment monitoring through a global sensor network across Switzerland based on collection of appropriate spatial and temporal data

– Appropriate simulation models can provide answers on the causes and consequences of climate changes

– Real time predictions of natural hazards such as floods, avalanches, and landslides

– Document environmental degradation and changes

• Outcome-Impact: Partnering with schools to deploy sensors over a wider area

– 1000 children from 10-14 targeted first

– Direct experience of the environment modifies public perception of environmental challenges

• Makes use of rich set of tools including MSR SensorMap (Virtual Earth planned), enables publishing, visualization and analysis of the data

The water tower of

Europe

EPFL: Karl Aberer, Marc ParlangeMSR: Feng Zhao, Catharine Van Ingen

Page 38: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

HPC projects enabled by MS HPC clusterPIs: Anne Trefethen/David Wallom/Myles Allen/Mike Giles

Goals-Vision:• Windows cluster part of the Oxford Grid (http://www.oerc.ox.ac.uk/facilities/oxgrid.xml)

• Computational Finance: New algorithms and training for finance community• MS HPC cluster: testing parallel Matlab; used as a backend for large Excel

calculations; analysis of data from remote-accessed microscopes (in collaboration with US NIH)

• ClimatePrediction.netOutcome-Impact:• Mike Giles lectures to finance community• The only Windows Cluster on the Oxford Grid• Microsoft support of ClimatePrediction.net on BBC program• Supercomputing ‘07 demonstration: Enabler for Optiputer-GLIF collaborations

with US

39

Oxford University eResearch Centre HPC projects

Page 39: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Climateprediction.net

• Oxford University – OeRC

Running an ensemble simulation of climate change from 1920-2080 using coupled atmosphere-ocean general circulation models running on a volunteer network of personal computers

• OeRC PI: Myles Allen

• Goals-Vision:

– A demonstration of the scientific value of large-ensemble climate simulations for regional climate forecasting in two regions: Southern Africa and the US Northwest

– A data grid portal for output from the coupled experiment

– Improve methods to quantify uncertainties of climate projection and scenarios

• Outcome-Impact:

– Regional modeling with PRECIS

– Using the BOINC voluntary desktop grid and OeRC cluster

– Collaboration with Cape Town University and UK Met Office

– Collaboration with BBC – Additional funds from BBC and NeRC

– Climate prediction experiment wins Prix Europa

– Climate change experiment nominated for BAFTA award

Page 40: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Climate Induced Vegetation Change Analysis Tool

Development of high performance systems for the analysis of how the regional climate change impacts on Northern Eurasia’s terrestrial

ecosystems

Analysis of regional climate change

Satellite remote sensing data on status and dynamics of terrestrial

ecosystems

Terrestrial ecosystems dynamics induced by climate change

Validation of methods and tools on selected test sides

Open tools and source of information for analysis of regional climate change impact on terrestrial ecosystems

Examples of scenario for terrestrial ecosystems dynamic

Developed at Russian Academy of Sciences Space Research Institute (IKI RAN) satellite remote sensing data archivesand data processing methods for vegetation monitoring (PI: Evgeny Loupian, Mikhail Zhizhin PIs)

Outcome-Impact: Active Storage Array Data Model (ASADM) on a Microsoft SQL Server database cluster for the NCEP/NCAR and ECMWF

weather reanalysis projects for the last 50 years Environmental Scenario Search Engine (ESSE) and Functional Array Manipulation Language (FAML) developed on Microsoft

platforms in collaboration with Microsoft Research Cambridge (Vassily Lyutsarev, MSRC) Windows Compute Cluster Server for mesoscale (10 km) parallel atmospheric circulation modeling (MM5) and user

interface development Microsoft Virtual Earth for visualization of the regional climate and vegetation change

Goals-Vision:

Page 41: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

PIs: Yuriy Boldyrev and Alexander SnegirevGoals-Vision: development of a new efficient computational methodology capable of predicting the interaction of fine water spray with buoyant turbulent diffusion flame occurring in fires in human sensitive environments

Windows-based parallelized software being developed to allow regimes of flame extinguishment to be investigated and optimum droplet size determined for different

fire scenariosOutcome-Impact: Halon replacement for fire protection in Historical

buildings, High-rise buildings, Libraries, Archives, Car parks, Oil and gas platforms, ships and planes

Experimental observation and Fire3D simulation of buoyant turbulent enclosed flame and water spray impacting the flame

Page 42: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Goals-Vision: detailed mathematical simulation ofdynamic behavior of the dam of Saint Petersburgflood defense system ship passing channel underthe load of turbulent moving water.Accurate time dependent CFD (computational fluiddynamics) simulations are needed to determinethe structure of the flow around the gate section.Such simulations need massive parallelcomputation on HPC platforms.Outcome-Impact: production use of the MS HPCcluster with CFD codes Ansys Fluent and Ansys CFXto support the completion of the Saint Petersburgflood defense system and avoid the regular floodsin this famous and cultural heritage rich city.Numerical simulation and experimental observation

of the turbulence vortex under lower edge of the gate.

St. Petersburg Polytechnic University (Russia)

Sergey Lupuleac

PIs: Yuriy Boldyrev and

Sergey Lupuleac

Page 43: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Genome-level Computational Analysis

Better prevention, diagnosis and treatment of common diseases by detecting subtle relationships between genetic variation and gene expression• Richard Durbin (Sanger)• John Winn (MSRC)

Goals

– Study effects of multiple genes jointly

– Analyze larger combined data sets

– Combine environmental and genomic factorsImpacts

– Developed and validated a method which leads to three times as many genetic variant (SNP) to gene expression associations, than found using standard methods (around 2800 novel cis associations)

– A much larger percentage of genes are in this category than previously thought

– Development of general purpose tools (across multiple diseases)

– Current exemplar is in Crohn’s Disease (CD). This work should enable the pathogenic mechanisms of CD to be better understood and linked back to genetic causes

Page 44: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Software and Algorithms for Genetics

Identify genetic causes of diseases by collecting genetic marker data (SNP) and developing new statistical methods and software to analyze the data• Dan Geiger, Assaf Shuster (Technion)• David Heckerman, Chris Meek (MSR USA)

Goals-Vision:

– Detection of diseases before birth

– Risk assessment and corresponding life style changes

– Finding the mutant proteins and developing medicine

– Understanding basic biological functions

Outcomes-Impacts:– Locating the responsible for neurological and kidney disorders diseases genes is the major

expected result– Found a novel gene mutation causing a severe neurodegenerative disorder associated

with brain hypomyelinization and leukodystrophy– Developing a new study method of diseases via admixed populations (such as Ashkenazi-

Sephardic Jews or African-American population)– Deployed heterogeneous system of large scale with increased functionality (superlink-

online). Virtual super computer for genetics users (based on Boinc, Condor, EGEE and commercial cloud computing)

Page 45: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Inference of Gene Regulation Network: Developing new probabilistic Message Passing Algorithms (MPAs) for combinatorial optimization problems, which are used in the entire range of modern system biology, including data clustering and classification problems, as well as reverse-engineering problems

• Riccardo Zecchina, University of Turin, Italy

• Jennifer Chayes, Christian Borgs, MSR USA

• Goals

– Knowledge of gene-regulatory networks

• discovery of causal interaction chains (pathways, network modules)

• identification of drug targets repressing or activating set of genes

• therapy for diseases with complex regulatory failure (e.g., cancer)

• Impact/Results– Algorithms applicable to a series of sub-projects, such as on the HapMap project, a biological dataset

containing DNA changes in 4 human sub-populations, or on a study of gene data from HIV infected monkeys

– Patenting an algorithm on survey propagation for finding Steiner trees – with applications to multicasting, peer-to-peer networks, many-core, ad matching, etc

– MSR ER investment leveraged 10 fold contribution by the local funding agencies

– 3 Postdoc MSR investment leverages 25 other postdocs locally funded46

Survey propagation algorithms in Computational Biology

Page 46: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Imaging algorithms for cancer& Image segmentation of the heart Oxford University – OeRC

Develop algorithms to advance image analysis for cancer and cardiac research

• OeRC PI: Sir Mike Brady

• MSRC PI: Andrew Blake

• Goals-Vision:

– Identify & develop new imaging algorithms for cancer along with easy access framework (e.g. portal) for researchers and clinicians

– May lead to breakthrough in visualising tumors, cardiac anomalies!

– Combining science and engineering to create a step change in patient management

• Outcomes-Impacts:

– Used a semi-automatic method to segment colorectal cancer MR images through calculating non-parametric statistics of intensity values in images

– Lymph node detection and its malignancy status to be determined based on a multi-scale feature point detection algorithm : initial experiments - in progress

– Paper on integer-valued optimisation in images-Computer Vision Int’l Conference

» Poster on initial results at Microsoft e-Science workshop

– Moving to a fully automatic (unsupervised) method: initial results – in progress

Page 47: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Swiss Institute of Bioinformatics (SIB): Saving Patients Using MS

Using Mass Spectrometry to systematically screen the patients for drugs, present in the patients system, and to use a reference database to warn of possible conflicts with other prescribed drugs

• Ron D. Appel, Frederique Lisacek, SIB

• Brendan Murphy, MSRC• Goals: While mass spectrometry is being increasingly introduced in clinical settings, no large-scale

screening methods of the effect of ingested drugs on circulating proteins exist at the moment. The final goal is to design and implement an efficient platform (software and database) that can automatically and quickly recognise in numerous mass spectra obtained from patient blood or urine samples, the effect(s) on circulating peptides (protein profile) of the variable presence of drug chemical fragments. The project, if successful, will result in a significant impact in hospitals worldwide.

• Outcome/Impact:

• Use of the Catalogue Archive Server Jobs System (CASJobs) intended for batch query services for web-based SQL access to databases (cf: Jim Gray, Microsoft Research, Redmond). Implementation of rough database scheme based on semi-standardised format of proteomics data. Early tests completed

• The department of pathology of the Geneva hospital, partner and end-user of the SIB-SPUMS project, has been recently granted an important multiyear funding from the national Swiss funding agency for the establishment of a national centre of toxicology; another confirmation of the important leveraging effect of MS ER investments.

• Collaboration with Georges Pompidou European Hospital (Prof. Beaune, Paris) to be started in 2009

Page 48: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

University of Cambridge:FENS (Fighting Emissions with Nuclear

Magnetic Resonance) How to cut worldwide emissions by using Nuclear Magnetic Resonance and Bayesian signal

analysis

• PI: Lynn Gladden (Univ. Cambridge)

• MSRC PI: Andrew Blake (MSRC)

• Goals:To establish a research collaboration with Microsoft to develop the next generation of magnetic resonance (MR) techniques for Chemical Engineering

It could lead to a decrease in CO2 emissions of 400 million tonnes per annum. This is approximately equal to 2% of the annual worldwide production of CO2

• Outcome/Impact:

• Advances in Magnetic Resonance as a tool for measurement and analysis of chemical processes, yielding much shorter duty cycles in measurement of density and flow, and hence much improved temporal resolution

• Efficiency of statistical use of information, leading to much improved effective signal to noise performance, raising the prospect of low cost, field-deployable MR sensing

• Showcase Bayesian inference in state of the art instrumentation, with high impact on clean technology and climate related issues

Page 49: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

High Performance Computing Center Stuttgart : Research and development of Software and Visualization Tools for

the Automotive IndustryUnique combination worldwide to demonstrate the value of Microsoft solutions in the HPC space and for mission critical

applications in world leading car manufactures

• Michael Resch (Stuttgart Univ.)

• Dennis Crain (MS)

• Goals: Augmented Reality (AR) visualization and porting OpenMPI to Windows HPC Server 2008: Collaborating with Daimler AG which assures that the research results can later be applied in real engineering environments. Several prototypical AR applications will be developed, adopted to selected scenarios and finally evaluated in terms of their usability during the optimization process of thermal comfort.

• OpenMPI: aims at building a completely new MPI implementation, based on the experience of several earlier MPI-projects. The goal is to offer a complete high-performance and thread-safe MPI-2 implementation.

• Outcome/Impact: Good test-bench for MSR and MS technology. Interoperability, demo and deployment in a critical market segment.

• The Augmented Reality methods developed within this project will be applicable to many more applications than just engineering. Mobile devices like Mobile phones with graphics hardware and built in cameras or ultra mobile PCs are ideal platforms for AR applications. Applications can be everything from Medical assistance by remote doctors, variation studies in construction or refurbishments to games and entertainment. In the short run, this project will show where and how Augmented Reality can be applied in automotive engineering. In the long run this research might lead to new applications for Microsoft in all the before mentioned areas.

• First performance results show that the performance of OpenMPI is comparable to MS-MPI event though it is not yet using winsock direct API, thus even better results are expected in the future.

Page 50: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

University of Southampton: Spitfire Cluster

• A computing cluster that would allow various industrial and academic users to work on their problems and codes in collaboration with academics at Southampton on a machine built using the Microsoft windows 64bit HPC platform

• Andy Keane, Simon Cox (Southampton University)

• Goals: Work on the Cluster enables a broad community of mainly engineering users who are in turn working with a range of companies, including in particular Rolls-Royce, Airbus, Ansys/Fluent and DePuy. The principal aim is to allow both the research and industrial engineering community to gain experience with using the MS compute cluster server tools in supporting their work.

• Outcome/Impact: Good interactions have been had with Ansys Inc and their Fluent team who have helped tune up and assess the use of their CFD codes on the machine.

• Extensive work has also been carried out for Rolls-Royce including a very large optimization study of one of the blade designs on their current Trent 1000 engine programme for the Boeing Corporation 787 Dreamliner. Significant product improvement has been achieved and the utility of the MS system in this work reported to RR senior engineering staff.

• DePuy are engaged on research for medical implant design in collaboration with the team using Spitfire and a range of new statistical techniques are being developed to improve this process.

• Research is being performed to investigate landing gear noise sources using the SotonCAA high-order numerical aeroacoustics code. The objective is to predict and better understand important acoustic sources and resulting far-field noise characteristics to aid the development of future low-noise landing gear.

Page 51: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

MSRC joint institutes in Europe

• Advanced parallel computing architecture at theBarcelona Super Computing Centre of the PolytechnicUniversity of Barcelona

• Formal methods for computer security at the INRIALaboratory in Orsay (Paris)

• Computational System Biology at the CoSBi centre withthe University of Trento in Italy

Page 52: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Computing research and development including new hardware and software modules for the next generation of multiprocessor chips

• Mateo Valero, Barcelona Supercomputing Center

• Tim Harris, Satnam Sigh and Simon Peyton Jones, MSRC

• Goals

– Use of transaction programming models / transactional memory improving software applications performance and reducing implementation and content costs

– New hardware modules in the Field-Programmable-Gate-Array architectures, hardware support for garbage collector acceleration

• Outcome/Impact– Extending OpenMP with TM

– The Quake game and the Intel RMS application - an application software model for terascale computing which stands for Recognition, Mining and Synthesis (a killer apps of the not-too-distant future)

– Successful Workshop at joint BSC-MSR Research Center organized in Barcelona in June 2008 with > 200 top computer architects from around the world

– Many important publications and ideas for new research subjects

53

Computer architecture research at the Barcelona SuperComputer Centre

Page 53: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

The Microsoft Research - University of Trento Centre for Computational and Systems Biology (CoSBi)

• The CoSBi cluster is a HPC (High Performance Computing) cluster, a distributed memory supercomputer built using a set of interconnected nodes. The nodes use a standard PC x86 64-bit architecture, but are made to maximize performance and optimize space and power dissipation.

• Corrado Priami (CoSBi)

• Goals: The main goal is to acquire a cluster architecture to carry out simulations of biological systems modelled through the techniques developed at CoSBi and to allow the Politecnicodi Torino to access the cluster to run ab initio simulations.

• Outcome-Impact: One major achievement has been the abilityto use the cluster for time-consuming analyses, especiallyanalyses and simulations that require to be run on multiple input data and with different parameters. Running this kind ofprogram was impossible for CoSBi scientists before theintroduction of the cluster in their working environment.

Page 54: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• We are at a flex point in the evolution of distributed computing (nothing new under the sun…)

• A very exciting time for entering research in computing

• Grid remains a good solution for a reduced number of communities (and often for social/political reasons)

• Cloud computing and hosted services are emerging as the next incarnation of distributed computing with some obvious additional advantages (think of data centres located in Iceland or Siberia)

55

Conclusion (1/2)

19/9/2008 Kasetsart University, Thailand

Page 55: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

• Many changes are happening in the basic underpinning technology (parallel everywhere!)

• We need much more effort in parallel computing

• New boundary constraints and very much energy are becoming the limiting factor to the otherwise still valid Moore’s Law…

• If we will be able to harness the potential enormous power of parallel computing (not so good story so far) then we might be able to provide better computing solutions for research also in energy and eventually better energy solutions for our computing needs!

56

Conclusion (2/2)

19/9/2008 Kasetsart University, Thailand

Page 56: An overview of high performance computing infrastructures ... · –Slower adoption: prefer different environments, tools and have different TCOs •Intra grids, internal dedicated

Thanks to the organizers for the kind invitation and to all of you

for your attentionContact me at:

Fabrig @ microsoft.com

57

Thanks

12/15/2008 Ecuador