
Page 1: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

GRID Applications
Tuğba Taşkaya Temizel

20 February 2006

Page 2: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Problems Where Grids Have Been Successful

Megacomputing problems: The problems are divided into parallel independent parts.

Mega and seamless access problems: Integrated access and use of multiple data and resources.

Loosely coupled nets: Functionally decomposed sequential problems.
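The megacomputing pattern above (independent parts computed in parallel, then combined) can be sketched as follows. This is only an illustration: the `count_primes` work function is a hypothetical stand-in for a real workload, and a local thread pool stands in for Grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Count primes in [lo, hi); each call is independent of the others."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split(lo, hi, parts):
    """Divide [lo, hi) into contiguous, non-overlapping chunks."""
    step = (hi - lo + parts - 1) // parts
    return [(s, min(s + step, hi)) for s in range(lo, hi, step)]


def parallel_count(lo, hi, parts=4):
    # Workers stand in for Grid nodes; no chunk needs data from another.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(count_primes, split(lo, hi, parts)))
```

Because the chunks share no state, the only coordination needed is the final sum, which is what makes this class of problem a good fit for loosely connected resources.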

Page 3: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Grid Applications

Community-centric: Get the organisations together for collaboration.

Data-centric: Integration of multiple resources

Compute-centric: Certain coupled applications and seamless access to multiple back-end hosts

Interaction-centric: Corresponds to problems requiring real-time responses

Page 4: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Application Fields

Astronomy
Bioinformatics
Environmental Science
Particle physics
Medicine and Health
Social Sciences
Combinatorial Chemistry
...

Page 5: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Observatory

TOTAL BUDGET : $ 20 million (US)

DURATION : 2002-2005

TYPE : INTERNATIONAL

URL : http://www.ivoa.net

Page 6: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Observatory

Objective: To facilitate the international coordination and collaboration necessary for the development and deployment of the tools, systems and organizational structures necessary to enable the international utilization of astronomical archives as an integrated and interoperating virtual observatory.

Page 7: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Observatory

Data creators:
  create the data and store it in an archive
  describe the process of data creation in standard modelling terms
  describe data products according to IVOA standards
  implement an automated publication and registration mechanism

Data providers:
  enable web access to archives
  choose data products to be published
  register data products with IVOA
  support discovery/query services on data products
  support federation

Service providers:
  implement data discovery/query/analysis/creation services
  enable web access to results of these services
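The register-then-discover flow described for data providers can be sketched as a small in-memory registry. This is not the IVOA registry interface; the class names, the fields, and the `example.org` URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    identifier: str
    provider: str
    wavelength: str      # hypothetical descriptive attribute
    access_url: str      # where the archive serves the product


@dataclass
class Registry:
    products: dict = field(default_factory=dict)

    def register(self, product):
        """A data provider publishes a product description."""
        self.products[product.identifier] = product

    def query(self, **criteria):
        """Discovery: return products whose attributes match all criteria."""
        return [p for p in self.products.values()
                if all(getattr(p, k) == v for k, v in criteria.items())]


registry = Registry()
registry.register(DataProduct("img-001", "archive-a", "optical",
                              "http://example.org/a/img-001"))
registry.register(DataProduct("img-002", "archive-b", "infrared",
                              "http://example.org/b/img-002"))
optical = registry.query(wavelength="optical")
```

Federation in this picture amounts to several archives registering against the same catalogue, so that one query spans all of them.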

Page 8: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Observatory

Problems:

One common data format structure: Translation mechanisms exist. Each data provider should advertise its data format. The HDF5 format has recently been proposed to overcome this difficulty.

Query services: Basic queries (queries for a specific data product) have been provided, but more complex queries are needed for theoretical results.

Simulators: Algorithms that create new data from previously published data resources.

Modelling/Describing Simulations: Correct classification of simulations (in terms of subject, type, implementation choice, data product).

Page 9: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Sky

PARTNERS: Caltech Center for Advanced Computing Research, Johns Hopkins University, the Sloan Sky Survey, Microsoft Research

PORTED TO TERAGRID

URL : http://virtualsky.org

Page 10: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Sky

Provides seamless, federated images of the night sky; not just an album of popular places, but also the entire sky at multiple resolutions and multiple wavelengths

Federates many different image sources into a unified interface

Architecture is based on a hierarchy of precomputed image tiles (mosaic), so that response is fast.
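A hierarchy of precomputed tiles makes lookup a matter of index arithmetic rather than image processing. The slides do not describe Virtual Sky's actual tiling scheme, so the sketch below assumes a simple layout in which level 0 is one tile covering the whole sky and each level doubles the tile count on both axes; the function names are hypothetical.

```python
def tile_index(ra_deg, dec_deg, level):
    """Return (level, column, row) of the tile containing a sky position.

    Assumed layout: right ascension maps linearly to columns and
    declination to rows, with 2**level tiles along each axis.
    """
    n = 2 ** level
    col = min(int((ra_deg % 360.0) / 360.0 * n), n - 1)
    row = min(int((dec_deg + 90.0) / 180.0 * n), n - 1)
    return level, col, row


def parent(tile):
    """Return the coarser tile that contains the given tile."""
    level, col, row = tile
    if level == 0:
        raise ValueError("the root tile has no parent")
    return level - 1, col // 2, row // 2
```

Serving a view then means fetching the few tiles at the requested level that overlap it, which is why the response stays fast regardless of how expensive the original resampling was.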

Page 11: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
Virtual Sky

Problem: Demand for high computational power for resampling the raw images. For each pixel of the image, several projections from pixel to sky and the same number of inverse projections are required.

Problem: Federation of the heterogeneous image resources causes a loss of information

Page 12: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
MONTAGE

Partners: California Institute of Technology, NASA, Caltech University

Duration: 2002-2005

URL : http://montage.ipac.caltech.edu/

PORTED TO TERAGRID

Page 13: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
MONTAGE

Comprehensive mosaicking system that allows broad choice in the resampling and photometric algorithms

Offers simultaneous, parallel processing of multiple images to enable fast, deep, robust source detection in multi-wavelength image space.

Page 14: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
MONTAGE

Data fetched from the most convenient place

Computing is done at any available platform

Replica Management: Intermediate products are cached for reuse

Virtual Data: User specifies the desired data using domain-specific attributes and not by specifying how to derive the data
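The two ideas above, requesting data by attributes and caching derived products for reuse, can be sketched together. This is not Montage's implementation; `VirtualDataCatalog`, its methods, and the `mosaic` recipe are hypothetical names used only to show the pattern.

```python
class VirtualDataCatalog:
    """Maps attribute-based requests to derivations, caching the results."""

    def __init__(self):
        self._recipes = {}     # product kind -> derivation function
        self._cache = {}       # (kind, sorted attributes) -> product
        self.derivations = 0   # how many times a recipe actually ran

    def add_recipe(self, kind, func):
        self._recipes[kind] = func

    def request(self, kind, **attrs):
        # The caller names what it wants, not how to compute it.
        key = (kind, tuple(sorted(attrs.items())))
        if key not in self._cache:
            self._cache[key] = self._recipes[kind](**attrs)
            self.derivations += 1
        return self._cache[key]


catalog = VirtualDataCatalog()
catalog.add_recipe(
    "mosaic",
    lambda band, region: f"mosaic of {region} in band {band}")

first = catalog.request("mosaic", band="J", region="M31")
again = catalog.request("mosaic", region="M31", band="J")
```

The second request is answered from the cache even though its keyword order differs, since the key is built from the sorted attributes.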

Page 15: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
QUEST

Partners: Yale University, Indiana University, Centro de Investigaciones de Astronomía, Universidad de Los Andes

URL : http://hepwww.physics.yale.edu/www_info/astro/quest.html

Page 16: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
QUEST

Objectives:

Transient gravitational lensing: This will lead to a better understanding of the nature of the non-luminous mass of the Galaxy.

Quasar gravitational lensing: At much larger scales than our Galaxy, the Quest team hopes to detect strong lensing of very remote objects such as quasars.

Supernovae: The Quest system will be able to detect large numbers of very distant supernovae, leading to prompt follow-up observations, and a better understanding of supernova classification, as well as their role as standard candles for understanding the early Universe.

Gamma-ray burst (GRB) afterglows: Quest will search for these fading sources, and try to correlate them with known GRBs.

Page 17: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

ASTRONOMY
QUEST

Architecture:

Page 18: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

COMBINATORIAL CHEMISTRY
COMB-E-CHEM

Partners: Southampton Chemistry Department, Mathematics, ECS, Bristol Chemistry, with backing from Pfizer, Roche and IBM

£2.2M project
Started in 2001
National e-Science Pilot project
URL: http://www.combechem.org

Page 19: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

COMBINATORIAL CHEMISTRY
COMB-E-CHEM

Objective: Develop new ways of collaborative working over the Grid to handle the hugely increasing flow of information on molecular and crystal structures arising from the application of Combinatorial Chemistry.

Facilitate the understanding of how molecular structure influences the crystal and material properties.

Page 20: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

COMBINATORIAL CHEMISTRY
COMB-E-CHEM

Page 21: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Goals

Find the mechanism responsible for mass in the universe, and the “Higgs” particles associated with mass generation, as well as the fundamental mechanism that led to the predominance of matter over antimatter in the observable cosmos.

Page 22: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Challenges

Providing rapid access to data subsets drawn from massive data stores, rising from petabytes in 2002 to ~100 petabytes by 2007, and exabytes (10^18 bytes) by approximately 2012 to 2015.

Providing secure, efficient, and transparent managed access to heterogeneous worldwide-distributed computing and data-handling resources, across an ensemble of networks of varying capability and reliability.

Page 23: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Challenges

Tracking the state and usage patterns of computing and data resources in order to make possible rapid turnaround as well as efficient utilisation of global resources

Providing the collaborative infrastructure that will make it possible for physicists to contribute effectively.

Building regional, national, continental, and transoceanic networks, with bandwidths rising from the gigabit per second to the terabit per second range over the next decade.

Page 24: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Grid projects

PPDG (Particle Physics Data Grid)
GriPhyN (Grid Physics Network)
iVDGL (International Virtual Data Grid Laboratory)
DataGrid
LCG (Large Hadron Collider Computing Grid)
CrossGrid

Page 25: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
PPDG (Particle Physics Data Grid)

Formed in 1999

Objective: To address the need for Data Grid services to enable the worldwide-distributed computing model of current and future high-energy and nuclear physics experiments.

URL: www.ppdg.net

Page 26: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
GriPhyN (Grid Physics Network)

Objective: Focused on the creation of Petabyte Virtual Data Grids that meet the data-intensive computational needs of a diverse community of thousands of scientists spread across the globe.

URL: http://www.griphyn.org

Page 27: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
iVDGL (International Virtual Data Grid Laboratory)

The iVDGL is tasked with establishing and utilizing an international Virtual-Data Grid Laboratory (iVDGL) of unprecedented scale and scope, comprising heterogeneous computing and storage resources in the U.S., Europe and ultimately other regions linked by high-speed networks, and operated as a single system for the purposes of interdisciplinary experimentation in grid-enabled, data-intensive scientific computing.

URL: http://www.ivdgl.org/

Page 28: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Goals

Deploy a Grid laboratory
  Support research mission of data intensive experiments
  Provide computing and personnel resources at university sites
  Provide platform for computer science technology development
  Prototype and deploy a Grid Operations Center (iGOC)

Integrate Grid software tools
  Into computing infrastructures of the experiments

Support delivery of Grid technologies
  Hardening of the Virtual Data Toolkit (VDT) and other middleware technologies developed by GriPhyN and other Grid projects

Education and Outreach
  Lead and collaborate with Education and Outreach efforts
  Provide tools and mechanisms for underrepresented groups and remote regions to participate in international science projects

Page 29: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
iVDGL Sites (February 2004)

[Figure: map of iVDGL sites. Site labels: UF, UW Madison, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, J. Hopkins, Caltech, FIU, Austin, Michigan, LBL, Argonne, Vanderbilt, UCSD, Fermilab, Iowa, Chicago, UW Milwaukee, ISI. Legend: Tier1, Tier2, Other. Partners: EU, Brazil, Korea.]

Page 30: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
DataGrid

DataGrid is a project funded by the European Union.

The objective is to build the next generation computing infrastructure providing intensive computation and analysis of shared large-scale databases, from hundreds of TeraBytes to PetaBytes, across widely distributed scientific communities.

URL: eu.datagrid.webcern.ch
Duration: 2001-2003

Page 31: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
LCG (Large Hadron Collider Computing Grid)

The aim is to prepare the computing infrastructure for the simulation, processing, and analysis of LHC data for all four of the LHC collaborations.

URL : http://lcgrid.web.cern.ch

Page 32: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
Global LHC Data Grid Hierarchy

[Figure: tiered data grid diagram. Recoverable labels: CMS Experiment, Online System, CERN Computer Center, USA, Korea, Russia, UK, Tier2 Centers, Institutes, Physics caches, PCs; tiers Tier 0 to Tier 4; link rates 0.1 - 1.5 GBytes/s, 2.5-10 Gb/s, 1-10 Gb/s, 10-40 Gb/s, 1-2.5 Gb/s.]

~10s of Petabytes/yr by 2007-8
~1000 Petabytes in < 10 yrs?
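To put the quoted link rates and data volumes side by side, a rough transfer-time estimate helps. The sketch below assumes decimal units (1 petabyte = 10^15 bytes), a fully utilised link and no protocol overhead, so the figures are lower bounds; the function name is hypothetical.

```python
PETABYTE = 10 ** 15          # bytes, decimal convention (assumption)
SECONDS_PER_DAY = 86_400


def transfer_days(volume_bytes, link_gbit_per_s):
    """Days needed to move a volume over a link at its nominal rate."""
    bits = volume_bytes * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds / SECONDS_PER_DAY


days_at_10 = transfer_days(PETABYTE, 10)     # about 9.3 days
days_at_2_5 = transfer_days(PETABYTE, 2.5)   # about 37 days
```

Even at 10 Gb/s a single petabyte occupies a link for more than a week, which is one motivation for distributing the data across tiers rather than shipping everything on demand.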

Page 33: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
CrossGrid

Objective: Developing, implementing, and exploiting new Grid components for interactive compute- and data-intensive applications such as simulation and visualization for surgical procedures, flooding crisis team decision-support systems, distributed data analysis in high-energy physics, and air pollution combined with weather forecasting.

URL: www.crossgrid.org

Page 34: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

HIGHER ENERGY PHYSICS
The CrossGrid architecture

[Figure: layered architecture diagram. Recoverable labels, grouped by layer heading where one appears in the figure:

Layers: Applications, Supporting Tools, Applications Development Support, App. Spec Services, Generic Services, Fabric

Applications: 1.1 BioMed, 1.2 Flooding, 1.3 Interactive Distributed Data Access, 1.3 Data Mining on Grid (NN), 1.4 Meteo Pollution

Tools and development support: 2.2 MPI Verification, 2.3 Metrics and Benchmarks, 2.4 Performance Analysis, 3.1 Portal & Migrating Desktop, MPICH-G, 1.1, 1.2 HLA and others

Services: 1.1 Grid Visualisation Kernel, 1.1 User Interaction Services, 1.3 Interactive Session Services, 3.1 Roaming Access, 3.2 Scheduling Agents, 3.3 Grid Monitoring, 3.4 Optimization of Grid Data Access, DataGrid Replica Manager, Globus Replica Manager, DataGrid Job Submission Service, GRAM, GSI, Replica Catalog, GIS / MDS, GridFTP, Globus-IO

Fabric: Resource Manager (CE) with CPU, Resource Manager (SE) with Secondary Storage, Resource Manager with Instruments (Satellites, Radars), 3.4 Optimization of Local Data Access, Tertiary Storage]

Page 35: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
Challenges

To provide a usable and accessible computational and data management environment

To provide sufficient support services

To ensure that the science performed on the grid constitutes the next generation of advances

To accept feedback from bioinformaticians and to improve the next generation of infrastructure

Page 36: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
Grid Applications

CEPAR (Combinatorial Extension in PARallel) and CEPort: 3D protein structure comparison

Chemport: a quantum mechanical biomedical framework

Page 37: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
CEPAR: a computational biology application

A typical protein consists of about 300 amino acids, each one of 20 types, giving a total of 20^300 possible sequences.

With 30,000 protein chains in the PDB (Protein Data Bank), and each pair taking 30 s to compare, an all-against-all comparison needs (30k * 30k / 2) * 30 s, or about 428 CPU years on one processor.

Strategy: data reduction, data optimization and efficient scheduling of the CE (Combinatorial Extension) algorithm. On 1000 CPUs of the 1.7 Teraflop IBM Blue Horizon, the problem was solved in a few days.
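The 428 CPU-year figure can be checked directly from the numbers on the slide. The sketch below uses the slide's own approximation of the pair count (n * n / 2) and a 365-day year; the function and variable names are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def pairwise_cpu_years(chains, seconds_per_pair):
    """CPU years for an all-against-all comparison on one processor."""
    pairs = chains * chains / 2     # the slide's approximation of n(n-1)/2
    return pairs * seconds_per_pair / SECONDS_PER_YEAR


cpu_years = pairwise_cpu_years(30_000, 30)

# Ideal wall-clock time if the brute-force work were spread perfectly
# over 1000 processors, with no data reduction at all.
ideal_days_on_1000 = cpu_years * 365 / 1000
```

Perfect parallelism over 1000 CPUs would still leave roughly 156 days of brute-force work, which suggests that the data reduction and optimization steps, and not parallelism alone, account for much of the drop to a few days.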

Page 38: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
Chemport: a computational chemistry framework

Chemistry computation for general atomic, molecular and electronic structure systems

Computational and functional analysis of biomolecules via classical and quantum mechanical simulation

Page 39: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
eDiaMoND

A Grid-enabled federated database of annotated mammograms

eDiaMoND is a collaborative project funded through an EPSRC grant and IBM's SUR grant

URL : www.ediamond.ox.ac.uk

Page 40: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
eDiaMoND goals

It has a significantly large distributed database of mammograms (400 cases per site with a majority annotated).

It aligns with and complies with new IT policies for the NHS in that it is secure and wins the confidence of the relevant legal, ethical and NHS Trust IT officers. In addition, the system will follow all known guidelines for the deployment of NHS patient and health records.

It is scalable and is designed in such a way that it could scale to cope conceptually with millions of images spread around the 90+ Breast Care Units in the UK.

Page 41: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

BIOINFORMATICS
eDiaMoND goals

It is effective in that it is fast, it is useful to the clinicians in the areas of screening, training, epidemiology and computer aided detection, and it is intuitive for the users.

It must be built such that upgrades of platform or image analysis software are graceful.

It is reusable, in that the platform could be used as a foundation for other e-health projects.

It is based on Grid architecture.

Page 42: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Grid Applications

What new challenges do these applications represent?

Are there new paradigms and problems here?

Page 43: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Case Study: News Service Application

Problem:

The underlying application is to be used by a News Service organization whose purpose is to electronically publish news bulletin messages to various subscribers. The News Service organization publishes bulletin messages within various categories, such as Business News, Sports, and Weather.

Page 44: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Case Study: News Service Application

Tasks:

Writers gather news and submit the news bulletins for approval via this application.

Editors are informed of any pending bulletins that the writers have submitted. The editors log on to the application, are authenticated by the application and retrieve the pending news. Upon review of the news bulletins, they either approve or reject them. All approved news bulletins are subsequently published by the application to all registered subscribers.

The Administrator is responsible for starting and stopping the application and performing other necessary administrative functions.

The News Service organization allows other business partner organizations to submit news bulletins. Upon receipt of news bulletins from the business partner organizations, the administrator loads them into the application for further review by the editor and publishing to the subscribers.
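The submit, review and publish flow described in the tasks can be sketched as a small state machine. This is an illustration only: the class and method names are hypothetical, editor authentication is left out, and delivery per category is an assumption (the slides say only that approved bulletins go to all registered subscribers).

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


class NewsService:
    def __init__(self):
        self._bulletins = []      # a bulletin's id is its list index
        self._subscribers = {}    # category -> list of delivery callbacks

    def subscribe(self, category, deliver):
        self._subscribers.setdefault(category, []).append(deliver)

    def submit(self, source, category, text):
        """Used by writers, and by the administrator for partner bulletins."""
        self._bulletins.append({"source": source, "category": category,
                                "text": text, "status": Status.PENDING})
        return len(self._bulletins) - 1

    def pending(self):
        """Ids of bulletins still waiting for an editor's decision."""
        return [i for i, b in enumerate(self._bulletins)
                if b["status"] is Status.PENDING]

    def review(self, bulletin_id, approve):
        bulletin = self._bulletins[bulletin_id]
        if bulletin["status"] is not Status.PENDING:
            raise ValueError("bulletin has already been reviewed")
        bulletin["status"] = Status.APPROVED if approve else Status.REJECTED
        if approve:
            # Publishing happens only on approval.
            for deliver in self._subscribers.get(bulletin["category"], []):
                deliver(bulletin["text"])
        return bulletin["status"]
```

A short usage example: a subscriber registers a callback for "Sports", a writer submits a bulletin, and the callback fires only once an editor calls `review(bulletin_id, approve=True)`.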

Page 45: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Case Study: News Service Application

The System Context:

Page 46: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Case Study: News Service Application

The use cases:

Page 47: GRID Applications Tuğba Taşkaya Temizel 20 February 2006

Case Study: News Service Application

The architecture overview: