grid07 6 jacq

www.healthgrid.org. World-wide in silico drug discovery against neglected and emerging diseases on grid infrastructures. Dr Nicolas Jacq, HealthGrid association. Credit: the WISDOM collaboration, http://wisdom.healthgrid.org. International Symposium on Grids for Science and Business, 12 June 2007.


Page 1: Grid07 6 Jacq

www.healthgrid.org

World-wide in silico drug discovery against neglected and emerging diseases on grid infrastructures

Dr Nicolas Jacq, HealthGrid association

Credit: the WISDOM collaboration, http://wisdom.healthgrid.org

International Symposium on Grids for Science and Business, 12 June 2007

Page 2: Grid07 6 Jacq

Jacq - 12 June 2007 2

The HealthGrid association

• The vision of HealthGrid is the deployment of e-infrastructures able to interoperate geographically distributed repositories of health-related data and the integration of high-end processing services on top of them.

• Some key aspects are:
– The integration of health-related actors in grid projects
– The integration of grid standards and medical informatics standards for interoperability
– The deployment of pilots for new ways of research and new methods
– The integration of bioinformatics community and medical informatics

• The mission of HealthGrid is to foster the communication among the different key actors and to catalyse joint research actions at international level

Page 3: Grid07 6 Jacq

Main achievements

• Edition of the HealthGrid Whitepaper in 2005 outlining the concept, benefits and opportunities offered by applying grids in different applications in biomedicine and healthcare
– http://whitepaper.healthgrid.org

• Involvement as full partner in several projects
– SHARE (SSA): http://www.eu-share.org
– EGEE II (I3): http://www.eu-egee.org
– ACGT (IP): http://www.eu-acgt.org

• Organisation of the HealthGrid conference since 2003
– HealthGrid.US Alliance will host the 6th International HealthGrid Conference in Chicago – Spring 2008

• Development of the health grids knowledge base
– http://kb.healthgrid.org

Page 4: Grid07 6 Jacq

Content

• WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases

• Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives

Page 5: Grid07 6 Jacq

Goal of the WISDOM initiative

• WISDOM stands for World-wide In Silico Docking On Malaria

• Goal: contribute to the development of new drugs for neglected and emerging diseases, with a particular focus on malaria and avian flu

• Specificity: rely extensively on emerging information technologies to provide new tools and environments for drug discovery

• Initial focus: virtual screening

• Web site: http://wisdom.healthgrid.org

Page 6: Grid07 6 Jacq

WISDOM collaboration

7 partners, 4 associated laboratories providing targets and/or in vitro facilities

• Univ. Los Andes: Bioinformatics, Malaria biology
• LPC Clermont-Ferrand: Biomedical grid, Web service
• SCAI Fraunhofer: Knowledge extraction, Chemoinformatics
• Univ. Modena: Malaria biology, Molecular dynamics
• ITB CNR: Bioinformatics, Molecular modelling
• Univ. Pretoria: Bioinformatics, Malaria biology
• Academia Sinica: Grid user interface, Avian flu biology, In vitro testing
• HealthGrid: Biomedical grid, Dissemination
• CEA, Acamba project: Malaria biology, Chemogenomics
• Chonnam Nat. Univ.: In vitro testing
• Mahidol Univ. Bangkok: In vitro testing

Diagram legend: Partners, Associated labs, New

Page 7: Grid07 6 Jacq

Benefits from using the grid (1/2)

• World-wide distribution of malaria resistance

• 1975-2004: only 21 of the 1,556 new drugs marketed were for tropical diseases (Chirac P., Torreele E., Lancet, May 2006)

• Neglected diseases still suffer from a lack of R&D

• Grids allow reduced costs

Page 8: Grid07 6 Jacq

Benefits from using the grid (2/2)

• H5N1 virus has the potential to cause a large-scale pandemic

• H5N1 may mutate and acquire drug resistance

• Time is a critical factor for handling emerging diseases

• Grids provide an accelerating factor

Figure: deaths from all causes each week, expressed as an annual rate per 1,000 (x-axis: months). Source: Ross E.G. Upshur, BA (Hons), MA, MD, MSc, CCFP, FRCPC

Page 9: Grid07 6 Jacq

In silico drug discovery

• Problem: development of a drug takes 12 to 15 years and costs approximately 800 million dollars

Target discovery: Target identification → Target validation

Lead discovery: Lead identification → Lead optimization

Clinical phases (I-III)

Page 10: Grid07 6 Jacq

Grid impact on drug discovery workflow down to drug delivery (1/2)

• Grids provide the necessary tools and data to identify new biological targets
– Bioinformatics services (database replication, workflow…)
– Resources for CPU intensive tasks (genomics comparative analysis, inverse docking…)

• Grids provide the resources to speed up lead discovery
– Large scale in silico docking to identify potentially promising compounds
– Molecular dynamics computations to refine virtual screening and further assess selected compounds

• Grid offers very interesting perspectives to enable collaboration between public and private partners
– Platform for information and knowledge sharing

Page 11: Grid07 6 Jacq

Grid impact on drug discovery workflow down to drug delivery (2/2)

• Grids provide environments for epidemiology
– Federation of databases to collect data in endemic areas to study a disease and to evaluate the impact of vaccine and vector control measures
– Resources for data analysis and mathematical modelling

• Grids provide the services needed for clinical trials
– Federation of databases to collect data in the centres participating in the clinical trials

• Grids provide the tools to monitor drug delivery
– Federation of databases to monitor drug delivery

Page 12: Grid07 6 Jacq

Content

• WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases

• Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives

Page 13: Grid07 6 Jacq

Virtual screening by docking

Docking: predict how small molecules bind to a receptor of known 3D structure

Compound database + Target structure model → Docking → Predicted binding models → Post-analysis → Compounds for assay

Page 14: Grid07 6 Jacq

Grid-enabled high throughput virtual screening by docking

Millions of potential drugs to test against interesting proteins!

• High Throughput Screening: $1-10 per compound, several hours. Too costly for neglected disease!

• Molecular docking (FlexX, Autodock): ~1 to 15 minutes. Cheap and fast!

• Data challenge on EGEE: ~2 to 30 days on ~5,000 computers

• Targets: PDB (3D structures)

• Compounds: ZINC (4.3M), Chembridge (500,000)

• Downstream pipeline: Selection of the best hits → Hits screening using assays performed on living cells → Leads → Clinical testing → Drug
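The orders of magnitude on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the ZINC library size, the docking time range and the approximate CPU count from the slide, and assumes (this is not stated in the talk) a single target, a single parameter set and perfect scaling across CPUs. The function name `docking_cost` is hypothetical.

```python
def docking_cost(n_compounds, minutes_per_docking, n_cpus):
    """Return (cpu_years, wall_clock_days) for docking every compound once."""
    cpu_minutes = n_compounds * minutes_per_docking
    cpu_years = cpu_minutes / (60 * 24 * 365)
    # Assumption: the work spreads evenly over all CPUs with no overhead.
    wall_clock_days = cpu_minutes / n_cpus / (60 * 24)
    return cpu_years, wall_clock_days

ZINC_COMPOUNDS = 4_300_000  # library size quoted on the slide
GRID_CPUS = 5_000           # approximate CPU count quoted on the slide

fast = docking_cost(ZINC_COMPOUNDS, 1, GRID_CPUS)    # 1 minute per docking
slow = docking_cost(ZINC_COMPOUNDS, 15, GRID_CPUS)   # 15 minutes per docking

print(f"1 min/docking:  {fast[0]:.1f} CPU years, {fast[1]:.1f} days on the grid")
print(f"15 min/docking: {slow[0]:.1f} CPU years, {slow[1]:.1f} days on the grid")
```

Screening several targets or parameter sets multiplies these figures, which is one plausible reading of the 2 to 30 day range quoted above.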

Page 15: Grid07 6 Jacq

Statistics of deployment

• First Data Challenge: July 1st - August 15th 2005
– Target: malaria
– 80 CPU years, 1 TB of data produced, 1,700 CPUs used in parallel
– 1st large scale docking deployment world-wide on an e-infrastructure

• Second Data Challenge: April 15th - June 30th 2006
– Target: avian flu
– 100 CPU years, 800 GB of data produced, 1,700 CPUs used in parallel
– Collaboration initiated on March 1st: deployment preparation achieved in 45 days

• Third Data Challenge: October 1st - December 15th 2006
– Target: malaria
– 400 CPU years, 1.6 TB of data produced, up to 5,000 CPUs used in parallel
– Very high docking throughput: > 100,000 compounds per hour

Page 16: Grid07 6 Jacq

A huge international effort for the third data challenge

Pie chart: share of the computing contribution per grid federation. Slices: 1%, 2%, 2%, 3%, 3%, 3%, 3%, 5%, 6%, 7%, 12%, 15%, 38%. Legend: EGEE Germany Switzerland, EGEE Asia Pacific, EGEE Russia, Auvergrid, EuChinaGrid, EELA, EGEE South Western Europe, EGEE Central Europe, EGEE Northern Europe, EGEE Italy, EGEE South Eastern Europe, EGEE France, EGEE UKI

Over 420 CPU years in 10 weeks

A record throughput of 100,000 docked compounds per hour

WISDOM calculations used FlexX from BioSolveIT (6k free, floating licenses)
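Two derived figures help put these numbers in perspective: the average number of CPUs kept busy over the campaign, and the CPU time spent per compound at peak throughput. The sketch below assumes exactly 10 weeks of 7 days and the peak of 5,000 CPUs quoted for the third data challenge; the helper names are hypothetical.

```python
def average_cpus(cpu_years, weeks):
    """Average number of CPUs kept busy to deliver cpu_years within `weeks`."""
    return cpu_years * 365 / (weeks * 7)

def minutes_per_compound(cpus, compounds_per_hour):
    """CPU minutes spent per docked compound at a given throughput."""
    return cpus * 60 / compounds_per_hour

avg = average_cpus(420, 10)                        # 420 CPU years in 10 weeks
peak_cost = minutes_per_compound(5_000, 100_000)   # peak CPUs, record throughput

print(f"Average CPUs busy: {avg:.0f}")
print(f"CPU minutes per compound at peak: {peak_cost:.1f}")
```

Under these assumptions the per-compound cost falls inside the 1 to 15 minute docking range quoted earlier in the talk.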

Page 17: Grid07 6 Jacq

Biological objectives

• Malaria
– Plasmepsin
– DHFR Plasmodium falciparum
– DHFR Plasmodium vivax
– GST
– Tubulin

• Avian influenza
– Neuraminidase N1

Figure labels: N1, H5. Credit: Y-T Wu (ASGC)

Page 18: Grid07 6 Jacq

Results from avian flu data challenge (1/2)

• 5 out of 6 known effective inhibitors can be identified in the first 15% of the ranking and in the first 5% reranked (2,250 compounds)
– Enrichment: (5/6)/(15% × 5%) = 111 (<1 in most cases)

• Most known effective inhibitors lose their affinity in binding with a mutated target

Figure: ranking position of GNA (zanamivir) against the original type and the E119A mutated type, with a 15% cut-off. Labelled values: GNA 2.4%, GNA 11.5%.
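The enrichment figure on this slide is the fraction of known actives recovered divided by the fraction of the library that was kept. A minimal sketch reproducing the calculation follows; the function name is hypothetical and the inputs are the values quoted on the slide.

```python
def enrichment_factor(actives_found, actives_total, fraction_selected):
    """Recovered fraction of known actives divided by the screened fraction."""
    return (actives_found / actives_total) / fraction_selected

# First 15% of the ranking, then the first 5% after reranking.
selected = 0.15 * 0.05
ef = enrichment_factor(5, 6, selected)

print(f"Fraction of library kept: {selected:.4f}")
print(f"Enrichment factor: {ef:.0f}")
```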

Page 19: Grid07 6 Jacq

Results from avian flu data challenge (2/2)

• Experimental assay confirms 7 actives out of 123 purchased “potential hits” (interacting complexes with higher affinities and proper docked poses) = 6%

• Average success rate of in vitro testing = 0.1%

• To be confirmed on more hits; tests are running at Univ. of Chonnam (South Korea)

Figure label: NA
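The two success rates quoted on this slide can be compared directly. The sketch below only restates the slide's numbers; the ratio it prints is a derived figure, not one given in the talk.

```python
confirmed, tested = 7, 123
hit_rate = confirmed / tested   # fraction of purchased "potential hits" confirmed
baseline = 0.001                # 0.1% average success rate quoted on the slide
improvement = hit_rate / baseline

print(f"Hit rate after virtual screening: {hit_rate:.1%}")
print(f"Ratio to the quoted baseline: {improvement:.0f}x")
```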

Page 20: Grid07 6 Jacq

Results from first malaria data challenge

1,000,000 chemical compounds
→ sorting based on scoring in different parameter sets; consensus scoring
10,000 compounds selected
→ based on key interactions, binding modes, etc.
1,000 compounds
→ MD
100 compounds will be tested in July by Univ. of Chonnam (South Korea)

Credit: V. Kasam, Fraunhofer Institute
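The first step of this funnel combines scores from several parameter sets. The slide does not say which consensus scheme was used; the sketch below shows one common variant, rank-sum consensus, as an assumption. Compound identifiers and scores are invented for illustration, and lower scores are assumed to be better.

```python
from collections import defaultdict

def consensus_rank(score_tables, keep):
    """Select `keep` compounds by summed rank across several scoring runs.

    score_tables: list of {compound_id: score} dicts, lower score = better.
    Every compound is assumed to appear in every table.
    """
    rank_sum = defaultdict(int)
    for table in score_tables:
        ordered = sorted(table, key=table.get)      # best score first
        for rank, compound in enumerate(ordered):
            rank_sum[compound] += rank
    # Ties are broken by compound id so the result is deterministic.
    return sorted(rank_sum, key=lambda c: (rank_sum[c], c))[:keep]

# Hypothetical scores from two parameter sets.
run_a = {"c1": -30.1, "c2": -25.4, "c3": -12.0, "c4": -28.7}
run_b = {"c1": -27.9, "c2": -29.3, "c3": -10.5, "c4": -26.0}

selected = consensus_rank([run_a, run_b], keep=2)
print(selected)
```

A compound that scores well under only one parameter set is pushed down by its poor rank in the others, which is the point of combining runs before the more expensive filtering steps.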

Page 21: Grid07 6 Jacq

Content

• WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases

• Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives

Page 22: Grid07 6 Jacq

Requirements for a deployment on grid

• Adaptation of the application to the grid

• Access to a large infrastructure providing maintained resources

• Use of a production system providing automated and fault-tolerant job and file management

Page 23: Grid07 6 Jacq

Adaptation of the application to the grid

• The application codes cannot be modified and are not designed for grid computing.

• A common strategy is to split the application into shorter tasks

• License management for commercial software is not adapted to large infrastructures

Diagram labels: Docking software, DB, DB subset, Input data, Parameters, Data, Output

Embarrassingly parallel application
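The splitting strategy can be sketched as follows: the compound database is cut into subsets, each subset is handed to an independent job running the unmodified docking software, and the outputs are merged afterwards. This is a local illustration only. A thread pool stands in for the grid, `dock_subset` is a placeholder returning dummy scores, and the compound names are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def split(compounds, subset_size):
    """Cut the compound list into consecutive subsets of at most subset_size."""
    return [compounds[i:i + subset_size]
            for i in range(0, len(compounds), subset_size)]

def dock_subset(subset):
    """Placeholder for one job; a real job would run the docking software."""
    return {compound: float(len(compound)) for compound in subset}  # dummy score

compounds = [f"cmpd{i:04d}" for i in range(10)]   # hypothetical compound ids
subsets = split(compounds, subset_size=4)

# Jobs share no state, so they can run in any order and on any worker.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_outputs = list(pool.map(dock_subset, subsets))

merged = {}
for output in partial_outputs:
    merged.update(output)

print(len(subsets), "jobs,", len(merged), "compounds docked")
```

Because no job depends on another, the subset size can be tuned freely to trade job duration against the number of jobs to manage.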

Page 24: Grid07 6 Jacq

Grid Added Value

• Large number of CPUs available

• Reliable and secured Data Management Services
– Sharing of results
– Replication of the data
– ACLs

• Availability of the resources

Real Time Monitor (Imperial College London): http://gridportal.hep.ph.ic.ac.uk/rtm/

Page 25: Grid07 6 Jacq

Grid infrastructures and projects contributing to the data challenges

Legend: European grid infrastructure; European grid project; Regional/national grid infrastructure

Labels: EELA, EUMedGrid, EUChinaGrid, Auvergrid, EGEE, TWGrid, EMBRACE, BioinfoGrid, SHARE

Page 26: Grid07 6 Jacq

WISDOM production environment

Credit: CNRS-IN2P3

Page 27: Grid07 6 Jacq

GUI designed by biologists

Target selection

Compound selection

Docking parameter setter

Energy table

Complex visualization

Credit: H-C Lee (ASGC)

Page 28: Grid07 6 Jacq

Content

• WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases

• Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives

Page 29: Grid07 6 Jacq

Conclusion

• WISDOM proposes a new approach to drug discovery thanks to the grid
– Rapid deployment of large scale virtual screening
– Collaborative environment for the sharing of data in the research community

• First biochemical results demonstrate grid relevance to the drug discovery community

Page 30: Grid07 6 Jacq

Perspectives

• Summer 2007
– 2nd data challenge against avian flu
– In vitro tests of the best molecules from the data challenges

• Winter 2007
– Discussion with WHO and Novartis: targets provided by the Drug Target Portfolio Network from the Tropical Disease Research initiative
– Discussion with Africa@home initiative: WISDOM deployment on a desktop grid

Page 31: Grid07 6 Jacq

Thank you

• To all members of the WISDOM collaboration for their contribution to the project (CNRS-IN2P3, ASGC, ITB-CNR, SCAI Fraunhofer, Univ of Modena…)

• To all grid nodes which committed resources and allowed the success of the initiative

• To all projects which supported the initiative by providing either computing resources or manpower to develop the WISDOM environment (EGEE, BioinfoGRID, Embrace, SHARE…)

• To BioSolveIT for offering up to 6,000 free licenses of FlexX