e-Science Technologies in the Simulation of Complex Materials L. Blanshard, R. Tyer, K. Kleese S. A. French, D. S. Coombes, C. R. A. Catlow B. Butchart, W. Emmerich – CS H. Nowell, S. L. Price – Chem eMaterials

Upload: caleb-hahn

Post on 28-Mar-2015


e-Science Technologies in the Simulation of Complex Materials

L. Blanshard, R. Tyer, K. Kleese

S. A. French, D. S. Coombes, C. R. A. Catlow

B. Butchart, W. Emmerich – CS

H. Nowell, S. L. Price – Chem

eMaterials


Combinatorial Computational Catalysis

Polymorphism

Prediction of polymorphs: a drug substance may exist as two or more crystalline phases in which the molecules are packed differently.

Acid Sites in Zeolites

Explore which sites are involved in catalysis; used in diverse industries including petroleum, chemical, polymers, agrochemicals, and environmental.


Polymorph Prediction

Different crystal structures of a molecule are called polymorphs.

Polymorphs may have considerably different properties (e.g. bioavailability, solubility, morphology).

Polymorph prediction is of great importance to the pharmaceutical industry, where the discovery of a new polymorph during production or storage of a drug may be disastrous.

Drug molecules are often flexible and this makes the polymorph prediction process more challenging…


Polymorph Prediction Workflow

• For flexible molecules: conformational optimisation, giving n feasible rigid molecular probes representing energetically plausible conformers (n = number of conformers)

• MOLPAK: generation of ~6000 densely packed crystal structures using a rigid molecular probe (repeated n times)

• DMAREL: lattice energy optimisation

• Data: unit cell volume, density, lattice energy

• Restricted number of structures selected

• Morphology

• Crystal structures and properties stored in database
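The workflow above can be sketched as a loop over conformers. This is a minimal illustration only, not the project's code: `generate_packings` and `minimise_lattice_energy` are hypothetical stand-ins for MOLPAK and DMAREL (which are Fortran programs with their own input and output files), and all volumes and energies are random placeholder numbers.

```python
import random
from dataclasses import dataclass


@dataclass
class Structure:
    conformer: str
    cell_volume: float      # Å^3 per molecule (placeholder values)
    lattice_energy: float   # kJ/mol (placeholder values)


def generate_packings(conformer, n_structures, rng):
    """Hypothetical stand-in for MOLPAK: densely packed trial structures."""
    return [Structure(conformer, rng.uniform(250.0, 410.0), 0.0)
            for _ in range(n_structures)]


def minimise_lattice_energy(structure, rng):
    """Hypothetical stand-in for DMAREL: returns a 'minimised' structure."""
    return Structure(structure.conformer, structure.cell_volume,
                     rng.uniform(-130.0, -30.0))


def predict_polymorphs(conformers, n_structures=6000, keep=50, seed=0):
    """Search and minimise once per conformer, then keep the lowest energies."""
    rng = random.Random(seed)
    minimised = []
    for conformer in conformers:  # repeated n times, n = number of conformers
        for trial in generate_packings(conformer, n_structures, rng):
            minimised.append(minimise_lattice_energy(trial, rng))
    minimised.sort(key=lambda s: s.lattice_energy)
    return minimised[:keep]  # restricted set that would go to the database


selected = predict_polymorphs(["a", "b", "c"], n_structures=100, keep=10)
```

Because each conformer's search is independent of the others, the outer loop is the natural place to distribute work across machines.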


Blind Test 2004

The Challenge:

Predict the crystal structure of 2-methyl-4,5-dinitro-phenyl-acetamide

Wide range of conformers within plausible energy range

8 conformers chosen and used in subsequent searches

Flexibility indicated with arrows

[Figure: potential energy surface scan about the CCNC torsion angle. x-axis: CCNC torsion angle / °; y-axis: energy difference / kJ mol⁻¹.]


Blind Test 2004

Minima in the Lattice Energy for Different Conformations

[Figure: lattice energy + intramolecular energy (kJ mol⁻¹) against volume / Z (Å³ molecule⁻¹) for the minima found with each conformer. Conformer legend: a, b, c, 10, 20, -10, -20, -5.]


Blind Test 2004

Minima in the Lattice Energy for Different Conformations

[Figure: the best 10 kJ mol⁻¹ of the previous plot. Lattice energy + intramolecular energy (kJ mol⁻¹) against volume / Z (Å³ molecule⁻¹). Conformer legend: a, b, c, 10, 20, -10, -20, -5.]

Necessary to consider properties of best crystal structures, such as growth rates, to decide which are more likely to be observed


Results

The observed crystal structure (revealed upon completion of the blind test) contains a higher energy conformer than those considered!

[Figure panels: Observed, Predicted]

When just the observed conformer is used as the rigid probe in the search, the observed structure is found as the global minimum in lattice energy.


Summary

High energy gas phase conformers may be stabilised by packing within a lattice in the solid state

As many conformers as possible need to be considered to maximise the chance of predicting crystal structures correctly and exploring the range of structures that are energetically feasible as polymorphs

A fast, distributed e-Science application is being developed to enable routine crystal structure prediction for large numbers of conformers. This is essential for developing computational methods of predicting possible polymorphs of pharmaceutical molecules.


Predicting Morphologies

The shape, or morphology, of a crystal plays an important role in the manufacturing process as there are considerable problems if the morphology changes due to impurities or changes of solvent or when the process is scaled up for high volume manufacture.

An understanding of the factors influencing crystal morphology will help us to understand how the crystallisation process can be controlled through, for example, the use of solvents or additives.

• BFDH (Bravais-Friedel-Donnay-Harker) theory: based on geometrical factors

• AE (attachment energy) model: based on energetic factors
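As a rough illustration of the geometrical (BFDH) approach: the morphological importance of a face (hkl) is taken to scale with its interplanar spacing d_hkl, so its relative growth rate scales with 1/d_hkl. The sketch below assumes an orthorhombic cell for simplicity, ignores the space-group extinction conditions that a full BFDH calculation applies, and uses hypothetical cell lengths; it is not the GDIS implementation.

```python
import math


def d_spacing_orthorhombic(hkl, a, b, c):
    """Interplanar spacing d_hkl for an orthorhombic cell (lengths in Å)."""
    h, k, l = hkl
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)


def bfdh_ranking(faces, a, b, c):
    """Rank faces by BFDH morphological importance (largest d_hkl first)."""
    spacings = {hkl: d_spacing_orthorhombic(hkl, a, b, c) for hkl in faces}
    return sorted(spacings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical cell lengths, chosen only for illustration.
ranking = bfdh_ranking(
    [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)],
    a=7.0, b=9.0, c=12.0,
)
for hkl, d in ranking:
    print(hkl, round(d, 3), "relative growth rate ~", round(1.0 / d, 3))
```

The attachment energy model replaces 1/d_hkl with an energy computed per face, which is why it needs a separate calculation for each face rather than only the cell geometry.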


Scheme for Morphology Calculations

• Minimised structure: from DMAREL minimised structure

• Choose faces to study (~15-20): BFDH calculation in GDIS

• For each face calculate AE: calculate valid shifts, converge regions (exclude polar)

• Draw morphology for each crystal's set of faces: Wulff plot

• Calculate relative volume growth rates: new property


The calculated morphology can be visualised using a Wulff plot, in which the relative distances of the crystal faces from the centre of the crystal, measured along their surface normals, are determined by the interplanar spacings, the attachment energies or the surface energies.


Observed and predicted morphology of form 1 of paracetamol

Morphologies


Growth Volume

New property, ‘growth volume’: obtained by numerical integration of the volume within the Wulff shape, it gives an indication of whether one face dominates.
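One simple way to integrate the volume inside a Wulff shape numerically is to treat the shape as the intersection of half-spaces, one per face, and count grid points that satisfy all of them. The sketch below is an assumption about how such an integration could be done, not the method used in the project; the function name, the bounding-cube argument and the grid resolution are all hypothetical.

```python
def wulff_volume(planes, half_width, n=40):
    """Midpoint-grid estimate of the volume enclosed by a set of planes.

    planes: list of ((nx, ny, nz), distance) with unit normals; a point x is
    inside the shape when n . x <= distance for every plane.
    half_width: half the edge of a cube, centred on the origin, that must
    contain the whole shape.
    """
    step = 2.0 * half_width / n
    coords = [-half_width + (i + 0.5) * step for i in range(n)]
    inside = 0
    for x in coords:
        for y in coords:
            for z in coords:
                if all(nx * x + ny * y + nz * z <= d
                       for (nx, ny, nz), d in planes):
                    inside += 1
    return inside * step ** 3


# Six cube faces at distance 1 from the centre: the exact volume is 8.
cube = [((1, 0, 0), 1.0), ((-1, 0, 0), 1.0), ((0, 1, 0), 1.0),
        ((0, -1, 0), 1.0), ((0, 0, 1), 1.0), ((0, 0, -1), 1.0)]
volume = wulff_volume(cube, half_width=2.0)
```

With attachment energies used as the plane distances, a fast-growing face sits far from the centre, so the enclosed volume grows with the overall growth rates of the bounding faces.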

Pyridine

[Figure: relative growth volume and AE (kJ mol⁻¹ per molecule) for the predicted pyridine crystal structures, together with forms I and II, in order of decreasing stability.]

Form 1: Z′ = 4

Many low energy structures; new observed form 2 predicted to grow fast

Prompted experimental search for more polymorphs


e-Science Issues to Address

• simulations take too long to run

• data are distributed across many sites and systems

• no catalogue system

• output in legacy text files, different for each program

• few tools to access, manage and transfer data

• workflow management is manual

• licensing within distributed environment


1. Expose Fortran binary as distributed Web Service

[Diagram: Fortran binary with its Fortran input and output files, XML and XSL conversion layers, and a WSDL interface.]

WSDL (Web Services Description Language): define an XML interface to the computation.

To get the binary to “talk” in XML: either change the Fortran code so that input and output use XML, or use parsers and XSLT conversion documents to map fixed-format input/output files to and from XML.

Fortran Web Services
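The parser route can be illustrated with a small sketch that maps a fixed-format text output to XML. Everything specific here is an assumption made for illustration: the sample output, the 18-column label layout, and the element names `result`, `cellVolume`, `density` and `latticeEnergy` are hypothetical and do not reproduce any real program's format. The sketch uses a Python parser only; the XSLT half of the approach is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-format output: label in columns 0-17, value afterwards.
SAMPLE_OUTPUT = "\n".join([
    f"{'CELL VOLUME':<18}{271.40:>10.2f}",
    f"{'DENSITY':<18}{1.512:>10.3f}",
    f"{'LATTICE ENERGY':<18}{-121.75:>10.2f}",
])

# Hypothetical mapping from output labels to XML element names.
FIELDS = {
    "CELL VOLUME": "cellVolume",
    "DENSITY": "density",
    "LATTICE ENERGY": "latticeEnergy",
}


def fixed_format_to_xml(text):
    """Parse label/value lines by column position and return an XML string."""
    root = ET.Element("result")
    for line in text.splitlines():
        label, value = line[:18].strip(), line[18:].strip()
        if label in FIELDS:
            ET.SubElement(root, FIELDS[label]).text = str(float(value))
    return ET.tostring(root, encoding="unicode")


xml_text = fixed_format_to_xml(SAMPLE_OUTPUT)
print(xml_text)
```

Keeping the conversion outside the Fortran code means the binary itself does not need to be modified or recompiled.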


2. Orchestrate Web Services with workflow service

[Diagram: BPEL (Business Process Execution Language) scripts orchestrating WS-wrapped Fortran binaries.]

Workflow service is exposed to the outside world as a web service

Distributed Workflow


[Diagram: CH4 shown in several different program formats.]

Fortran programs use lots of different formats to represent the same thing.

Data Representation


CML <CH4…/>

Since we provide new WSDL interfaces for each application, we have a perfect opportunity to employ a standard representation for chemical structures. The XML standard in chemistry is CML (Chemical Markup Language).

Data Representation

Development of chemical markup language (CML) as a system for handling complex chemical content. P. Murray-Rust, New Journal of Chemistry, 2001, 25, 618-634.
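To give a feel for what a CML description of methane might look like, the sketch below builds one with Python's standard XML library. It is a hedged illustration: the element and attribute names (`molecule`, `atomArray`, `atom`, `bondArray`, `bond`, `elementType`, `atomRefs2`, `order`) and the namespace follow the published CML schema as commonly used, but they should be checked against the schema version actually deployed; the ids are arbitrary and no coordinates are included.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace


def methane_cml():
    """Build a minimal CML molecule element for CH4 (connectivity only)."""
    ET.register_namespace("", CML_NS)
    molecule = ET.Element(f"{{{CML_NS}}}molecule", {"id": "methane"})
    atoms = ET.SubElement(molecule, f"{{{CML_NS}}}atomArray")
    bonds = ET.SubElement(molecule, f"{{{CML_NS}}}bondArray")
    ET.SubElement(atoms, f"{{{CML_NS}}}atom", {"id": "a1", "elementType": "C"})
    for i in range(2, 6):
        # Four hydrogens, each singly bonded to the carbon atom a1.
        ET.SubElement(atoms, f"{{{CML_NS}}}atom",
                      {"id": f"a{i}", "elementType": "H"})
        ET.SubElement(bonds, f"{{{CML_NS}}}bond",
                      {"atomRefs2": f"a1 a{i}", "order": "1"})
    return ET.tostring(molecule, encoding="unicode")


cml_text = methane_cml()
print(cml_text)
```

A single representation like this can then be converted to each program's native input, instead of writing a converter for every pair of formats.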


[Diagram: (BPEL) workflow]

Integration with Existing Infrastructure

Prototype has been successfully deployed.


[Diagram: (BPEL) workflow and Sun Grid Engine]

Existing grid infrastructure does not integrate easily with web services.

Policy on compute clusters enforced by Sun Grid Engine batch system

Other users of clusters submit jobs via this control software

Building a WSDL binding over the Sun Grid Engine protocol is difficult

Smooth transition from existing infrastructure to WS riskier than thought.

Integration with Existing Infrastructure


Data Management at CCLRC

• file storage at CCLRC

• distributed file access via Storage Resource Broker (SDSC)

• catalogue of files using metadata in relational database

• web interface to metadata and files via Data Portal

• metadata editor through browser


Store data files from simulations in the Storage Resource Broker

Storage Resource Broker


Search for studies in material sciences and download associated data using the CCLRC Data Portal

Data Portal


Ongoing and Future Work

• upload files as part of workflow to SRB

• generate metadata

• upload extracted data from files


Acid Sites in Zeolites

• Determine the extra-framework cation position within the zeolite framework.

• Explore which proton sites are involved in catalysis and then characterise the active sites.

• Produce a database with structural models and associated vibrational modes for Si/Al ratios.

• Improve understanding of the role of the Si/Al ratio in zeolite chemistry.


A combined Monte Carlo (MC) and energy minimisation (EM) approach has been developed to model zeolitic materials with low and medium Si/Al ratios. First, Al is inserted into a siliceous unit cell, followed by a charge-compensating cation.

The zeolite mordenite, which has a one-dimensional channel system, has been studied with a simulation cell containing two unit cells, which means 296 atoms, with 96 Si centres (referred to as T sites).

MC/EM
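The sampling side of the MC/EM approach can be sketched as follows. This is only an outline under stated assumptions: the number of Al atoms per cell (`n_al=8`), the random fractional cation positions and the `minimised_energy` placeholder (which returns a random number instead of running an energy minimisation) are all hypothetical and are not taken from the study.

```python
import random


def random_configuration(n_t_sites, n_al, rng):
    """Pick T sites for Al substitution, with one compensating cation per Al."""
    al_sites = sorted(rng.sample(range(n_t_sites), n_al))
    # Hypothetical cation placement: a random fractional position in the cell.
    cations = [(rng.random(), rng.random(), rng.random()) for _ in al_sites]
    return al_sites, cations


def minimised_energy(al_sites, cations, rng):
    """Placeholder for the energy minimisation step (random value in eV)."""
    return rng.uniform(-12085.0, -12065.0)


def sample_configurations(n_configs, n_t_sites=96, n_al=8, seed=0):
    """Generate random configurations and sort them by 'minimised' energy."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_configs):
        al_sites, cations = random_configuration(n_t_sites, n_al, rng)
        results.append((minimised_energy(al_sites, cations, rng), al_sites))
    return sorted(results)  # lowest energy first


configs = sample_configurations(100)
```

Each configuration is independent of the others, so the minimisations can be farmed out as separate jobs.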


100 Configurations

[Figure: total energy (eV) and cell volume for 100 configurations, each with a 5-period moving average.]

It can be seen that there are two distinct regions, -12079 eV to -12076 eV and -12075 eV to -12073 eV, but there is no obvious correlation between total energy and cell volume.


10000 Configurations

[Figure: total energy (TE) and cell volume (VOL) for 10,000 configurations, each with a 200-period moving average.]

However, when 10,000 structures are considered it is clear that the most stable structures correspond to cation placements that do not cause the cell to expand. This requires that the cations sit in the large channel.
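The plots are smoothed with moving averages (5-period for 100 configurations, 200-period for 10,000). A minimal sketch of a trailing moving average is given below; the energies in the example are hypothetical values for illustration, not data from the study.

```python
def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical total energies in eV, for illustration only.
energies = [-12079.0, -12074.0, -12078.0, -12073.0, -12077.0, -12075.0]
smoothed = moving_average(energies, window=5)
```

Applying the same function to the cell volumes, with the same window, gives two smoothed series that can be compared directly.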


Comparison of Regions

[Figure: structures from the two regions, at -12079.5 eV and -12075.04 eV]


Once the lowest energy positions of Al have been confirmed, the cation is exchanged for a proton and the structure is energy minimised again.

This method will allow us to construct realistic models of low and medium Si/Al zeolites. Such structures can be used for further simulations and aid the interpretation of experimental data.

What Next


Extensive use of Condor pools (UCL: 950 nodes in teaching pools). 48 CPU-years of previously unused compute resource have been utilised in this study. Close collaboration with the NERC e-minerals project has allowed access to this resource.

50,000 calculations have been performed, each with 488 particles per simulation box, which means a total of about 24,000,000 particles (50,000 × 488 = 24,400,000) have been included in our simulations to date.

Condor


Achievements To Date

1. First use of CML schema for defining Web Service port types.

2. Calculation of 50,000 configurations of zeolite mordenite (about 24,000,000 particles) to gain insight into structure when a realistic ratio of Al substitution is included in the model.

3. Successfully exposed Fortran codes as OGSI Web Services; prototype application deployed on 80 nodes. The prototype computational polymorph application is being ported to a larger production machine.

4. First use of BPEL standard for orchestrating web services in a Grid application.

5. Open Source BPEL implementation in development, enabling late binding and dynamic deployment of large computational processes.

6. Integration of OGSI and BPEL with Sun Grid Engine.

7. Development of Graphic User Interface for polymorph application; connects to relational database via EJB interface.

8. Infrastructure for metadata and data management.

9. SRB and Data Portal are already being used to hold datasets and to transfer data between different scientists and computer applications.

10. Implementation of Condor pool at Ri.


We are now doing science that was not possible before the advancements made within e-Science.

Key Achievement