CINF 2012 talk: Recrystallization App

Post on 10-May-2015

DESCRIPTION

Jean-Claude Bradley presents on a recrystallization app based on Open Data feeds and models.

TRANSCRIPT

The deployment of an app from Open Data feeds and algorithms: Recommending recrystallization solvents

Jean-Claude Bradley

December 13, 2012

ACS-CINF Symposium

Associate Professor of Chemistry, Drexel University

The importance of recrystallization

• Generally preferred if there is a known solvent that gives a good yield

• Scales much more easily and cheaply than chromatography

• However, for new compounds much trial and error may be needed

The Recrystallization App

(Andrew Lang)

What are good solvents to recrystallize benzoic acid?

(Andrew Lang)

Click on the solvent to see the temperature curve

(Andrew Lang)

Deliver melting point data via App

(Andrew Lang)

How does it work?

1. Look up the solvent boiling point

2. Look up the room temperature solubility or predict it via Abraham descriptors predicted from a model using the CDK

3. Look up the solute melting point or predict it via a model using the CDK

4. Use the melting point and the solubility at room temperature to predict the solubility at boiling

5. Calculate the predicted recrystallization yield
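
Steps 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code. It assumes that ln(x) is linear in 1/T and that the solute is fully miscible with the solvent (mole fraction x = 1) at its melting point. The function names, the mole-fraction units and the default room temperature are all hypothetical choices made for this sketch. The look-ups in steps 1 to 3 are represented simply by the function arguments.

```python
import math


def solubility_at(temp_k, x_room, t_room_k, t_melt_k):
    """Estimate mole-fraction solubility at temp_k (kelvin).

    Assumption (not taken from the talk): ln(x) is linear in 1/T and
    passes through (t_room_k, x_room) and (t_melt_k, 1.0).
    """
    if not 0.0 < x_room < 1.0:
        raise ValueError("x_room must be a mole fraction between 0 and 1")
    if t_melt_k <= t_room_k:
        raise ValueError("solute must be a solid at room temperature")
    if temp_k >= t_melt_k:
        return 1.0  # at or above the melting point: treated as miscible
    slope = math.log(x_room) / (1.0 / t_room_k - 1.0 / t_melt_k)
    return math.exp(slope * (1.0 / temp_k - 1.0 / t_melt_k))


def recrystallization_yield(x_room, x_boil):
    """Fraction of solute recovered when a solution saturated at the
    boiling point is cooled to room temperature (fixed amount of solvent)."""
    if x_boil >= 1.0:
        return 1.0  # limiting case: unlimited solubility when hot
    # Convert mole fractions to moles of solute per mole of solvent.
    per_solvent_room = x_room / (1.0 - x_room)
    per_solvent_boil = x_boil / (1.0 - x_boil)
    return max(0.0, 1.0 - per_solvent_room / per_solvent_boil)


def predict_yield(x_room, t_melt_k, t_boil_k, t_room_k=298.15):
    """Steps 4 and 5 combined: solubility at boiling, then yield."""
    x_boil = solubility_at(t_boil_k, x_room, t_room_k, t_melt_k)
    return recrystallization_yield(x_room, x_boil)


# Hypothetical inputs for illustration only (not measured values).
example_yield = predict_yield(x_room=0.02, t_melt_k=400.0, t_boil_k=350.0)
```

Under these assumptions a solvent scores well when the solute is sparingly soluble at room temperature and the solvent boils not far below the solute's melting point. A solvent that boils above the melting point is capped at a yield of 1 here, although in practice the solute may oil out instead of crystallizing.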

Openness in Chemistry

WHY?

The Recrystallization App produces and uses Open Data:

• Open Solubility Collection and Models

• Open Melting Point Collection and Models

• Modeling depends mainly on CDK (Open Source Software with Open Descriptors)

• Open Notebook Science

Open Data Collections are essential for this strategy

[Diagram: Open Data sets linked by transparent transformation]

Transparent chain of provenance

Open Melting Point Datasets

Currently 20,000 compounds with Open MPs

What is the melting point of 4-benzyltoluene?

• American Petroleum Institute: 5 °C

• PHYSPROP: -30 °C

• PHYSPROP: 125 °C

• Peer-reviewed journal (2008): 97.5 °C

• Government database: -30 °C

• Government database: 4.58 °C

Motivation: Faster Science, Better Science

The quest to resolve the melting point of 4-benzyltoluene: liquid at room temperature and can be frozen below -30 °C

Open Lab Notebook page measuring the melting point of 4-benzyltoluene

Ruling out all melting points above -15 °C?

Oops – 4-benzyltoluene freezes after 16 days at -15 °C!

Measuring the melting point by slowly heating from -15 °C gives 5 °C

There are NO FACTS, only measurements embedded within assumptions

Open Notebook Science maintains the integrity of data provenance by making assumptions explicit

Open Random Forest modeling of Open Melting Point data using CDK descriptors

(Andrew Lang)

R² = 0.78; TPSA and nHdon are the most important descriptors

Melting point prediction service

Web services for summary data

(Andrew Lang)

Using a Google Spreadsheet as a “dashboard interface” for reaction planning and analysis

Calling Google Apps Scripts

(Andrew Lang and Rich Apodaca)

Never having to leave the Google Spreadsheet dashboard for access to key info

(Andrew Lang and Rich Apodaca)

A click away from an interactive NMR display (using JCAMP-DX format and ChemDoodle)

(Andrew Lang)

Google Apps Scripts for conveniently exploring melting point data

Straight chain carboxylic acids from 1 to 10 carbons

Straight chain alcohols from 1 to 10 carbons

Comparison of model with triple validated measurements

Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)

Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)

Dibenzalacetone derivatives docking against tubulin (paclitaxel site)

(Andrew Lang)

“Simple” aldol condensation synthesis

Top hit (no reports of synthesis)

In top ten (a few reports of synthesis)

(Andrew Lang)

Information from the literature on the target synthesis

Searching for aldol condensations of acetone in the Reaction Attempts database (about 90% of reactions in Open Notebooks are “not successful”)

(Andrew Lang)

An example of a “failed experiment” in an Open Notebook with useful information

A failed experiment reveals the importance of aldehyde solubility

An example of a successful experiment in an Open Notebook

A successful synthesis achieved by avoiding water, dramatically increasing the amount of NaOH and using a long reaction time

Chemical Information Retrieval 2012 property assignment

Melting Point Outlier List

Melting Point Outlier example
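
One simple way such an outlier list could be assembled is sketched below. This is an assumption for illustration, not the method used in the talk. The function name and the 5 °C spread threshold are hypothetical. A compound is set aside when only one source reports a value, or when the reported values disagree by more than the threshold. The sample data are the six values quoted earlier in the talk for 4-benzyltoluene.

```python
def triage_melting_points(values_c, max_spread_c=5.0):
    """Classify the reported melting points (in °C) for one compound.

    max_spread_c is a hypothetical tolerance, not a value from the talk.
    """
    if not values_c:
        return "no data"
    if len(values_c) == 1:
        return "single source"  # cannot be cross-checked
    spread = max(values_c) - min(values_c)
    return "discrepant" if spread > max_spread_c else "consistent"


# Reported values for 4-benzyltoluene as listed in the talk.
reported = [5.0, -30.0, 125.0, 97.5, -30.0, 4.58]
status = triage_melting_points(reported)
```

A range check like this only detects disagreement. Deciding which value is correct still requires going back to the sources or to a new measurement, as the 4-benzyltoluene case shows.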

Solubility Outlier List

Solubility of benzoic acid in 1-octanol discrepancies

Using ChemSpider to ensure all stereocenters are defined before searching for properties

Using the InChIKey to find single isomers
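
The idea can be illustrated with the layout of a standard InChIKey: a 14-letter block for the molecular skeleton, a 10-character block whose first 8 letters cover stereochemistry and isotopes, and a final protonation character. The sketch below is a hedged illustration, not the workflow shown in the talk, and the helper names are hypothetical. The example keys are the commonly listed ones for alanine and benzoic acid and should be verified against ChemSpider before being relied on.

```python
from collections import defaultdict

# Hash that appears in the second block of a standard InChIKey
# when no stereo or isotope layers are present.
NO_STEREO_BLOCK = "UHFFFAOY"


def split_inchikey(key):
    """Return (skeleton block, stereo/isotope hash, protonation flag)."""
    parts = key.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError("not a well-formed InChIKey: %r" % key)
    return parts[0], parts[1][:8], parts[2]


def group_by_skeleton(keys):
    """Group keys that share the same connectivity (first block)."""
    groups = defaultdict(list)
    for key in keys:
        skeleton, _, _ = split_inchikey(key)
        groups[skeleton].append(key)
    return dict(groups)


def lacks_stereo_layer(key):
    """True when the key carries no stereo or isotope information."""
    return split_inchikey(key)[1] == NO_STEREO_BLOCK


keys = [
    "QNAYBMKLOCPYGJ-REOHZSCNSA-N",  # L-alanine
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N",  # D-alanine
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",  # alanine, stereo not specified
    "WPYMKLBDIGXBTP-UHFFFAOYSA-N",  # benzoic acid (achiral)
]
groups = group_by_skeleton(keys)
```

Searching on the full key returns a single isomer, while searching on the first block returns every record with the same skeleton. The key alone cannot tell an achiral compound from one whose stereocenters were left undefined, which is why the structure is checked first.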

Chemical Information Validation Sheet 2012

Each entry validated with an image

Avoiding redundant property data points with a single click within the validation sheet

Open Chemical Property Matrix (OCPM)

logP

Abraham descriptors

Melting point

Aqueous solubility

Octanol solubility

Vapor pressure

Flash point

Boiling point

OCPM relationships

OCPM melting point sheet

Dibenzalacetone libraries are promising for connecting the OCPM with useful applications

Conclusions

More openness in chemistry can make science more efficient

Provide interfaces that make sense to the end users:

• Open Data, Open Models and Open Source Software to modelers

• Apps (smartphones, Google Apps Scripts, etc.) for chemists at the bench

Acknowledgements

Andrew Lang (code, modeling)

Bill Acree (modeling, solubility data contribution)

Antony Williams (ChemSpider services, mp data curation)

Matthew McBride and Rida Atif (recrystallization and synthesis)

Kayla Gogarty (OCPM)
