CINF 2012 talk: Recrystallization App
DESCRIPTION
Jean-Claude Bradley presents on a recrystallization app based on Open Data feeds and models.
TRANSCRIPT
The deployment of an app from Open Data feeds and algorithms: Recommending recrystallization solvents
Jean-Claude Bradley
December 13, 2012
ACS-CINF Symposium
Associate Professor of Chemistry, Drexel University
The importance of recrystallization
• Generally preferred if there is a known solvent that gives a good yield
• Scales much more easily and cheaply than chromatography
• However, for new compounds much trial and error may be needed
The Recrystallization App
(Andrew Lang)
What are good solvents to recrystallize benzoic acid?
(Andrew Lang)
Click on the solvent to see temp curve
(Andrew Lang)
Deliver melting point data via App
(Andrew Lang)
How does it work?
1. Look up the solvent boiling point
2. Look up the room-temperature solubility, or predict it from Abraham descriptors that are themselves predicted by a model using the CDK
3. Look up the solute melting point, or predict it via a model using the CDK
4. Use the melting point and the room-temperature solubility to predict the solubility at the solvent's boiling point
5. Calculate the predicted recrystallization yield
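The five steps above can be sketched in code. This is a minimal sketch under stated assumptions, not the app's actual implementation: the talk does not say how the solubility at boiling is extrapolated, so the example assumes ideal-solution (van't Hoff) behavior, with the enthalpy of fusion estimated from the melting point using Walden's rule (entropy of fusion of roughly 56.5 J/(mol K)). The yield is assumed to be the fraction of solute that crystallizes when a solution saturated at the boiling point cools to room temperature. All function names are hypothetical, and the benzoic acid and water inputs are approximate, illustrative values rather than data from the Open Data collections.

```python
import math

R = 8.314               # gas constant, J/(mol K)
WALDEN_ENTROPY = 56.5   # J/(mol K); Walden's rule estimate (assumption, not from the talk)


def solubility_at_boiling(s_room, melting_point_c, boiling_point_c, room_temp_c=25.0):
    """Extrapolate room-temperature solubility to the solvent boiling point.

    Assumes ideal-solution temperature dependence:
        ln(S_hot / S_room) = (dH_fus / R) * (1/T_room - 1/T_hot)
    with dH_fus estimated as WALDEN_ENTROPY * T_melt.
    """
    t_room = room_temp_c + 273.15
    t_melt = melting_point_c + 273.15
    # Above its melting point the solute is a liquid, so cap the
    # extrapolation temperature at the melting point.
    t_hot = min(boiling_point_c + 273.15, t_melt)
    dh_fus = WALDEN_ENTROPY * t_melt
    return s_room * math.exp(dh_fus / R * (1.0 / t_room - 1.0 / t_hot))


def recrystallization_yield(s_room, s_hot):
    """Fraction of solute recovered on cooling a solution saturated at s_hot."""
    if s_hot <= 0:
        raise ValueError("hot solubility must be positive")
    return max(0.0, 1.0 - s_room / s_hot)


# Illustrative inputs: benzoic acid (mp about 122 C) in water (bp 100 C),
# with an approximate room-temperature solubility of 3.4 g/L.
s_room = 3.4
s_hot = solubility_at_boiling(s_room, melting_point_c=122.0, boiling_point_c=100.0)
predicted_yield = recrystallization_yield(s_room, s_hot)
print(f"predicted solubility at boiling: {s_hot:.1f} g/L")
print(f"predicted yield: {predicted_yield:.0%}")
```

Ranking candidate solvents then amounts to repeating this calculation for each solvent and sorting by predicted yield. The ideal-solution assumption ignores how solute-solvent interactions change with temperature, so the numbers are rough guides at best.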
Openness in Chemistry
WHY?
The Recrystallization App produces and uses Open Data:
• Open Solubility Collection and Models
• Open Melting Point Collection and Models
• Modeling depends mainly on CDK (Open Source Software with Open Descriptors)
• Open Notebook Science
Open Data Collections are essential for this strategy
[Diagram: Open Data → transparent transformation → Open Data]
Transparent chain of provenance
Open Melting Point Datasets
Currently 20,000 compounds with Open MPs
What is the melting point of 4-benzyltoluene?
• American Petroleum Institute: 5 °C
• PHYSPROP: -30 °C
• PHYSPROP: 125 °C
• Peer-reviewed journal (2008): 97.5 °C
• Government database: -30 °C
• Government database: 4.58 °C
Motivation: Faster Science, Better Science
The quest to resolve the melting point of 4-benzyltoluene: liquid at room temperature and can be frozen below -30 °C
Open Lab Notebook page measuring the melting point of 4-benzyltoluene
Ruling out all melting points above -15 °C?
Oops – 4-benzyltoluene freezes after 16 days at -15 °C!
Measuring the melting point by slowly heating from -15 °C gives 5 °C
There are NO FACTS, only measurements embedded within assumptions
Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
Open Random Forest modeling of Open Melting Point data using CDK descriptors
(Andrew Lang)
R² = 0.78; TPSA and nHdon are the most important descriptors
Melting point prediction service
Web services for summary data
(Andrew Lang)
Using a Google Spreadsheet as a “dashboard interface” for reaction planning and analysis
Calling Google Apps Scripts
(Andrew Lang and Rich Apodaca)
Never having to leave the Google Spreadsheet dashboard for access to key info
(Andrew Lang and Rich Apodaca)
A click away from an interactive NMR display (using JCAMP-DX format and ChemDoodle)
(Andrew Lang)
Google Apps Scripts for conveniently exploring melting point data
Straight chain carboxylic acids from 1 to 10 carbons
Straight chain alcohols from 1 to 10 carbons
Comparison of model with triple validated measurements
Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)
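The validation logic implied here (single-source values flagged, multiple agreeing sources trusted) can be illustrated with a short sketch. This is an assumption about how such a check might look, not the project's actual code: the function name, the status labels and the 5 °C agreement threshold are all hypothetical. The example values are the six reported melting points for 4-benzyltoluene listed earlier in the talk.

```python
def review_melting_points(values_c, max_spread_c=5.0):
    """Classify a set of reported melting points (in degrees C) for one compound.

    max_spread_c is a hypothetical tolerance for agreement between sources.
    """
    if not values_c:
        return "no data"
    if len(values_c) == 1:
        # Nothing to cross-check against, so flag for validation.
        return "single source"
    spread = max(values_c) - min(values_c)
    if spread > max_spread_c:
        return "conflicting"
    return "consistent"


# Reported values for 4-benzyltoluene from the earlier slide.
benzyltoluene = [5.0, -30.0, 125.0, 97.5, -30.0, 4.58]
print(review_melting_points(benzyltoluene))
print(review_melting_points([5.0, 4.58]))
print(review_melting_points([5.0]))
```

A spread-based check is deliberately crude: it catches gross disagreements like the 4-benzyltoluene case but says nothing about which source is right, which still requires going back to the measurement.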
Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
Dibenzalacetone derivatives docking against tubulin (paclitaxel site)
(Andrew Lang)
“Simple” aldol condensation synthesis
Top hit (no reports of synthesis)
In top ten (a few reports of synthesis)
(Andrew Lang)
Information from the literature on the target synthesis
Searching for aldol condensations of acetone in the Reaction Attempts database (about 90% of reactions in Open Notebooks are “not successful”)
(Andrew Lang)
An example of a “failed experiment” in an Open Notebook with useful information
A failed experiment reveals the importance of aldehyde solubility
An example of a successful experiment in an Open Notebook
A successful synthesis achieved by avoiding water, dramatically increasing the NaOH, and using a long reaction time
Chemical Information Retrieval 2012 property assignment
Melting Point Outlier List
Melting Point Outlier example
Solubility Outlier List
Solubility of benzoic acid in 1-octanol discrepancies
Using ChemSpider to ensure all stereocenters are defined before searching for properties
Using the InChIKey to find single isomers
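One way to see why the InChIKey helps here: a standard InChIKey has three hyphen-separated blocks (14, 10 and 1 characters). The first block hashes the molecular skeleton, and the second block hashes the remaining layers, including stereochemistry. A second block of `UHFFFAOYSA` means no stereo or isotopic layers are present. The sketch below, with hypothetical function names, groups keys by skeleton and checks whether a key carries those extra layers. It is an illustration of the key format, not the procedure used in the assignment. The three example keys are for alanine (undefined stereo, L and D).

```python
from collections import defaultdict

# Second block of a standard InChIKey with no stereo or isotopic layers.
EMPTY_LAYERS_BLOCK = "UHFFFAOYSA"


def split_inchikey(key):
    """Split an InChIKey into its three blocks, validating block lengths."""
    parts = key.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a valid InChIKey: {key!r}")
    return tuple(parts)


def group_by_skeleton(keys):
    """Group InChIKeys that share the same first (connectivity) block."""
    groups = defaultdict(list)
    for key in keys:
        skeleton, _, _ = split_inchikey(key)
        groups[skeleton].append(key)
    return dict(groups)


def lacks_stereo_layers(key):
    """True if the key encodes no stereo (or isotopic) information."""
    _, second, _ = split_inchikey(key)
    return second == EMPTY_LAYERS_BLOCK


alanine_keys = [
    "QNAYBMKLOCPYGJ-UHFFFAOYSA-N",  # alanine, stereo undefined
    "QNAYBMKLOCPYGJ-REOHZSMBSA-N",  # L-alanine
    "QNAYBMKLOCPYGJ-UWTATZPHSA-N",  # D-alanine
]
for skeleton, members in group_by_skeleton(alanine_keys).items():
    for key in members:
        status = "undefined stereo" if lacks_stereo_layers(key) else "specific isomer"
        print(skeleton, key, status)
```

Searching by the full key retrieves exactly one isomer, while matching on the first block alone retrieves every stereoisomer of the same skeleton.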
Chemical Information Validation Sheet 2012
Each entry validated with an image
Avoiding redundant property data points with a single click within the validation sheet
Open Chemical Property Matrix (OCPM)
logP
Abraham descriptors
Melting point
Aqueous solubility
Octanol solubility
Vapor pressure
Flash point
Boiling point
OCPM relationships
OCPM melting point sheet
Dibenzalacetone libraries are promising for connecting the OCPM with useful applications
Conclusions
More openness in chemistry can make science more efficient
Provide interfaces that make sense to the end users:
• Open Data, Open Models and Open Source Software to modelers
• Apps (smartphones, Google Apps Scripts, etc.) for chemists at the bench
Acknowledgements
Andrew Lang (code, modeling)
Bill Acree (modeling, solubility data contribution)
Antony Williams (ChemSpider services, mp data curation)
Matthew McBride and Rida Atif (recrystallization and synthesis)
Kayla Gogarty (OCPM)