Navigating the Complex Web of Chemistry Using ChemSpider
DESCRIPTION
Free and open access resources for scientists are increasingly available on the internet. Coupled with the growing availability of Open Source software tools, this puts us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free access website built with the intention of providing a structure-centric community for chemists. As an aggregator of chemistry-related information from many sources, at present over 21.5 million unique chemical entities from over 200 separate data sources, ChemSpider has taken on the task of both robotically and manually curating publicly available data sources. This presentation will provide an overview of the ChemSpider platform and how it is fast becoming the centralized hub for resourcing information about chemical entities.
TRANSCRIPT
![Page 1: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/1.jpg)
Navigating the Complex Web of Chemistry Using ChemSpider
![Page 2: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/2.jpg)
Antony Williams vs Identifiers
Old Passport ID
Dad, Tony, others
SSN
Green Card
License
5 email addresses
ChemSpiderman (blog, Twitter account, Facebook, Friendfeed)
OpenID…
![Page 3: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/3.jpg)
Aspirin vs Chemical Identifiers
![Page 4: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/4.jpg)
Aspirin names and synonyms
• Text searches depend on correct association
• 335 suggested identifiers for Aspirin just on PubChem!
• Disambiguation dictionaries are necessary
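A disambiguation dictionary of the kind this slide calls for can be sketched as a lookup from normalized names to a single structure identifier. This is only an illustration, not how ChemSpider or PubChem implement it: the `normalize` and `resolve` helpers and the three-entry synonym table are assumptions made for the example, and the identifier used is the standard InChIKey for aspirin.

```python
import re
from typing import Optional

# Standard InChIKey for aspirin, used here as the single target identifier.
ASPIRIN_KEY = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

# Hypothetical, hand-built synonym table (a real one would hold hundreds of names).
SYNONYMS = {
    "aspirin": ASPIRIN_KEY,
    "acetylsalicylic acid": ASPIRIN_KEY,
    "2-acetoxybenzoic acid": ASPIRIN_KEY,
}


def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", name.strip().lower())


def resolve(name: str) -> Optional[str]:
    """Return the structure identifier for a name, or None if it is unknown."""
    return SYNONYMS.get(normalize(name))
```

With this table, `resolve("  Acetylsalicylic   Acid ")` and `resolve("Aspirin")` return the same key, while an unlisted name returns `None` instead of a guess. A text search is only as good as the associations in such a table.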
![Page 5: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/5.jpg)
Linked Data Cloud
![Page 6: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/6.jpg)
…the premium database producers are using some automatic tools to prepare a ‘first draft’ of a database record, to be refined by eye.
Coupled with the public internet as a distribution method of choice, it is becoming possible for the first time to create and distribute new structure based databases at much lower costs, or even free of charge.
![Page 7: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/7.jpg)
![Page 8: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/8.jpg)
![Page 9: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/9.jpg)
The Final Search Strategy
![Page 10: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/10.jpg)
All Those Names, One Structure
![Page 11: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/11.jpg)
Content is King and Quality Costs
Chemistry “content” is big business. Not everyone can afford it.
• Patent searching
• Structures and properties
• Drug databases
• Literature databases
Chemical Abstracts Service (CAS), the “Gold Standard” in chemistry-related information
• 101 years of content
• $260 million revenue (2006)
• >50 million substances
• Proprietary platform
![Page 12: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/12.jpg)
Searching Chemistry on the Internet
How complete a result set will we get if we search for “chemicals” by name?
Is there a better way to link chemistry databases? Linking by “names” is dangerous
Chemists want structure and SUBstructure searching
![Page 13: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/13.jpg)
The InChI Identifier
![Page 14: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/14.jpg)
Multiple Layers
![Page 15: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/15.jpg)
InChI Strings Hash to InChIKeys
![Page 16: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/16.jpg)
Oleoylethanolamine
InChI=1S/C20H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-20(23)21-18-19-22/h9-10,22H,2-8,11-19H2,1H3,(H,21,23)/b10-9-
BOWVQLFMWHZBEF-KTKRTIGZSA-N
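The two strings above can be taken apart to show the layering. The sketch below splits the InChI string into its slash-separated layers and the InChIKey into its hyphen-separated blocks. It does not compute the hash itself, which requires the official InChI library. The function names are assumptions for the example, and keying layers by their first letter is a simplification that suits this string but not every InChI (some layer prefixes can repeat).

```python
import re

INCHI = ("InChI=1S/C20H39NO2/c1-2-3-4-5-6-7-8-9-10-11-12-13-14-15-16-17-20(23)"
         "21-18-19-22/h9-10,22H,2-8,11-19H2,1H3,(H,21,23)/b10-9-")
INCHIKEY = "BOWVQLFMWHZBEF-KTKRTIGZSA-N"

# 14 letters, then 8 letters + flag + version, then 1 protonation letter.
KEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")


def split_inchi(inchi: str) -> dict:
    """Split an InChI string into version, formula and prefixed layers."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI" or not body:
        raise ValueError("not an InChI string")
    version, formula, *layers = body.split("/")
    parsed = {"version": version, "formula": formula}
    for layer in layers:
        if layer:
            # Simplification: the first character names the layer (c, h, b, ...).
            parsed[layer[0]] = layer[1:]
    return parsed


def split_inchikey(key: str) -> dict:
    """Split an InChIKey into its connectivity block and remaining fields."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError("not a well-formed InChIKey")
    connectivity, other_layers, flag, version, protonation = match.groups()
    return {
        "connectivity": connectivity,   # hash of formula and connectivity
        "other_layers": other_layers,   # hash of stereo, isotope and other layers
        "standard_flag": flag,          # "S" marks a standard InChI
        "version": version,
        "protonation": protonation,     # "N" means neutral
    }
```

For the oleoylethanolamine example, `split_inchi(INCHI)["b"]` gives the double-bond stereo layer `10-9-`, and `split_inchikey(INCHIKEY)["connectivity"]` gives `BOWVQLFMWHZBEF`. Because the key is short and fixed-length, it is far easier for a general search engine to index than the full string.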
![Page 17: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/17.jpg)
InChIKey Searches Work
![Page 18: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/18.jpg)
Search Engine Dependencies
![Page 19: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/19.jpg)
Search Engine Dependencies
![Page 20: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/20.jpg)
InChIs have traction…
![Page 21: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/21.jpg)
RDF Linking of Structures
![Page 22: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/22.jpg)
PubChem
![Page 23: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/23.jpg)
The Simplest Organic Molecule
![Page 24: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/24.jpg)
Question Everything online: www.dhmo.org
![Page 25: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/25.jpg)
The Structure-Based Data Cloud
![Page 26: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/26.jpg)
Vancomycin
![Page 27: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/27.jpg)
![Page 28: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/28.jpg)
Vancomycin
Who will curate?
How would you clean such a large dataset?
![Page 29: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/29.jpg)
Vancomycin on ChemSpider
![Page 30: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/30.jpg)
Vancomycin
![Page 31: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/31.jpg)
Vancomycin
Search Molecular SKELETON
Search Full Molecule
![Page 32: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/32.jpg)
Full Skeleton Search: 104 Hits
![Page 33: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/33.jpg)
Full Molecule Search: 4 Hits
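The gap between the two hit counts can be illustrated with InChIKeys, assuming that a skeleton search matches only the first (connectivity) block of the key while a full molecule search matches the whole key. This is a sketch under that assumption, not a description of ChemSpider's search code. The `search` helper and the small key list are made up for the example: only the first key comes from this presentation, the second is assumed to be the same skeleton without stereo information, and the third is aspirin.

```python
def skeleton(key: str) -> str:
    """Return the first (connectivity) block of an InChIKey."""
    return key.split("-", 1)[0]


def search(keys, query, full_molecule=True):
    """Match on the whole key, or only on the connectivity block."""
    if full_molecule:
        return [k for k in keys if k == query]
    return [k for k in keys if skeleton(k) == skeleton(query)]


# Illustrative index; see the lead-in for which keys are assumptions.
KEYS = [
    "BOWVQLFMWHZBEF-KTKRTIGZSA-N",  # oleoylethanolamine, from the earlier slide
    "BOWVQLFMWHZBEF-UHFFFAOYSA-N",  # assumed: same skeleton, no stereo layer
    "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",  # aspirin, a different skeleton
]
QUERY = "BOWVQLFMWHZBEF-KTKRTIGZSA-N"

full_hits = search(KEYS, QUERY, full_molecule=True)
skeleton_hits = search(KEYS, QUERY, full_molecule=False)
```

Here the full molecule search returns one key and the skeleton search returns two, mirroring on a tiny scale why a skeleton search returns many more hits than a full molecule search.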
![Page 34: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/34.jpg)
What is ChemSpider?
ChemSpider is:
Building a Structure Centric Community for Chemists
22.2 million compounds, >200 data sources
A deposition and curation platform
A publishing platform for the community
Grows daily – more depositions, more links, more data sources
![Page 35: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/35.jpg)
For Chemical Compounds
Vendor sites – Aldrich, Alfa Aesar, TCI and 100s of others
Government databases – PubChem, DSSTox, FDA databases, ChemIDPlus,…
Biological Databases – Protein Database, Stitch, KEGG, ChEBI,…
Analytical databases – NMRShiftDB,…
![Page 36: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/36.jpg)
How Was ChemSpider Built?
ChemSpider was a “hobby project”
Housed in a basement and running off three servers – one bought, two built
May 2009
![Page 37: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/37.jpg)
3 servers – 2 homebuilt
.NET architecture
SQL Server
Homebuilt structure/substructure
Commercial components
Open Source components – OpenBabel, Jmol, JSpecView, NCBI Toolkit, InChI Libraries
![Page 38: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/38.jpg)
Search Cholesterol
![Page 39: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/39.jpg)
Search Cholesterol
![Page 40: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/40.jpg)
Search Cholesterol
![Page 41: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/41.jpg)
Search Cholesterol
![Page 42: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/42.jpg)
Linked across the internet
![Page 43: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/43.jpg)
Kyoto Encyclopedia of Genes and Genomes
![Page 44: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/44.jpg)
Links to Patents based on structure
![Page 45: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/45.jpg)
![Page 46: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/46.jpg)
Answering Questions for Chemists
Questions a chemist might ask…
• What is the melting point of n-butanol?
• What is the chemical structure of Xanax?
• Chemically, what is phenolphthalein?
• What are the stereocenters of cholesterol?
• Where can I find publications about xylene?
• What are the different trade names for Ketoconazole?
• What is the NMR spectrum of Aspirin?
• What are the safety handling issues for Thymol Blue?
![Page 47: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/47.jpg)
Complex Data and Information
![Page 48: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/48.jpg)
Remember – QUALITY ISSUES
![Page 49: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/49.jpg)
The FDA’s DailyMed
![Page 50: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/50.jpg)
Incorrect Structures
![Page 51: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/51.jpg)
Does one stereocenter matter?
Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, and Softenon
![Page 52: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/52.jpg)
Crowd-sourcing Chemistry Curation
![Page 53: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/53.jpg)
We Need Recognition and Rewards
![Page 54: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/54.jpg)
Master Curators, Curators, Depositors
![Page 55: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/55.jpg)
Collaborating with Wikipedia
Long term project to curate chemical compounds
Robotically linking ChemSpider to Wikipedia at present
Will layer on InChI Strings and InChIKeys shortly and make Wikipedia structure searchable
![Page 56: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/56.jpg)
Blogs need InChIs too!
![Page 57: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/57.jpg)
Blogs need InChIs too!
![Page 58: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/58.jpg)
Use Intelligent Structures : ChemSpider Embed Web Service
![Page 59: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/59.jpg)
ChemSpider Web Services
![Page 60: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/60.jpg)
Semantic Mark-up for Chemistry
Semantic mark-up for chemistry is here
RSC Project Prospect
Nature Publishing Group compound linking
ChemMantis
![Page 61: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/61.jpg)
Nature Chemistry Compound Pages
![Page 62: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/62.jpg)
Project Prospect
![Page 63: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/63.jpg)
ChemMantis
![Page 64: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/64.jpg)
Deposit Structures
![Page 65: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/65.jpg)
Species – linked to Wikipedia
![Page 66: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/66.jpg)
Semantic Linking of Structures
What would you want to link off a structure?
• Chemical suppliers
• Other publications
• Analytical data
• Related reactions
• Wikipedia
• Patents
• “Everything”
![Page 67: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/67.jpg)
The InChI “Resolver”
![Page 68: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/68.jpg)
InChI Resolver to DOIs
Structure Search the Web
![Page 69: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/69.jpg)
![Page 70: Navigating the Complex Web of Chemistry Using ChemSpider](https://reader035.vdocuments.mx/reader035/viewer/2022070315/554ead8cb4c905fb7c8b4f07/html5/thumbnails/70.jpg)
Conclusions
Internet resources provide a collaborative community for chemistry
Crowdsourcing to expand, curate and integrate to the benefit of chemists
Searching the web for chemistry is arriving
InChIs are enabling chemistry on the internet
Question Quality!