Oops and Downs of Resolving InChIs for the Chemistry Community
DESCRIPTION
The InChI resolver was rolled out to the community in March 2009 with the purpose of providing a centralized resource for chemists to resolve InChIs (International Chemical Identifiers). This presentation will provide an overview of the development of the underlying technologies associated with the InChI resolver, and how the resolver is being used, integrated and enhanced to provide additional value to the chemistry community. We will discuss present limitations to application of the resolver for providing access to databases and chemistry information distributed across the internet and define our vision for enhancing interconnectivity across Open databases using the InChI resolver as the glue.
TRANSCRIPT
![Page 1: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/1.jpg)
Oops and downs of resolving InChIs for the chemistry community
![Page 2: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/2.jpg)
The InChI Has Arrived
My opinions:
The InChI is a crucial part of the future of structure-based relationships on the web
The semantic web of chemistry will sit on the shoulders of InChI until there is something better
InChIs and publishers are already in a relationship – publishers who have not yet adopted InChI will follow
![Page 3: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/3.jpg)
PPP – Perfection vs Productive vs Prolific
The InChI is not perfect
There are limitations but they are acknowledged and in discussion
The InChI is very “productive”
InChIs are showing up in databases, manuscripts, spreadsheets, publications and software
![Page 4: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/4.jpg)
A Lot of Variability in InChIs
Source: Unofficial InChI FAQ page
![Page 5: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/5.jpg)
InChIStrings Hash to InChIKeys
![Page 6: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/6.jpg)
HVYWMOMLDIMFJA-DPAQBDIFSA-N
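The key on this slide can be taken apart by its layout. The sketch below is illustrative only: it assumes the version-1 InChIKey layout (a 14-letter hash of the connectivity, then an 8-letter hash of the remaining layers followed by a standard/non-standard flag and a version letter, then a one-letter protonation indicator). The names `parse_inchikey` and `InChIKeyParts` are made up for this example, and the code only checks the shape of a key; it does not compute the hash itself.

```python
import re
from typing import NamedTuple

# Assumed layout: 14 letters, hyphen, 8 letters + flag (S/N) + version letter,
# hyphen, 1 protonation letter.
_INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


class InChIKeyParts(NamedTuple):
    connectivity_hash: str      # first block: hash of the molecular skeleton
    remaining_layers_hash: str  # second block: hash of stereo, isotopes, etc.
    is_standard: bool           # True when the flag letter is "S"
    version: str                # version letter ("A" for InChI version 1)
    protonation: str            # "N" means neutral


def parse_inchikey(key: str) -> InChIKeyParts:
    """Split an InChIKey into its blocks, or raise ValueError if malformed."""
    match = _INCHIKEY_RE.match(key.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, rest, flag, version, protonation = match.groups()
    return InChIKeyParts(skeleton, rest, flag == "S", version, protonation)


parts = parse_inchikey("HVYWMOMLDIMFJA-DPAQBDIFSA-N")
print(parts.connectivity_hash, parts.is_standard)
```

Two keys that share the first block but differ in the second describe the same skeleton with different stereochemistry or other layers, which is the kind of "overlap" a resolver has to surface.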
![Page 7: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/7.jpg)
The InChI Resolver
![Page 8: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/8.jpg)
inchis.chemspider.com
![Page 9: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/9.jpg)
Resolve an InChI or InChIKey
![Page 10: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/10.jpg)
Resolved
![Page 11: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/11.jpg)
Connection Only Resolving
![Page 12: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/12.jpg)
InChIs and Big Databases
There appears to be a “bigger is better” mentality with online databases
InChI has shown a lot of “overlap” in the ChemSpider database
Distinction: a unique chemical entity versus what it’s meant to be
Some simple examples …
![Page 13: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/13.jpg)
Spot The Difference
![Page 14: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/14.jpg)
Standard InChIKeys
![Page 15: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/15.jpg)
Spot The Difference
![Page 16: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/16.jpg)
55 Hits in 0.08 Seconds
![Page 17: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/17.jpg)
Large Databases Contain Junk
InChI resolvers will get us back to results, but it’s a lookup…
There is an enormous need for curation and linking resolved structures to “correct” structures – a manual task
![Page 18: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/18.jpg)
Generate-It
![Page 19: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/19.jpg)
Draw and generate
![Page 20: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/20.jpg)
Generate
![Page 21: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/21.jpg)
All Flavors
![Page 22: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/22.jpg)
Historical and Future InChIs
The Standard InChI removed variability
There will be new variants in the future
There are already millions of historical InChIs “out there”
Resolvers should accommodate historical and future InChIs
![Page 23: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/23.jpg)
In Our Resolver…
![Page 24: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/24.jpg)
On to ChemSpider…
![Page 25: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/25.jpg)
NEW Patents and Pubmed on ChemSpider
![Page 26: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/26.jpg)
InChIs to Patents and Pubmed Articles
![Page 27: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/27.jpg)
But there will be multiple resolvers…
Each publisher, database or scientist can choose not to publish their structures into a centralized database
There are many large online databases. There is no need to merge/mirror them – each can be a resolver
They need to be federated
![Page 28: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/28.jpg)
Many ways to address resolving
Our approach is simple – lookup. We look up the structure. SIMPLE.
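A lookup resolver of this kind can be sketched as a plain exact-match index. This is a toy illustration, not the ChemSpider implementation: the class `LookupResolver`, its methods and the record ID `record-1` are hypothetical. Registering both the Standard InChI and an older non-standard string for the same record shows one way a lookup can accommodate historical identifiers.

```python
from collections import defaultdict


class LookupResolver:
    """Toy in-memory resolver: exact-match lookup from identifier to record IDs."""

    def __init__(self):
        # identifier string -> set of record IDs that carry it
        self._index = defaultdict(set)

    def register(self, record_id, identifiers):
        """Index one record under every identifier variant known for it."""
        for identifier in identifiers:
            self._index[identifier.strip()].add(record_id)

    def resolve(self, identifier):
        """Return the sorted record IDs for an identifier (empty list if unknown)."""
        return sorted(self._index.get(identifier.strip(), set()))


resolver = LookupResolver()
# Methane, registered under its Standard InChI and a non-standard variant.
resolver.register("record-1", ["InChI=1S/CH4/h1H4", "InChI=1/CH4/h1H4"])

print(resolver.resolve("InChI=1S/CH4/h1H4"))
print(resolver.resolve("InChI=1/CH4/h1H4"))
print(resolver.resolve("InChI=1S/H2O/h1H2"))
```

Because nothing is computed from the structure, the lookup can only return what has already been deposited, which is why the quality of the underlying database matters so much.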
![Page 29: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/29.jpg)
NCI/CADD resolver: 69 million structures
![Page 30: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/30.jpg)
Differences
The NCI and ChemSpider Resolvers are “different”
Different databases behind the resolver – Feedback from NCI: “Preliminary results indicate that inchis.chemspider.com can resolve approx. 28% of our structures.”
Our approaches for resolving differ
Some features are different
![Page 31: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/31.jpg)
The InChI Resolver Protocol
There will not be only one InChI Resolver – there will be many
Publishers
Commercial databases
Free services and resources: PubChem, ChemSpider, NCI Database, ChEBI
Resolvers will not be mirrors of each other
There is no need to mirror when a protocol is in place
![Page 32: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/32.jpg)
InChI Resolver Protocol
InChI resolving needs to be federated
A common protocol can connect resolvers so that a user gets a complete results set
Individual resolvers can have different capabilities but should share an agreed common protocol for resolving InChIs
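The idea of federation can be sketched without committing to any wire format, since the protocol itself was still to be drafted. In the illustration below every resolver is simply a callable that takes an identifier and returns links; the function `federated_resolve`, the resolver names and the example.org URLs are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List

# A resolver is modelled as any callable: identifier in, iterable of links out.
Resolver = Callable[[str], Iterable[str]]


def federated_resolve(identifier: str,
                      resolvers: Dict[str, Resolver]) -> Dict[str, List[str]]:
    """Ask every resolver for the identifier and merge the hits by resolver name."""
    results: Dict[str, List[str]] = {}
    for name, resolve in resolvers.items():
        try:
            hits = list(resolve(identifier))
        except Exception:
            # One unavailable resolver should not hide the others' results.
            hits = []
        if hits:
            results[name] = hits
    return results


def resolver_a(identifier: str) -> List[str]:
    known = {"HVYWMOMLDIMFJA-DPAQBDIFSA-N": ["https://a.example.org/record/1"]}
    return known.get(identifier, [])


def resolver_b(identifier: str) -> List[str]:
    raise ConnectionError("resolver B is offline")


print(federated_resolve("HVYWMOMLDIMFJA-DPAQBDIFSA-N",
                        {"A": resolver_a, "B": resolver_b}))
```

Each resolver keeps its own database and capabilities; only the call and the shape of the answer need to be agreed.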
![Page 33: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/33.jpg)
Discuss with us on Google Groups
Draft protocol for ACS Spring 2010 from RSC ChemSpider, NCI/CADD, PubChem and Symyx
Proof of concept hopefully by end of this year for initial feedback (NCI and ChemSpider)
Join us at http://tinyurl.com/r7q9zc http://groups.google.com/group/inchiresolverprotocol
![Page 34: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/34.jpg)
InChI Trust
The founder members of the Trust: Elsevier, Thomson Reuters, Wiley, Nature Publishing Group, Royal Society of Chemistry, Symyx, FIZ-Chemie, Taylor & Francis and OpenEye
![Page 35: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/35.jpg)
In InChIs We Trust
It was said… “There is a finite, but very small probability of finding two structures with the same InChIKey.”
The first collision was announced on Sunday by Jonathan Goodman
![Page 36: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/36.jpg)
Spongistatin
![Page 37: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/37.jpg)
Probabilities are what they are…
“The molecule for which a collision has been reported … gives rise to 2^26 = 67,108,864 possible stereoisomers”
The probability of a clash is low but finite…and it happened.
OR…there may be a bug…work underway
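A rough birthday-style estimate shows why a clash among stereoisomers is not surprising. The sketch below rests on stated assumptions: all 2^26 stereoisomers share the first InChIKey block, their stereo layers hash uniformly into the second block, and that block carries 37 bits (a figure assumed here for illustration, so it is left as a parameter). The function name `expected_colliding_pairs` is made up for this example.

```python
def expected_colliding_pairs(n_items: int, hash_bits: int) -> float:
    """Expected number of colliding pairs when n_items hash uniformly into 2**hash_bits values."""
    pairs = n_items * (n_items - 1) // 2  # number of distinct pairs
    return pairs / 2 ** hash_bits         # each pair collides with probability 2**-hash_bits


n_stereoisomers = 2 ** 26  # 67,108,864, the count quoted on the slide
print(expected_colliding_pairs(n_stereoisomers, hash_bits=37))
```

Under these assumptions the estimate is in the thousands of pairs if every possible stereoisomer were enumerated, so a collision in an exhaustive enumeration is expected rather than alarming. Real databases hold only a tiny fraction of those stereoisomers, which is why collisions are rarely seen in practice.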
![Page 38: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/38.jpg)
The Future
InChI is here
InChIKeys are proliferating
The need for lookup is inevitable – the need for federated resolvers is obvious
Intention to provide draft resolver protocol by end of year
ACS Spring – unveil proof of concept
![Page 39: Oops and Downs of Resolving InChIs For the Chemistry Community](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e9e3ab4c90526358b5611/html5/thumbnails/39.jpg)
Acknowledgments
The InChI “Team” – leadership team, developers, advisors, funders and the community providing feedback
Royal Society of Chemistry