![Page 1: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/1.jpg)
Hosting a Compound Centric Community Resource for Chemistry Data
Antony Williams, ACS Anaheim March 28th 2011
![Page 2: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/2.jpg)
Data Archiving, e-Science and Primary Data
How much of the data generated in a lab that COULD go public is lost forever?
Public Domain reference databases of value?
Syntheses
Properties
Spectra
CIFs
Images
Much of chemistry is chemical structure-based – where and how could we host these data?
![Page 3: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/3.jpg)
The Social Network
Career-wise, within the next few years NOT having a personal presence online will be a detriment
Self-marketing
Establishing a profile
Getting on the record
Collaborative Science
Demonstrating a skill set
Measured using alternative metrics
Contributing to the public peer review process
![Page 4: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/4.jpg)
Social Networking Tools
A growing number of social networking tools:
Facebook
Twitter
Linked-In
Flickr
YouTube
Blogs
Communities
Collaborative environments
![Page 5: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/5.jpg)
Collaborative Knowledge Management
![Page 6: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/6.jpg)
TotallySynthetic.com
![Page 7: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/7.jpg)
Contributing Chemistry online
Property databases
Compound aggregators
Screening assay results
Scientific publications
Encyclopedic articles (Wikipedia)
Metabolic pathway databases
ADME/Tox data – eTOX for example
Blogs/Wikis and Open Notebook Science
Contributing Open Source code to projects
![Page 8: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/8.jpg)
Chemistry Social Networking
Methods of sharing MY chemistry online include:
Wikis or blogs
Slideshare for presentations
YouTube for videos
Flickr, Wikimedia etc. for images (and FigShare)
PubChem for assay data
NMRShiftDB for NMR assignments
GoogleDocs for data (and FigShare)
![Page 9: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/9.jpg)
FigShare
![Page 10: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/10.jpg)
FigShare
![Page 11: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/11.jpg)
Chemistry Social Networking
Methods of sharing MY chemistry online include:
Wikis or blogs
Slideshare for presentations
YouTube for videos
Flickr, Wikimedia etc. for images (and FigShare)
PubChem for assay data
NMRShiftDB for NMR assignments
GoogleDocs for data (and FigShare)
In what other online environments can you immediately share chemistry data?
![Page 12: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/12.jpg)
ChemSpider
ChemSpider is a chemistry database: >25 million compounds, >400 data sources
A deposition platform
Structure(s)
Identifiers
Links to internet resources, articles and DOIs
Experimental data (spectra, images, CIFs)
Multimedia (videos, MP3s)
A curation and annotation platform
Remove “bad data”
Annotate existing data
A publishing platform for the community
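The compound-centric idea on this slide can be sketched as a small data structure: one record per structure, with identifiers and user depositions hanging off it. This is only an illustrative sketch. The class names, fields, file name, and URL below are hypothetical and are not ChemSpider's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deposition:
    """One item a user attaches to a compound record (hypothetical model)."""
    kind: str        # e.g. "spectrum", "image", "cif", "video", "link"
    location: str    # file name, URL, or DOI supplied by the depositor
    depositor: str   # who contributed the item


@dataclass
class CompoundRecord:
    """A compound-centric record: everything is keyed to one structure."""
    record_id: int
    structure: str   # a structure string, e.g. SMILES
    identifiers: List[str] = field(default_factory=list)
    depositions: List[Deposition] = field(default_factory=list)

    def deposit(self, item: Deposition) -> None:
        """Attach a new deposition to this compound."""
        self.depositions.append(item)

    def by_kind(self, kind: str) -> List[Deposition]:
        """Return all depositions of one kind, e.g. every spectrum."""
        return [d for d in self.depositions if d.kind == kind]


# Usage: ethanol ("CCO" in SMILES) with two illustrative depositions.
record = CompoundRecord(record_id=1, structure="CCO", identifiers=["ethanol"])
record.deposit(Deposition("spectrum", "ethanol_1H.jdx", "alice"))
record.deposit(Deposition("link", "https://example.org/ethanol", "bob"))
print(len(record.by_kind("spectrum")))
```

Keying every deposition to a structure is what lets spectra, CIFs, links and multimedia from different contributors accumulate on the same compound page.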
![Page 13: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/13.jpg)
Search for a Chemical by name
![Page 14: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/14.jpg)
Available Information…
Linked to vendors, safety data, toxicity, metabolism
![Page 15: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/15.jpg)
Available Information…
![Page 16: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/16.jpg)
Crowdsourced “Annotations”
Users can add
Descriptions/Syntheses/Commentaries
Links to PubMed articles
Links to articles via DOIs
Add spectral data
Add Crystallographic Information Files
Add photos
Add MP3 files
Add Videos
![Page 17: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/17.jpg)
![Page 18: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/18.jpg)
Spectra
![Page 19: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/19.jpg)
Spectra
![Page 20: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/20.jpg)
Inherited Errors
Inherited errors from every database… all public compound databases, including ours, have errors
“Incorrect” structures – assertions, timelines etc
“Incorrect” names associated with structures
ENORMOUS CHALLENGE
![Page 21: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/21.jpg)
Crowdsourced Curation
Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
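The curation actions listed here (tag errors, edit names and synonyms, deprecate records) can be sketched as a toy model. Everything in it is an assumption made for illustration: the class, the three review statuses, and the method names are hypothetical and do not describe how ChemSpider implements curation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical review states for a name attached to a record.
STATUSES = {"unreviewed", "approved", "rejected"}


@dataclass
class CuratedRecord:
    """Toy model of a record that a crowd of curators can clean up."""
    record_id: int
    synonyms: Dict[str, str] = field(default_factory=dict)  # name -> status
    error_tags: List[str] = field(default_factory=list)
    deprecated: bool = False

    def add_synonym(self, name: str) -> None:
        """New names start unreviewed; an existing status is left untouched."""
        self.synonyms.setdefault(name, "unreviewed")

    def review(self, name: str, status: str) -> None:
        """Record a curator's decision about a name already on the record."""
        if name not in self.synonyms:
            raise KeyError(name)
        if status not in STATUSES:
            raise ValueError(status)
        self.synonyms[name] = status

    def tag_error(self, note: str) -> None:
        """Flag a problem for other curators to look at."""
        self.error_tags.append(note)

    def deprecate(self) -> None:
        """Mark the whole record as one that should no longer be used."""
        self.deprecated = True

    def trusted_names(self) -> List[str]:
        """Names that a curator has approved, in alphabetical order."""
        return sorted(n for n, s in self.synonyms.items() if s == "approved")


# Usage: "Vitamin H" is a synonym of biotin; the third name is a placeholder.
record = CuratedRecord(record_id=1)
for name in ("Biotin", "Vitamin H", "placeholder wrong name"):
    record.add_synonym(name)
record.review("Biotin", "approved")
record.review("Vitamin H", "approved")
record.review("placeholder wrong name", "rejected")
print(record.trusted_names())
```

Keeping rejected names on the record with a status, rather than deleting them, leaves a trail that other curators can check.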
![Page 22: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/22.jpg)
Search “Vitamin H”
![Page 23: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/23.jpg)
“Curate” Identifiers
![Page 24: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/24.jpg)
“Curate” Identifiers
![Page 25: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/25.jpg)
“Curate” Identifiers
![Page 26: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/26.jpg)
Crowdsourcing Works
More than 130 people have deposited data and participated in data curation
Curators at different levels check each other's work
More curators and depositors are encouraged!
![Page 27: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/27.jpg)
Molbank (Open Access Journal)
![Page 28: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/28.jpg)
ChemSpider SyntheticPages
Many syntheses are not published but are of value
CSSP: A database of synthesis procedures built for the community, by the community.
Peer-reviewed by the community
Each contribution has a DOI – of value to the submitter?
![Page 29: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/29.jpg)
![Page 30: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/30.jpg)
Vandalism
Vandalism of ChemSpider is VERY rare…
Three acts of vandalism ever
Someone tried to “sell a house!”
A vendor posted their logo against a chemical
A student, Katie Crow, posted a “personal photo”
But poor data quality can look like vandalism!
![Page 31: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/31.jpg)
Drivers in the Social Network
Anonymity is a choice in the social networks
Many people on Wikipedia are anonymous
Many blogs are anonymous
Comments on blogs can be anonymous
Anonymity in peer-review will likely become less important and may be generational
I may want acknowledgment if…
I share my data
I review a paper
I share my expertise
![Page 32: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/32.jpg)
The Alt-Metrics Manifesto
http://altmetrics.org/manifesto/
![Page 33: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/33.jpg)
Enabled by ORCID…
![Page 34: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/34.jpg)
Who declares data as Open?
Data licensing is very interesting and can spark “interesting” conversations.
Opinions differ:
Are images data?
Are assertions data?
What on a ChemSpider record is data?
Is PubChem or PubMed Open Data?
We allow people to declare their data as Open and add an Open Data button at upload
A lot of data on ChemSpider are free but not Open
Pragmatism: Our focus is a community resource
![Page 35: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/35.jpg)
Licensing “My Work” Online
Is it “my” chemistry once it’s online?
The complex nature of licensing “my” chemistry
Blogs – copyrighted and Creative Commons
Wikis – mixed licensing, depends on the host(s)
Data – much value in sharing data as “Open Data”
Often, people can make money from your work!
Police your own “licensing” – how many people have read the Facebook and Twitter agreements?!
![Page 36: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/36.jpg)
ChemSpider: A Structure Centric Host
An established community resource
More than 25 million compounds from >400 data sources
Thousands of users per day
Approaching a million transactions per day
A crowdsourced deposition and curation platform
Grows daily – more depositions, more data
A publishing platform for the community
Contributions welcome! Learn how…
![Page 37: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/37.jpg)
ChemSpider Training Session
ChemSpider: A Community Resource for Chemical Data
Wednesday, March 30th
8:30-11:00 AM
Anaheim Convention Center, Room 211 A
![Page 38: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/38.jpg)
Acknowledgments
RSC|ChemSpider team
The “Crowd” of curators
All Data Source providers
GGA Software Services
ACD/Labs
OpenEye
Accelrys
![Page 39: Hosting a compound centric community resource for chemistry data](https://reader036.vdocuments.mx/reader036/viewer/2022062513/554e8641b4c90526358b46b4/html5/thumbnails/39.jpg)
Thank you
Email: [email protected]
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams