ChemSpider SyntheticPages – the Benefits of Publishing Chemical Syntheses Online
If it was not just about me…
We might have a community-built encyclopedia
I might know where the best restaurants are
I might get good advice on books to read
I might know which movies to watch
I might know which plumber to call
Data might just be Open
ChemSpider SyntheticPages
Many syntheses are not published but are of value
A database of synthesis procedures built for the community, by the community.
Peer-reviewed by the community
Each contribution is assigned a DOI
Develop an online scientific reputation at a time of “micro-publications”
Integrates semantic mark-up and visualization tools
ChemSpider SyntheticPages: http://cssp.chemspider.com
Submission Process
Register as a user
Use the Submit button and fill in the fields…
Submissions reviewed by editorial board
Published as is or comments sent to author
Online peer review process – engage chemists in ongoing discussions and a feedback loop
Supported data include web movies, images, live spectra etc.
Recent Submissions
Semantic Markup: Project Prospect
Entity-Extraction, Mark-up, Annotate
Success Depends on Dictionaries
Link to a Structure or the Right Structure?
Name-Structure Pairs
Semantic Linking of Structures
What would you want to link off a structure?
Chemical suppliers
Other publications
Analytical Data
Related Reactions
Wikipedia
Patents
“Everything”
ChemSpider
The Free Chemical Database
A central hub for chemists to source information
>28 million unique chemical records
Aggregated from >400 data sources
Chemicals, spectra, CIF files, movies, images, podcasts, links to patents, publications, predictions
A central hub for chemists to deposit & curate data
Answer Questions with ChemSpider
Questions a chemist might ask…
What is the melting point of n-heptanol?
What is the chemical structure of Xanax?
Chemically, what is phenolphthalein?
What are the stereocenters of cholesterol?
Where can I find publications about xylene?
What are the different trade names for Ketoconazole?
What is the NMR spectrum of Aspirin?
What are the safety handling issues for Thymol Blue?
I want to know about “Vincristine”
If all algorithms work then everything on the page is correct by default except the name!
Vincristine: Identifiers and Properties
Vincristine: Vendors and Sources
Vincristine: Patents
Vincristine: Articles
Searches: The INTERNET
All ChemSpider and Internet searches are “simply algorithms” but synonym searching is based on an assertion
InChIs
Validated Names for Searching…
Interactive Data
Most Accessed
Is it working? Show of hands…
How many of you know CSSP? Have any of you submitted to CSSP?
Low submissions but some dedicated authors
Popular Authors
What reasons might stop you from publishing?
Time
Approval from supervisor
Need to keep the science quiet
Publishing on CSSP prevents future publishing?
How will it improve?
Participation and contribution
The Social Network
Career-wise, NOT having a personal presence online will be a detriment
Self-marketing
Establishing a profile
Getting on the record
Collaborative Science
Demonstrating a skill set
Measured using alternative metrics
Contributing to the public peer review process
Social Networking Tools
A growing number of social networking tools:
Facebook, Twitter, LinkedIn, Flickr, YouTube, blogs, communities, collaborative environments
Chemistry Social Networking
Methods of sharing MY chemistry online include:
Wikis or blogs
Slideshare for presentations
YouTube for videos
Flickr, Wikimedia etc. for images
PubChem for assay data
NMRShiftDB for NMR assignments
GoogleDocs for data
Drivers in the Social Network
Anonymity is a choice in the social networks
Anonymity in peer-review will likely become less important and may be generational
I may want acknowledgment if…
I share my data
I review a paper
I share my expertise
The Alt-Metrics Manifesto
http://altmetrics.org/manifesto/
Enabled by ORCID…
The Joint Responsibility of Authors
What is my ImpactStory?
ImpactStory
The Linked Network
Imperial College
Data repository activities initiated with Imperial
Storage of research data from electronic lab notebook
Chemicals
Reactions
Analytical data – spectra
Experimental data points
Open Data with CC licenses of NC-SA
Feeding ELN Data into ChemSpider
Integrate e-Notebooks into ChemSpider
IDBS e-Workbook plug-in allows direct deposition of chemical structures
Can be extended to more ELN content: spectra, reactions, properties etc.
Integration Video http://tinyurl.com/9xnprqr
What is already in testing…
ChemSpider Google
Searching Google Scholar, Google Books and Google Patents by chemical structure
ChemSpider reactions – alpha version
300,000 reactions extracted from US patents
ChemSpider SyntheticPages container
Container for future RSC Archive reactions
Accepting Electronic Lab Notebook depositions
Successful AND Failed Reactions
Work in Progress – 300k Reactions
Data Enabling the RSC Archive
An archive going back to 1841. Project underway to “data enable” the archive:
Extract chemistry – chemicals, reactions, experimental data points, complex data
Semantic enrichment of the articles for interactive viewing and crowdsourced annotation/curation
Dramatically expands the types of queries possible across the archive
EPSRC National Chemical Database
RSC is preferred bidder for the EPSRC national chemical database tender – presently completing legal documentation etc.
Will deliver federated access to a series of commercial databases plus a data repository – personal, group and institutional
Citable data objects for papers, supplementary info, non-published work
A model for data segregation
Integrate with institutional repositories
Access to Theses and Dissertations
Model Building with Community Data
Community data can be the basis of model building
Consume data from available databases, RSC archive, new publications and build predictive algorithms for the community
Accept research data from the community and include into predictions
Internet Data
An Open Data-Centric Chemistry Hub
Commercial Software, Pre-competitive Data, Open Science, Open Data, Publishers, Educators, Open Databases, Chemical Vendors
Small organic molecules, Undefined materials, Organometallics, Nanomaterials, Polymers, Minerals, Particle bound, Links to Biologicals
Benefits of Publishing Chemical Syntheses Online
Not all syntheses will be “published”
Publishing is changing and has many forms
Online exposure develops reputation, benefits the community, engages discussion and collaboration. Peer review in the open.
CSSP offers a platform for exposure, links to ChemSpider, provides interactive visualization and feeds ChemSpider reactions
ELNs are a natural feed to the CSSP micro-publishing platform
Acknowledgments
RSC|ChemSpider team
CSSP Editorial Team
All data source providers
Curators and annotators
Service providers: ACD/Labs, OpenEye, GGA Software Services, many others…
Thank you
Email: [email protected]
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
Slides: www.slideshare.net/AntonyWilliams