ChemSpider Presentation at University of Toronto

ChemSpider: Building the Premier Online Resource for Chemists University of Toronto June 8th 2010

Upload: orcid-0000-0002-2668-4821

Post on 10-May-2015



DESCRIPTION

This presentation on ChemSpider was given to a group of science librarians, specifically chemistry librarians. It was meant to provide an overview of the platform and to answer the question posed: What is the difference between ChemSpider, CAS SciFinder and Reaxys?

TRANSCRIPT

Page 1: ChemSpider Presentation At University Of Toronto

ChemSpider: Building the Premier Online Resource for Chemists University of Toronto June 8th 2010

Page 2: ChemSpider Presentation At University Of Toronto

Overview

The status of chemistry online today
The pragmatic vision of ChemSpider
The quality of online chemistry
Linking together the internet using InChIs
Citizen scientists for deposition and curation
ChemSpider as a multimedia container
Comparing ChemSpider, Reaxys and SciFinder

Page 3: ChemSpider Presentation At University Of Toronto

Where is chemistry online?

Encyclopedic articles (Wikipedia)
Chemical vendor databases
Metabolic pathway databases
Property databases
Patents with chemical structures
Drug discovery data
Scientific publications
Compound aggregators
Blogs/Wikis and Open Notebook Science

Page 4: ChemSpider Presentation At University Of Toronto

Chemistry on the Internet TODAY

Chemistry searches are generally limited to text-based searches across the internet

Data are dirty: sorting the wheat from the chaff. Who can you trust?

Too many searches required to source data

Page 5: ChemSpider Presentation At University Of Toronto

media.obsessable.com

As few interfaces as possible

What do humans want?

Page 6: ChemSpider Presentation At University Of Toronto

A Pragmatic Vision

“Build a Structure Centric Community”

December 2006 – A hobby project initiated to connect chemistry on the web

Integrate chemical structure data on the web
Create a “structure-based hub” to information and data
Provide access to structure-based “algorithms”
Let chemists contribute their own data
Allow the community to curate/correct data

Page 7: ChemSpider Presentation At University Of Toronto

ChemSpider Searches

Page 8: ChemSpider Presentation At University Of Toronto

Search Cholesterol

Page 9: ChemSpider Presentation At University Of Toronto

Search Cholesterol

Page 10: ChemSpider Presentation At University Of Toronto

Search Cholesterol

Page 11: ChemSpider Presentation At University Of Toronto

Search Cholesterol

Page 12: ChemSpider Presentation At University Of Toronto

Search Cholesterol

Page 13: ChemSpider Presentation At University Of Toronto

Linked across the internet

Page 14: ChemSpider Presentation At University Of Toronto

Kyoto Encyclopedia of Genes and Genomes

Page 15: ChemSpider Presentation At University Of Toronto

Links to Patents based on structure

Page 16: ChemSpider Presentation At University Of Toronto

Articles Linked

Page 17: ChemSpider Presentation At University Of Toronto

ChemSpider Complex Searches

Page 18: ChemSpider Presentation At University Of Toronto

Link off a structure in ChemSpider

Chemical suppliers
Other publications
Analytical data
Related reactions
Wikipedia
Patents
“Everything”

Page 19: ChemSpider Presentation At University Of Toronto

Answering Questions for Chemists

Questions a chemist might ask…

What is the melting point of n-butanol?
What is the chemical structure of Xanax?
Chemically, what is phenolphthalein?
What are the stereocenters of cholesterol?
Where can I find publications about xylene?
What are the different trade names for Ketoconazole?
What is the NMR spectrum of Aspirin?
What are the safety handling issues for Thymol Blue?

Page 20: ChemSpider Presentation At University Of Toronto

What is a compound?

Page 21: ChemSpider Presentation At University Of Toronto

ChemSpider is a structure-centric hub

ChemSpider aggregates and links out across the internet

Data aggregate based on “structures and links”

What defines a chemical compound?

Page 22: ChemSpider Presentation At University Of Toronto

Linked Data on the Web

Taken from: Rafael Sidi’s Blog

Page 23: ChemSpider Presentation At University Of Toronto

Where Would You Look? What Do You Trust?

Page 24: ChemSpider Presentation At University Of Toronto

Chemistry on The Internet Is Messy

Page 25: ChemSpider Presentation At University Of Toronto

It’s Methane…

Page 26: ChemSpider Presentation At University Of Toronto

What’s Methane?

Page 27: ChemSpider Presentation At University Of Toronto

What’s Methane?

Page 28: ChemSpider Presentation At University Of Toronto

What ELSE is Methane???

Page 29: ChemSpider Presentation At University Of Toronto

PubChem

Page 30: ChemSpider Presentation At University Of Toronto

Chemistry is REALLY Messy

Page 31: ChemSpider Presentation At University Of Toronto

Vancomycin

Who will curate?

How would you clean such a large dataset?

Assertions!!!

Page 32: ChemSpider Presentation At University Of Toronto

Vancomycin on ChemSpider

1 compound – 3 days

Page 33: ChemSpider Presentation At University Of Toronto

The EXPERTS must get it right?!

Page 34: ChemSpider Presentation At University Of Toronto

Wikipedia, C&E News, PubChem C&E News (from ACS)

Page 35: ChemSpider Presentation At University Of Toronto

The InChI Identifier

Page 36: ChemSpider Presentation At University Of Toronto

Multiple Layers
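The layered structure of an InChI can be illustrated with a short sketch. This is not taken from the slides: the function name `split_inchi` and the returned dictionary shape are assumptions made for the example, and it assumes a standard InChI with a formula layer. The single-letter layer prefixes follow the published InChI conventions.

```python
# Illustrative sketch: split an InChI string into its "/"-separated layers.
# Assumptions: the string starts with "InChI=" and the second segment is the
# chemical formula. Repeated prefixes (possible in isotopic or fixed-H
# sections of non-trivial InChIs) would overwrite earlier entries here.

LAYER_NAMES = {
    "c": "connectivity",
    "h": "hydrogens",
    "q": "charge",
    "p": "protonation",
    "b": "double-bond stereo",
    "t": "tetrahedral stereo",
    "m": "stereo parity",
    "s": "stereo type",
    "i": "isotopes",
}


def split_inchi(inchi: str) -> dict:
    """Return a dict mapping layer names to layer contents."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string: %r" % inchi)
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        # The first character of each remaining segment identifies the layer.
        name = LAYER_NAMES.get(part[:1], "other:" + part[:1])
        layers[name] = part[1:]
    return layers


# Ethanol's standard InChI has a formula, a connectivity and a hydrogen layer.
ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
print(ethanol)
```

Each layer adds detail on top of the previous ones, which is why two records can agree on formula and connectivity while differing in stereochemistry or isotopes.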

Page 37: ChemSpider Presentation At University Of Toronto

InChI Strings Hash to InChIKeys
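An InChIKey is a fixed-length, 27-character hash of the InChI string, which makes it practical to index and search on the web. The sketch below does not compute the hash (that needs the official InChI software); it only takes an existing key apart. The function name `split_inchikey` and the field names are assumptions made for the example.

```python
import re

# Layout of an InChIKey:
#   14 letters : hash of the connectivity (skeleton) part of the InChI
#   8 letters  : hash of the remaining layers (stereo, isotopes, ...)
#   1 letter   : S for a standard InChI, N for non-standard
#   1 letter   : InChI version character
#   1 letter   : protonation indicator (N means neutral)
INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


def split_inchikey(key: str) -> dict:
    """Validate the shape of an InChIKey and return its blocks."""
    match = INCHIKEY_RE.match(key)
    if match is None:
        raise ValueError("not a well-formed InChIKey: %r" % key)
    skeleton, detail, flag, version, protonation = match.groups()
    return {
        "skeleton_block": skeleton,
        "detail_block": detail,
        "is_standard": flag == "S",
        "version_char": version,
        "protonation_char": protonation,
    }


# Standard InChIKey of methane (InChI=1S/CH4/h1H4).
print(split_inchikey("VNWKTOKETHGBQD-UHFFFAOYSA-N"))
```

Because the key is a hash, it cannot be converted back into a structure; it is only useful for lookup and matching.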

Page 38: ChemSpider Presentation At University Of Toronto

Vancomycin – Search the Internet

Page 39: ChemSpider Presentation At University Of Toronto

Full Molecule Search: 4 Hits

Page 40: ChemSpider Presentation At University Of Toronto

Full Skeleton Search: 104 Hits
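The difference between the two hit counts can be sketched in code, assuming (this is not stated on the slides) that a "full molecule" search matches the whole InChIKey and a "full skeleton" search matches only its first 14-character block. The record names and keys below are made-up placeholders, not real compounds, and `search_by_inchikey` is a hypothetical helper.

```python
def skeleton_block(key: str) -> str:
    """Return the first (connectivity) block of an InChIKey."""
    return key.split("-", 1)[0]


def search_by_inchikey(query_key: str, index: dict) -> tuple:
    """Return (full_hits, skeleton_hits) as lists of record names.

    full_hits     : records whose key equals the query key exactly
    skeleton_hits : records sharing the query's first block, i.e. the same
                    connectivity but possibly different stereo or isotopes
    """
    query_skeleton = skeleton_block(query_key)
    full_hits = [name for name, key in index.items() if key == query_key]
    skeleton_hits = [
        name for name, key in index.items()
        if skeleton_block(key) == query_skeleton
    ]
    return full_hits, skeleton_hits


# Placeholder index: hypothetical page names mapped to placeholder keys.
index = {
    "page-1": "AAAAAAAAAAAAAA-BBBBBBBBSA-N",
    "page-2": "AAAAAAAAAAAAAA-CCCCCCCCSA-N",  # same skeleton, different detail
    "page-3": "DDDDDDDDDDDDDD-BBBBBBBBSA-N",  # different skeleton
}

full, skeleton = search_by_inchikey("AAAAAAAAAAAAAA-BBBBBBBBSA-N", index)
print(len(full), "full hits;", len(skeleton), "skeleton hits")
```

A skeleton search always returns at least as many hits as a full search, since every exact match also shares the first block.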

Page 41: ChemSpider Presentation At University Of Toronto

Citizen Scientists

Page 42: ChemSpider Presentation At University Of Toronto

Crowd-sourcing Chemistry Curation

Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate

Page 43: ChemSpider Presentation At University Of Toronto

Citizens as Data Sources

Page 44: ChemSpider Presentation At University Of Toronto

Semantic Markup: Project Prospect

Page 45: ChemSpider Presentation At University Of Toronto

Entity-Extraction, Mark-up, Annotate

Page 46: ChemSpider Presentation At University Of Toronto

Success Depends on Dictionaries

Page 47: ChemSpider Presentation At University Of Toronto

Semantic Linking of Structures

What would you want to link off a structure?

Chemical suppliers
Other publications
Analytical data
Related reactions
Wikipedia
Patents
“Everything”

Page 48: ChemSpider Presentation At University Of Toronto

Unpublished Chemistry

Only a fraction of chemistry is published

Only a tiny fraction of chemistry is patented

What of the “Lost Chemistry” that is never published and so cannot be abstracted?

Reactions performed
Structures made and studied
Spectra acquired and then disposed of
Available chemicals never found

Page 49: ChemSpider Presentation At University Of Toronto

Org Prep Daily (Blog)

Page 50: ChemSpider Presentation At University Of Toronto

ChemSpider SyntheticPages

Page 51: ChemSpider Presentation At University Of Toronto

Submission Process

Submissions reviewed by editorial board

Published as is or comments sent to author

Online Peer Review process

Data supported include web movies, images, live spectra etc.

Page 52: ChemSpider Presentation At University Of Toronto

Micro- and Nano-publications

Blogs, wiki entries and even Amazon book reviews are micro/nano-publications

ChemSpider SyntheticPages will be DOI’ed – students can add these “micro-publications” to their resume

Structures and spectra are nano-publications – these contributions (depositions, curations etc.) can also be tracked and referenced. Students participate in building one of the premier sources of chemistry data.

Page 53: ChemSpider Presentation At University Of Toronto

ChemSpider Everywhere: What do computers want?

Web services

flickr.com/photos/microcosmos

Page 54: ChemSpider Presentation At University Of Toronto

ChemSpider Everywhere: ChemMobi

Page 55: ChemSpider Presentation At University Of Toronto

Mobile ChemSpider

Page 56: ChemSpider Presentation At University Of Toronto

Multimedia Content Holder

Page 57: ChemSpider Presentation At University Of Toronto

Periodic Table Images

Page 58: ChemSpider Presentation At University Of Toronto

CAS SciFinder

Page 59: ChemSpider Presentation At University Of Toronto

Reaxys

Page 60: ChemSpider Presentation At University Of Toronto

Differences between ChemSpider, Reaxys and SciFinder

Everything on Reaxys and SciFinder is curated
The data resources can be over 100 years old
The platforms are commercial and “read-only”

ChemSpider is free, to everyone
Data are in a state of ongoing curation & annotation
Data resources are from the “electronic era”
Data are expanded daily and enhanced on an ongoing basis
The platform delivers integrated algorithm access

Page 61: ChemSpider Presentation At University Of Toronto

Community Contribution

We make a bigger contribution to the community if the community shares via ChemSpider

ChemSpider wins “Community Contribution” best practice award

Page 62: ChemSpider Presentation At University Of Toronto

How Can You Help ChemSpider?

Encourage students to deposit their data and share with the community
Structures – one or many
Spectra
Links
Syntheses into ChemSpider SyntheticPages

Spread the word – ChemSpider is an untapped resource

Page 63: ChemSpider Presentation At University Of Toronto

Chemistry on the Internet FUTURE

The semantic web for chemistry is in place
Crowdsourced contributions are commonplace
Chemists will search by structure/substructure
Chemistry articles indexed and searchable
Reduced number of searches to find data
Data are integrated – compounds, vendors, syntheses, data, publications and patents
A world of Open Access and Open Data

Page 64: ChemSpider Presentation At University Of Toronto

Thank you

[email protected]: ChemSpiderman
www.chemspider.com/blog
SLIDES: www.slideshare.net/AntonyWilliams