ChemSpider: Connecting Chemistry and Mass Spectrometry on the Internet
Antony Williams and John Shockcor
A Vision for Our Time
Analysis of body fluids can lead to earlier diagnosis
Analysing the “Metabolome” improves the opportunities for personalized medicine
Coupling analytical instrumentation with cheminformatics and bioinformatics…
It’s about structures…
Drugs, metabolites, pesticides – it’s all about chemical structures
RSC hosts a community resource for chemistry to support chemists
Chemical compounds, properties, syntheses, analytical data, publications…
Public Domain Chemistry Databases
Online databases can be messy and inconsistent
Non-curated databases propagate errors
Original sources of errors are hard to determine
Data validation is time-consuming, challenging and exacting
The structure of Vitamin K1 is ???
Vitamin K1
ChemSpider: A free internet database
25 million chemicals from 400 data sources
Data includes:
Chemical identifiers
Links to publications
Links to patents
Experimental and predicted properties
Spectral data
For Synthetic Chemists
For NMR Spectroscopists
Community Contribution
Community-based deposition of data: structures, spectra, links, properties
Community crowdsourced curation: validating and curating data
>130 people have contributed data, skills and experience: synthesis procedures, spectral data
A flexible platform for...
Searching – many tens of thousands of searches per day
Teaching – providing data and resources to help teach chemistry
Integration – available for third parties to take value from the data
For Mass Spectrometrists
Valuable searches for Mass Spec would be:
Search the database by mass or formula for structure identification
Search subsets of data – “metabolism”
Link structure-based data across the internet
Provide “programming interfaces”
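The first search in the list above, lookup by mass or formula, can be sketched in a few lines. This is only an illustration against a tiny local table; it does not call ChemSpider, and the function names, the table layout and the 5 ppm default tolerance are assumptions made for the example.

```python
# Illustrative local table only; this is NOT a ChemSpider query.
# Monoisotopic masses are for the neutral molecules.
CANDIDATES = [
    {"name": "paracetamol", "formula": "C8H9NO2", "mono_mass": 151.06333},
    {"name": "glucose", "formula": "C6H12O6", "mono_mass": 180.06339},
    {"name": "caffeine", "formula": "C8H10N4O2", "mono_mass": 194.08038},
]


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_by_mass(candidates, neutral_mass, tol_ppm=5.0):
    """Return (name, formula, ppm error) for entries within tol_ppm, best match first."""
    hits = []
    for entry in candidates:
        err = ppm_error(neutral_mass, entry["mono_mass"])
        if abs(err) <= tol_ppm:
            hits.append((entry["name"], entry["formula"], round(err, 2)))
    return sorted(hits, key=lambda hit: abs(hit[2]))


def search_by_formula(candidates, formula):
    """Return the names of entries whose formula string matches exactly."""
    return [entry["name"] for entry in candidates if entry["formula"] == formula]


mass_hits = search_by_mass(CANDIDATES, 194.0805)
formula_hits = search_by_formula(CANDIDATES, "C6H12O6")
print(mass_hits)
print(formula_hits)
```

A real search would run against the full database and would usually be restricted to a subset such as "metabolism" to keep the hit list short.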
What is Metabonomics?
Metabonomics: “…measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification…” (Nicholson et al., 1999)
Full Metabolic Pathway Chart
Krebs Cycle
What is Metabonomics or Metabolomics?
Given analytical data on two groups of samples which we suspect may be different, can we determine the following information?
Are the groups different?
Detect those compounds which have increased or decreased in concentration in each group.
Detect those compounds which are missing from or unique to each group.
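The last two questions above can be illustrated with a minimal sketch. It assumes each group is a mapping from a feature label to a list of measured intensities, with 0 meaning "not detected"; the feature labels, the intensities and the two-fold threshold are all hypothetical, and a simple fold-change cut-off is not a statistical test.

```python
from statistics import mean


def compare_groups(group_a, group_b, fold=2.0):
    """Classify each feature by its mean intensity in group B relative to group A.

    group_a, group_b: dict mapping feature label -> list of intensities.
    A feature missing from a dict, or with mean 0, is treated as not detected.
    """
    report = {}
    for feature in sorted(set(group_a) | set(group_b)):
        a = mean(group_a.get(feature, [0.0]))
        b = mean(group_b.get(feature, [0.0]))
        if a == 0 and b == 0:
            status = "absent in both"
        elif a == 0:
            status = "unique to B"
        elif b == 0:
            status = "unique to A"
        elif b / a >= fold:
            status = "increased in B"
        elif a / b >= fold:
            status = "decreased in B"
        else:
            status = "unchanged"
        report[feature] = status
    return report


# Hypothetical intensities for four features in two groups of three samples.
control = {"f1": [10, 12, 11], "f2": [50, 48, 52], "f3": [30, 31, 29], "f4": [5, 6, 4]}
treated = {"f1": [40, 42, 38], "f2": [20, 18, 22], "f3": [30, 32, 31], "f5": [7, 8, 9]}

report = compare_groups(control, treated)
for feature, status in report.items():
    print(feature, status)
```

Whether the groups differ as a whole is a multivariate question, which is what the PCA step later in the deck addresses.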
Direct Infusion of Urine into a Mass Spectrometer
UPLC Lipid Class Separation
PC, SM, PG, PE, DG
ChoE & TG
lyso-PC, lyso-PE
Hybrid High Resolution QTof MS
High Resolution MS of a TAG
Extracted Ion Chromatogram of m/z 874.7898
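An extracted ion chromatogram (XIC) is the intensity found in a narrow m/z window around a target value, plotted against retention time. The sketch below shows the idea on made-up scans; the scan layout, the 10 ppm window and the trapezoidal area are assumptions for illustration, not the instrument software's algorithm.

```python
def extract_ion_chromatogram(scans, target_mz, tol_ppm=10.0):
    """Sum the intensity within +/- tol_ppm of target_mz in each scan.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) points.
    """
    half_width = target_mz * tol_ppm / 1e6
    xic = []
    for rt, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= half_width)
        xic.append((rt, float(intensity)))
    return xic


def peak_area(xic):
    """Integrate the chromatogram with the trapezoidal rule."""
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(xic, xic[1:]):
        area += (rt1 - rt0) * (i0 + i1) / 2.0
    return area


# Hypothetical scans: retention times and intensities are invented.
scans = [
    (1.0, [(874.7898, 0.0), (806.5713, 5.0)]),
    (2.0, [(874.7901, 100.0), (806.5713, 5.0)]),
    (3.0, [(874.7895, 200.0), (875.7930, 90.0)]),
    (4.0, [(874.7898, 0.0)]),
]

xic = extract_ion_chromatogram(scans, target_mz=874.7898)
area = peak_area(xic)
print(xic)
print(area)
```

The area of such a trace is the quantity that a trend plot of XIC areas compares across samples.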
MarkerLynx PCA Analysis of Data: Transgenic Mice Study
PCA Loadings
Figure: Scores Comp[1] vs. Comp[2], colored by Modification subsets (WT, TGa, TGb); axes t[1] and t[2].
EZinfo 3 - Heart TG mouse 4_106.usp (M1: PCA-X) - 2010-10-03 13:15:14 (UTC-5)
PCA Scores
Figure: Loadings Comp[1] vs. Comp[2]; axes p[1] and p[2].
EZinfo 3 - Heart TG mouse 4_106.usp (M1: PCA-X) - 2010-10-03 13:52:36 (UTC-5)
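The scores (t) and loadings (p) in the plots above come from a principal component analysis of a samples-by-features intensity table. The following is a minimal pure-Python sketch of PCA by power iteration with deflation, on hypothetical data. It uses mean-centering only and is not how MarkerLynx or EZinfo compute their models; the function names and the data are assumptions.

```python
import math


def center_columns(X):
    """Subtract each column (feature) mean from its column."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def pca(X, n_components=2, iters=500):
    """Return (scores, loadings), one list per component.

    Assumes the data have at least n_components directions of non-zero variance.
    """
    R = center_columns(X)
    scores, loadings = [], []
    for _ in range(n_components):
        # Non-uniform start vector, so it is unlikely to be orthogonal
        # to the leading direction of the data.
        p = [float(j + 1) for j in range(len(R[0]))]
        for _ in range(iters):
            t = [sum(r * w for r, w in zip(row, p)) for row in R]          # t = R p
            p = [sum(ti * row[j] for ti, row in zip(t, R)) for j in range(len(p))]  # p = R^T t
            norm = math.sqrt(sum(w * w for w in p))
            p = [w / norm for w in p]
        t = [sum(r * w for r, w in zip(row, p)) for row in R]
        scores.append(t)
        loadings.append(p)
        # Deflate: remove the variance explained by this component.
        R = [[r - ti * w for r, w in zip(row, p)] for ti, row in zip(t, R)]
    return scores, loadings


# Hypothetical intensities: rows are samples, columns are three m/z_RT features.
X = [
    [10, 50, 30], [12, 48, 31], [11, 52, 29],   # group 1 (e.g. wild type)
    [40, 20, 30], [42, 18, 32], [38, 22, 31],   # group 2 (e.g. transgenic)
]

scores, loadings = pca(X)
print("t[1]:", [round(t, 1) for t in scores[0]])
print("p[1]:", [round(p, 2) for p in loadings[0]])
```

In this toy data the two groups fall on opposite sides of zero along t[1], and the features with the largest absolute p[1] values are the ones driving that separation. The sign of a component is arbitrary, so only the relative positions matter.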
m/z_RT Pairs Which Describe the Variance Between Transgenic and Wild Type Mice
Figure: Loadings Comp[1] vs. Comp[2]; axes p[1] and p[2]; labeled points at m/z 806.5713 and 874.7898.
EZinfo 3 - Heart TG mouse 4_106.usp (M1: PCA-X) - 2010-10-03 13:41:05 (UTC-5)
Trend Plot of Those m/z_RT Pairs: Y-axis is the Area of the XIC
Figure: Variable Trends colored by Modification subsets; x-axis: samples grouped by Modification subset (WT, TGb, TGa); y-axis: XIC area (0 to 650).
EZinfo 3 - Heart TG mouse 4_106.usp (M1: PCA-X) - 2010-10-20 09:47:43 (UTC-5)
Calculation of Elemental Composition & ChemSpider Search of Lipid Maps Database Performed via MarkerLynx
Results of the ChemSpider Search in the MarkerLynx Worksheet
Hit Details in ChemSpider
Hybrid High Resolution QTof MS
Lipid at RT 4.41 with m/z 806.5695: PC 16:0/22:6 (figure labels: 16:0, 22:6)
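The assignment above can be checked by comparing the measured m/z with the theoretical value for a candidate elemental composition. The sketch below assumes the ion is the protonated molecule [M+H]+ and that PC 16:0/22:6 has the formula C46H80NO8P; neither is stated on the slide, so both are labeled as assumptions in the code.

```python
# Monoisotopic masses of the most abundant isotopes, in unified atomic mass units.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
}
PROTON = 1.00727647


def monoisotopic_mass(composition):
    """Neutral monoisotopic mass for a dict of element -> atom count."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def mz_protonated(composition):
    """Theoretical m/z of the singly charged [M+H]+ ion."""
    return monoisotopic_mass(composition) + PROTON


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Assumption: PC 16:0/22:6 taken as C46H80NO8P, observed as [M+H]+.
pc_16_0_22_6 = {"C": 46, "H": 80, "N": 1, "O": 8, "P": 1}

theoretical_mz = mz_protonated(pc_16_0_22_6)
error_ppm = ppm_error(806.5695, theoretical_mz)
print(f"theoretical m/z: {theoretical_mz:.4f}")
print(f"mass error: {error_ppm:.2f} ppm")
```

Under those assumptions the measured value agrees with the theoretical one to well under 1 ppm, which is the kind of agreement an elemental composition calculation relies on before a database search.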
Phospholipids Elevated in Transgenic Mice
What Have We Done?
Turned “Data into Information”
Now to turn information into knowledge
We want to visualize our metabolite information, together with information from other “omics” techniques, using modern “pathway” tools (systems biology).
Why Do I Like ChemSpider?
It is open to the entire scientific community, not just a select few.
It relies on expertise from the scientific community for expansion, curation and improved functionality. Let us work together!
It is always with me…
Linking and Enabling
Clearly, mass spectrometry facilitates a deeper understanding of metabolomics
The ChemSpider database is a rich foundation for structure-based analysis
Community support through data deposition and curation is very enabling
Ideas for Future Work
Extend search capabilities
Expand existing databases
Integrate with metabolic pathway tools
Thank You!!!