TRANSCRIPT
Webinar Session 5
Metabolomics and Beyond: Challenges and Strategies for Next-Gen Omic Analyses
Dr. Dmitry Grapov
Data Scientist,
CDS- Creative Data Solutions and
Genome Data Analytics,
Monsanto, USA
[email protected]
Please note that the webinars are presently free, courtesy of the Metabolomics Society, and will be uploaded to the society's website. Please feel free to contact us with any questions or suggestions via [email protected]
Background
Born: Minsk, Belarus in 1981
University of Utah, Salt Lake City, UT (2000-2007)
• B.S. Biology
• B.S. Chemistry
University of California, Davis, CA (2007-2012)
• Ph.D. Analytical Chemistry with Emphasis in Biotechnology
• Post doc, Oliver Fiehn Lab
Interests:
• Omics, integromics, microbials and big biological data
• Multivariate data analysis and visualization, machine learning and software design
WCMC
• Principal Statistician at the NIH West Coast Metabolomics Center (WCMC)
Data Scientist
• CDS - Creative Data Solutions
• Genome Analytics, Monsanto, St. Louis, MO
Experience: Omic’ data analysis and visualization
Grapov et al., Circ. Cardiovasc. Genet. 2014
Network Analysis
Multivariate Modeling
Grapov et al., PLoS ONE (2014) doi:10.1371/journal.pone.0084260
J. Proteome Res., 2015, 14 (1), pp 557–566 DOI: 10.1021/pr500782g
Biomarker validation
• Metabolomics can offer real-time insight into treatment efficacy and drive personalized medicine decisions
Metabolomics: study of small molecules
Metabolome: a proxy for phenotype
Challenges for Next-gen Omic Analyses
• Large and complex studies
• Integration of multiple biochemical domains
• Interpretation of experimental results within a biological context
Large longitudinal studies may be required to identify small phenotypic and environmental effects
http://teddy.epi.usf.edu/TEDDY/
TEDDY: The Environmental Determinants of Type 1 Diabetes in the Young
Multi-omic longitudinal study involving >15,000 samples acquired over 3 years
Analytical batch effects can hide smaller biological effects
[Figure axis label: Time]
Data normalization strategies should be considered during experimental design
Analyte-specific data quality overview
Normalizations can be used to remove analytical variance
[Figure: Raw Data and Normalized Data panels; axes: log mean and %RSD; annotations: low precision, high precision]
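The precision measure on this slide, %RSD (relative standard deviation), is the standard deviation of replicate measurements divided by their mean, times 100. A minimal sketch of a per-analyte calculation follows. The analyte names and intensities are made up for illustration and are not data from the talk.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation in percent: 100 * sample SD / |mean|."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * stdev(values) / abs(m)

# Hypothetical replicate (QC) intensities for two analytes.
qc_intensities = {
    "analyte_A": [100.0, 102.0, 98.0, 101.0, 99.0],   # tight replicates
    "analyte_B": [100.0, 160.0, 40.0, 130.0, 70.0],   # noisy replicates
}

rsd_by_analyte = {name: percent_rsd(v) for name, v in qc_intensities.items()}

# List analytes from highest precision (lowest %RSD) to lowest precision.
for name, rsd in sorted(rsd_by_analyte.items(), key=lambda kv: kv[1]):
    print(f"{name}: %RSD = {rsd:.1f}")
```

Lower %RSD means higher precision, so ranking analytes this way before and after normalization gives a quick per-analyte quality overview.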
Data normalization may require a combination of approaches
Internal standard (ISTD) based normalization
[Figure: retention time of normalized compounds; number of analytes optimally normalized by each ISTD (qcISTD)]
qcISTD: ISTD selection optimized using analytical replicates (QCs)
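One way to read the qcISTD idea is: for each analyte, try every internal standard and keep the one whose analyte/ISTD ratio is most reproducible across the QC replicates. The sketch below assumes that interpretation; the slides do not spell out the selection rule. The function name `pick_istd`, the ISTD names and all numbers are hypothetical.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation in percent."""
    return 100.0 * stdev(values) / abs(mean(values))

def pick_istd(analyte_qc, istd_qc):
    """Return (istd_name, rsd) for the ISTD that minimizes the %RSD of
    analyte/ISTD ratios across QC injections (assumed selection rule)."""
    best_name, best_rsd = None, float("inf")
    for name, istd_values in istd_qc.items():
        ratios = [a / s for a, s in zip(analyte_qc, istd_values)]
        rsd = percent_rsd(ratios)
        if rsd < best_rsd:
            best_name, best_rsd = name, rsd
    return best_name, best_rsd

# Hypothetical QC injections: the analyte drifts upward over the run.
drift = [1.0, 1.1, 1.2, 1.3, 1.4]
analyte_qc = [50.0 * d for d in drift]
istd_qc = {
    "ISTD_1": [10.0 * d for d in drift],        # shares the analyte's drift
    "ISTD_2": [10.0, 9.5, 10.5, 9.0, 11.0],     # unrelated variation
}

raw_rsd = percent_rsd(analyte_qc)
best, rsd = pick_istd(analyte_qc, istd_qc)
print(f"raw %RSD = {raw_rsd:.1f}; best ISTD = {best}; normalized %RSD = {rsd:.2g}")
```

Because the choice is made on QCs only, no knowledge of the biological samples enters the selection.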
Internal standard (ISTD) based normalization may not fully remove analytical batch effects
Analytical replicate-based normalizations can be used to estimate and remove analytical variance
[Figure: Raw Data and Normalized Data panels; legend: Samples, QCs; LOESS fit]
Quality Control (QC) based normalization
Optimal method should use no sample knowledge
[Figure: across-batch performance and within-batch performance]
14,526 measurements of 443 variables acquired over 2 years
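A QC-based LOESS normalization can be sketched as follows: fit a smooth trend to the QC intensities as a function of injection order, then divide every injection by the trend value at its position. This is a simplified illustration and not the presenter's implementation. The smoother is a basic local linear regression with tricube weights and no robustness iterations, and the run layout, drift and intensity values are invented. The trend is estimated from QCs alone, so no sample knowledge is used.

```python
from statistics import mean, median, stdev

def loess_fit(x, y, x_new, frac=0.75):
    """Basic LOESS: local linear regression with tricube weights."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))     # neighbours per local fit
    fitted = []
    for x0 in x_new:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] * 1.001 + 1e-12          # bandwidth just past k-th neighbour
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def qc_loess_normalize(order, intensity, is_qc, frac=0.75):
    """Divide each injection by the QC-derived trend, rescaled to the QC median."""
    qc_x = [o for o, q in zip(order, is_qc) if q]
    qc_y = [v for v, q in zip(intensity, is_qc) if q]
    trend = loess_fit(qc_x, qc_y, order, frac)
    target = median(qc_y)
    return [v / t * target for v, t in zip(intensity, trend)]

def percent_rsd(values):
    return 100.0 * stdev(values) / abs(mean(values))

# Hypothetical run: 21 injections, a QC every 4th injection, linear signal drift.
order = list(range(1, 22))
is_qc = [(o - 1) % 4 == 0 for o in order]
level = [100.0 if q else 80.0 + 5.0 * (o % 3) for o, q in zip(order, is_qc)]
intensity = [lv * (1 + 0.03 * o) for lv, o in zip(level, order)]

normalized = qc_loess_normalize(order, intensity, is_qc)
rsd_before = percent_rsd([v for v, q in zip(intensity, is_qc) if q])
rsd_after = percent_rsd([v for v, q in zip(normalized, is_qc) if q])
print(f"QC %RSD before: {rsd_before:.1f}, after: {rsd_after:.2g}")
```

Note that the QC %RSD after normalization is computed on the same QCs used to fit the trend, so it is an optimistic estimate; held-out QCs give a fairer picture.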
Comparison of normalization methods
Raw (RSD ~75)
Normalized (RSD 25)
Normalizations need to be numerically and visually validated
Good
Bad: QCs don’t match samples
Bad: overtrained
Challenge: getting appropriate QCs and implementation of normalizations
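A simple numerical check for the "overtrained" case is to hold out some QCs when fitting the normalization and compare their %RSD with that of the QCs used for fitting. The sketch below assumes the values have already been normalized. The function name, the example numbers and the cut-off of 2.0 are hypothetical and not taken from the talk.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Relative standard deviation in percent."""
    return 100.0 * stdev(values) / abs(mean(values))

def overtraining_ratio(train_qc_normalized, test_qc_normalized):
    """Held-out QC %RSD divided by training QC %RSD.
    Values far above 1 suggest the normalization was fit too closely
    to the training QCs."""
    return percent_rsd(test_qc_normalized) / percent_rsd(train_qc_normalized)

# Hypothetical normalized intensities for one analyte.
train_qc = [100.0, 100.5, 99.5, 100.2, 99.8]    # QCs used to fit the normalization
test_qc = [100.0, 120.0, 85.0, 110.0, 90.0]     # QCs held out from the fit

ratio = overtraining_ratio(train_qc, test_qc)
flag = ratio > 2.0                               # hypothetical cut-off
print(f"held-out / training %RSD ratio = {ratio:.1f}; overtrained: {flag}")
```

A numerical check like this complements, but does not replace, visual inspection of whether the QCs track the samples.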
Identification of systems of changes requires integration of multiple analytical platforms
Am J Clin Nutr. 2015 Aug;102(2):433-43. doi: 10.3945/ajcn.114.103804. Epub 2015 Jul 8.
Modern metabolomic analyses often require combinations of multiple measurement platforms
American Journal of Physiology - Endocrinology and Metabolism, 2015. DOI: 10.1152/ajpendo.00019.2015
PMID:24204828
2009
~10% variance explained
Many diseases, including aging, have dominant metabolic components (e.g. metabolic syndrome)
Genotype + metabolome >40% variance explained
Type 2 Diabetes
Need for Integromics
Omic’ data integration strategies
Biomarker Insights 2015:Suppl. 4 1-6 DOI: 10.4137/BMI.S29511
Empirical correlation
Network based
Biochemical pathway
Pathway analysis
Metabolomic network analysis
MetaMapR: Metabolomic network calculation
http://dgrapov.github.io/MetaMapR/
• Biochemical reactions
• Structural similarity
• Mass spectral similarity
• Empirical relationships
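Two of the edge types listed above can be sketched in a few lines: empirical edges from pairwise Pearson correlation of abundances, and structural edges from Tanimoto similarity between substructure fingerprints. This is an illustration of the general idea only. MetaMapR is an R package and this is not its API. The function names, thresholds, abundances and fingerprint bit sets are all made up.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

def build_edges(abundance, fingerprints, r_cut=0.8, t_cut=0.7):
    """Return (node_a, node_b, edge_type, weight) tuples passing the cut-offs."""
    edges = []
    for a, b in combinations(sorted(abundance), 2):
        r = pearson(abundance[a], abundance[b])
        if abs(r) >= r_cut:
            edges.append((a, b, "empirical", round(r, 3)))
        t = tanimoto(fingerprints[a], fingerprints[b])
        if t >= t_cut:
            edges.append((a, b, "structural", round(t, 3)))
    return edges

# Hypothetical abundances across five samples.
abundance = {
    "glucose": [1.0, 2.0, 3.0, 4.0, 5.0],
    "fructose": [1.1, 2.0, 3.2, 3.9, 5.1],
    "alanine": [5.0, 1.0, 4.0, 2.0, 3.0],
}
# Made-up fingerprint bit positions (not real substructure keys).
fingerprints = {
    "glucose": {1, 2, 3, 4, 5},
    "fructose": {1, 2, 3, 4, 5, 6},
    "alanine": {7, 8, 9},
}

edges = build_edges(abundance, fingerprints)
for edge in edges:
    print(edge)
```

Keeping the edge type on each tuple lets a single network combine several relationship types and style them differently when drawn.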
MetaMapR: Network visualization
Omic’ network analysis
http://kwanjeeraw.github.io/grinn/
Network Mapping
Network + Mappings = Mapped Network
Grapov D., American Society for Mass Spectrometry Conference (2013, 2014)
DeviumWeb: Data analysis and visualization
https://github.com/dgrapov/DeviumWeb
DeviumWeb: Interactive visualization
DeviumWeb: Statistical Analysis
DeviumWeb: Cluster Analysis
DeviumWeb: Exploratory Analysis
DeviumWeb: Predictive Modeling
DeviumWeb: Pathway analysis
Thank you:
Metabolomics Society
Dr. Biswapriya Misra
Collaborators
Dr. Johannes Fahrmann
Dr. Kwanjeera Wanichthanarak
Dr. Oliver Fiehn
Dr. Suzanne Miyamoto
David Liesenfeld
More information: https://imdevsoftware.wordpress.com/
Software: https://github.com/dgrapov
Hire me: [email protected]