Download - Metabolomics

Metabolomics

Bob Ward

German Lab

Food Science and Technology

Genome-

….All the DNA

Transcriptome-

….All the mRNA

Proteome-

….All the proteins

Metabalome

….All the metabolites

“ ”

“Metabolomics is a post genomic technology which seeks to provide a

comprehensive profile to all the metabolites present in a biological sample.” (Taylor et. al, 2002)

Limitations of “ohmics” technologies

Genomics

Static picture Expensive Not for individuals

Transcriptomics

Need Genome (annotations) Correlated with proteome?

Sampling issues splicing No info on modifications

Proteomics

Technologically challenging Need genome?

Metabolome

• Same metabolites for all organisms

• ~1k for organism vs 10k(genes) or 100k(proteins)

• Technology exists and is not too expensive

• Carbohydrate and Lipid info

Goal: Discrimination between related genotypes of Arabidopsis

• between Co10 and C24 (parent strains)

• between Co10 x C24 and progeny (F1)

• between (Co10 x C24) and (C24 x Co10)-Maternal line donates both mitochondria and

chloroplast

-Clear-cut realization of effectiveness

-Potential to uncover biologically relevant info

Instrumental and Informatic Tools

• GC/MS-Separation/identification of polar metabolites in 1200 second run time

• AMDIS deconvoluting software

• MassLab to choose target ions

• R for statistics

• WEKA (standard neural network approach)

• Euclidean distance

• Principal Component Analysis

Data Work-Up

• Selection of reference chromatogram (F1)

• 8 individual samples for each genotype– no replicates

• Selection of target peaks/analytes (433)– normalized (mg analyte/wt sample)to internal

standard (ribitol)– Allows for simple 2-D matrix

201 metabolites identified in some detail

(92 as molecular type and 109 by chemical property)

High variance in low numbers corresponds to core metabolites

Co101-8

C249-16

Co10 x C2417-24

C24 x Co1025-32

Neural Network Analysis

}P=0.27

Lack of samples precluded use of a training subset

‘Leave one out cross’ used for training

Model judged by ability to classify remaining object (repeated for all objects)

Allows for maximal use of data for validation when n is low

Clustering by Euclidean distance

Co101-8

C249-16

Co10 x C2417-24

C24 x Co1025-32

Principal Component Analysis• Used to tease out role of individual metabolites in

discrimination• Unsupervised multivariate analysis applied to

functions of many attributes• Transformation of large set of related values to

smaller set of uncorrelated variables• Attempts to express maximum variance in data• PC’s are axes in multidimensional space• Object characterized by distance to axis

PCA algorithm from MatLab

78% of variation of data from first 3 PC’s

Variance of data explained by first few principal components

Principal Component Analysis

Co10 and C24 differentiated except outlier

F1 genotypes cluster together

Contribution of each variable to first PC

Malate and Citrate- metabolites of TCA cycle

Relative peak area for metabolites malate and citrate

Co10 contains outlier…..may explain misclassification

Other significant results

• Parental genotype removed from PCA analysis and F1’s discriminated by glucose and fructose

• Inference that the first PC differentiates parental line, and 2nd and 3rd differentiate F1

• Malate and Citrate from TCA, glucose and fructose from chloroplasts

Conclusions• Advances in technology will improve detection limits

and will allow characterization of metabolites

• Formalized ontology needed to link chemical structure with pathways

• Metabolite profiling is an exciting new field which complements other non-hypothesis driven global analysis technologies

• Large amounts of informatic support to develop field and to correlate data from genomics, microarrays, and proteomics

Download - Metabolomics

Top Related