Bioinformatics for Targeted Metabolomics: Met and Unmet Needs
Klaus M. Weinberger
Biocrates Life Sciences AG, Innsbruck, Austria
3rd Annual Forum for SMEs
Information Workshop on European Bioinformatics Resources
Vienna, September 3 – 4, 2009
Agenda
• Why (targeted) metabolomics?
• Proof-of-concept in routine clinical diagnostics
• Technology platform
• Workflow integration & data analysis
• Issues
• Acknowledgements
Socrates (470-399 BC): Intelligence, Wisdom
Hippocrates (460-377 BC): Medicine, Health
BIOCRATES
“Creating Knowledge for Health”
Metabolomics is...
... the systematic identification and quantitation of all / biologically relevant small molecules* in a given compartment, cell, tissue or body fluid.
It represents the functional end-point of physiological and pathophysiological processes, depicting both genetic predisposition and environmental influences like nutrition, exercise or medication.
* no biopolymers (nucleic acids, polypeptides)
Why (targeted) metabolomics?
Six systems biologists examining an elephant
DNA (2.5·10^4)
→ Transcription → RNA (~10^5)
→ Translation → Polypeptides (~10^6)
→ PTM → Proteins (~10^7)
→ Enzymatic activity, transport etc. → Metabolites (~10^4)
Why metabolomics?
• Functional end-point of physiology and pathophysiology
• Reasonable scale of the analytical challenge
• Direct mirror of environmental influences
  - (Mal-)nutrition
  - Exercise
  - Medication
Metabolomics approaches
Sample cohorts → Metabolic profiling (e.g. full scan LC-MS) → Differential pattern information
HPLC-ToF-MS of urine samples
[Figure: positive-mode ToF mass spectrum, 4.995 to 9.994 min, saturation correction applied, background subtracted (12.994 to 13.994 min); x-axis: m/z, amu (100 to 1500); y-axis: intensity, counts (max. 32.0)]
Sample: mouse urine, ID 0204029486 (3/8)
HPLC: Waters Atlantis dC18
Injection volume: 10 µl
Detection: pos. ToF-MS, m/z 100-1500
Mass accuracy: ~2 ppm
Data content: c. 2500 features per spectrum for statistical assessment
PCA of LC/MS profiling data
Candidate drug vs. Untreated Untreated vs. Rosiglitazone
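As an illustration of the kind of analysis behind such score plots, here is a minimal PCA sketch using NumPy. It is not the software used for the slides: the function name `pca_scores`, the synthetic intensity matrix and the simulated treatment effect are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # mean-center every feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)        # variance fraction per component
    return scores, explained[:n_components]

# Synthetic feature table (hypothetical): 10 untreated and 10 treated samples.
rng = np.random.default_rng(0)
n_features = 50
untreated = rng.normal(0.0, 1.0, size=(10, n_features))
treated = rng.normal(0.0, 1.0, size=(10, n_features))
treated[:, :5] += 4.0                            # assumed effect on five features

X = np.vstack([untreated, treated])
scores, explained = pca_scores(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by group, gives a score plot of the type shown on the slide; real LC/MS feature tables would normally be scaled or transformed first.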
Metabolomics approaches
Sample cohorts → Metabolic profiling (e.g. full scan LC-MS) → Differential pattern information → Identification of relevant metabolites
Sample cohorts → Targeted metabolomics (ID / quantitation by SID on MS/MS) → Metabolite concentration shifts → Functional annotation
Pathway mapping of quantitative Mx data
[Figure: urea cycle map; metabolites: Cit, Arg, Orn, Argsucc, Fum, Urea, Asp, Carb-P, NO; enzymes: NOS, ASL, ASS, ARG, OCT]
Areas of application
• Basic research
  - Functional genomics in biochemistry, physiology, cell biology, microbiology, ecology, …
• Agricultural & nutrition industry
  - Plant intermediary metabolism
  - Health effects of functional food products
• Biotechnology
  - Optimization and monitoring of fermentation processes
• Pharmaceutical R&D
  - Pathobiochemistry / characterization of disease models
  - Safety / toxicology
  - Efficacy / pharmacodynamics and mode-of-action
• Clinical diagnostics & theranostics
  - Early diagnosis and accurate staging
  - Specific monitoring of therapeutic effects
History and proof-of-concept in clinical diagnostics
Sir Archibald Edward Garrod
• 1857, London – 1936, Cambridge
• Educated in Marlborough, Oxford, and London
• Postgraduate studies at the AKH in Vienna in 1884/85
• Publications on chemical pathology (e.g. of alkaptonuria, cystinuria, pentosuria)
• One gene – one enzyme hypothesis
• Concept of inborn errors of metabolism (Croonian lectures to the Royal College of Physicians, 1908)
Proof-of-concept in neonatology
• Newborn screening for inborn metabolic disorders
• replaced expensive monoparametric assays
• simultaneous detection of 40 - 60 metabolites (amino acids, acylcarnitines)
• simultaneous diagnosis of 20 - 30 monogenic diseases (AA metabolism, FATMO) with immediate treatment options
• total incidence > 1:2000
• unprecedented sensitivity, specificity and positive predictive value (PPV)
• co-pioneered in the mid-90s by BIOCRATES founder Bert Roscher
• > 1,300,000 newborns screened in Munich
• similar labs worldwide
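Why specificity matters so much for screening at this incidence can be sketched with Bayes' rule. The prevalence of 1:2000 is taken from the slide; the sensitivity and specificity values below are hypothetical and chosen only to show the effect.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / all positives, per screened individual."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

prevalence = 1 / 2000                 # total incidence quoted on the slide

# Hypothetical assay characteristics (assumptions, not data from the talk).
ppv_lower_specificity = positive_predictive_value(0.99, 0.99, prevalence)
ppv_higher_specificity = positive_predictive_value(0.99, 0.9999, prevalence)
```

Under these assumed values the PPV stays below 5% at 99% specificity and rises above 80% at 99.99% specificity, which illustrates why a gain in specificity dominates the predictive value for rare diseases.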
Lessons from newborn screening
1) Quantitative tandem mass spectrometry (stable isotope dilution) is able to meet the most stringent quality criteria (precision, accuracy) for routine diagnostics
2) The concept of multiparametric biomarkers improving assay sensitivity and, particularly, specificity is valid for many monogenic (and multifactorial) diseases
3) MS-based diagnostics can save costs despite a wider analytical panel and improved diagnostic quality
Also true for therapeutic drug monitoring of immunosuppressants, antidepressants, antiretrovirals...
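The principle of quantitation by stable isotope dilution can be sketched as a single-point calculation: the analyte concentration follows from the peak-area ratio of the analyte to its isotope-labelled internal standard. The function name, the default response factor of 1 and the numbers are assumptions for illustration; validated assays typically add calibration curves and QC samples.

```python
def sid_concentration(area_analyte, area_internal_std, conc_internal_std,
                      response_factor=1.0):
    """Single-point stable isotope dilution estimate.

    Assumes analyte and labelled standard respond identically
    (response_factor = 1) unless a measured factor is supplied.
    """
    if area_internal_std <= 0:
        raise ValueError("internal standard peak area must be positive")
    ratio = area_analyte / area_internal_std
    return ratio * conc_internal_std / response_factor

# Hypothetical peak areas and a hypothetical 50 µM internal standard.
conc = sid_concentration(area_analyte=15000.0,
                         area_internal_std=30000.0,
                         conc_internal_std=50.0)
```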
Goals in clinical diagnostics
[Figure labels: Conventional diagnostics; Multiparametric diagnostics; genetic predisposition, healthy, latent, ill]
• Early diagnosis
• Prophylaxis instead of therapy
• Subtyping / Staging
• Therapeutic drug monitoring
• Phenotypic pharmacogenomics
• Individualized (and more cost-efficient) medicine
Technology, workflow integration & data analysis
Integrated technology platform
BioBank
• Clinical & experimental samples
• Diagnoses & lab data
LIMS/Database
Sample preparation
• Automated extraction and derivatization
• SPE
Analytics
• Separation (LC, GC)
• Quantitation (MRM, SID)
• QA/QC
BioInformatics
• Technical validation
• Statistical analysis
• Data visualization
• Biochemical interpretation
Workflow overview
Staging of diabetic and non-diabetic nephropathy by PCA-DA
MarkerView™
Identifying marker candidates: stage 3 vs. stage 5 kidney disease (loadings)
Increasing oxidative stress in progressing CKD
[Figure: Met-SO/Met ratio (y-axis, 0.000 to 0.030) for stage 3, stage 4 and stage 5]
• Oxidation of methionine is highly indicative of oxidative stress
• The ratio of Met-SO to Met is a quantitative measure for this biomarker
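The Met-SO/Met ratio is a derived parameter that can be computed per sample and then summarized per disease stage. The sketch below uses hypothetical concentrations; the function name and the sample values are assumptions, not data from the talk.

```python
from collections import defaultdict
from statistics import mean

def met_so_ratio(met_so, met):
    """Return the Met-SO/Met ratio for one sample (same concentration units)."""
    if met <= 0:
        raise ValueError("methionine concentration must be positive")
    return met_so / met

# Hypothetical (stage, Met-SO, Met) concentrations; not data from the talk.
samples = [(3, 0.20, 25.0), (3, 0.30, 24.0), (4, 0.36, 24.0), (5, 0.50, 20.0)]

ratios_by_stage = defaultdict(list)
for stage, met_so, met in samples:
    ratios_by_stage[stage].append(met_so_ratio(met_so, met))

# Mean ratio per CKD stage.
mean_ratio = {stage: mean(values) for stage, values in ratios_by_stage.items()}
```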
Decreasing ADMA secretion in progressing CKD
[Figure: Metabolite vs. eGFR, non-diabetic, w/o stage 5; x-axis: eGFR (0 to 100); y-axis: metabolite, ADMA (U) (0 to 60); linear fit: f(x) = 0.4995 x − 5.896, R² = 0.752]
• Regression analysis identifies the correlation of marker candidates with continuous (clinical) variables instead of discrete (= artificial) stages
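A regression of a marker candidate against a continuous clinical variable such as eGFR can be sketched with an ordinary least-squares fit. The helper name `linear_fit_r2` and the synthetic data are assumptions; the slope and intercept used to simulate the data only loosely mimic the fit reported on the slide.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Least-squares line y = slope * x + intercept and its R²."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)        # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)         # total sum of squares
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic example (hypothetical values, not the study data).
egfr = np.array([20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0])
rng = np.random.default_rng(1)
marker = 0.5 * egfr - 6.0 + rng.normal(0.0, 4.0, size=egfr.size)

slope, intercept, r2 = linear_fit_r2(egfr, marker)
```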
Orchestration of fatty acid oxidation
[Figure: membrane phospholipids (GPC, GPE, GPS, ...); lysophospholipids; free fatty acids / PUFAs: AA 20:4ω6, LA 18:2ω6, DHA 22:6ω3, EPA 20:5ω3; oxidation products: 9-HODE, 13-HODE, 12-HETE, 15-HETE, PGD2, PGE2, LTB4, TXB2; enzymes / agents: SPL2, LOX, COX, ROS]
Pathway visualization in KEGG (reference pathway)
Pathway visualization in KEGG (human)
Dynamic pathway visualization in MarkerIDQ
Exploring 'metabolic shells' around metabolites
Route finding between metabolites across pathways
Reactions vs. Reactant pairs!
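Shell exploration and route finding can both be sketched as breadth-first searches on a directed graph of reactant pairs. The example below is not the MarkerIDQ implementation; it uses the textbook urea cycle reactions (the metabolites and enzymes named on the pathway-mapping slide), and the function names are assumptions. Linking only substrate-product pairs, rather than every compound taking part in a reaction, is one reading of the slide's "Reactions vs. Reactant pairs!" note.

```python
from collections import deque

# Substrate -> product pairs of the urea cycle (textbook biochemistry).
REACTANT_PAIRS = [
    ("Orn", "Cit"), ("Carb-P", "Cit"),        # OCT
    ("Cit", "Argsucc"), ("Asp", "Argsucc"),   # ASS
    ("Argsucc", "Arg"), ("Argsucc", "Fum"),   # ASL
    ("Arg", "Orn"), ("Arg", "Urea"),          # ARG
    ("Arg", "Cit"), ("Arg", "NO"),            # NOS
]

def build_graph(pairs):
    """Directed adjacency map: metabolite -> set of direct products."""
    graph = {}
    for substrate, product in pairs:
        graph.setdefault(substrate, set()).add(product)
        graph.setdefault(product, set())
    return graph

def metabolic_shell(graph, start, radius):
    """Metabolites reachable from start within `radius` reaction steps."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue                          # do not expand beyond the shell
        for nxt in sorted(graph[node]):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return {m for m, d in distance.items() if d > 0}

def find_route(graph, start, goal):
    """Shortest route (fewest steps) from start to goal, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            route = []
            while node is not None:           # walk back to the start
                route.append(node)
                node = previous[node]
            return route[::-1]
        for nxt in sorted(graph[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

graph = build_graph(REACTANT_PAIRS)
route = find_route(graph, "Cit", "Urea")
shell = metabolic_shell(graph, "Cit", 2)
```

A production tool would search across many pathways and would have to handle reversible reactions and compartments, which this sketch ignores.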
Issues I: Databases
• Parallel / competing initiatives with incompatible / proprietary data formats
  - KEGG
  - MetaCyc, HumanCyc, etc.
  - Reactome
  - HMDB
  - OMIM
  - Lipidomics consortia
  - ...
• Compartmentalization not well depicted
• Incompleteness / generic entries (phospholipids, acylcarnitines, etc.)
• Lack of curation
• Lack of publication
Issues II: Standardization and normalization
• Standardization
  - Instrument vendors oppose common data formats
  - What meta-data to record?
  - No valid guidelines for quantitation of endogenous metabolites (FDA guidance was developed for xenobiotics)
  - Nomenclature vs. analytical reality (sum signals, isomers, etc.)
• Normalization
  - Absolute quantitation overcomes the need for analytical normalization
  - Role of sample types (plasma, CSF, urine, tissue homogenates, cell extracts, ...)
  - How can biological normalization work? Are there 'housekeeping metabolites'?
Issues III: Biostatistics
• Overfitting & correction
• Suitable clustering algorithms for multivariate data sets?
• Metabolites are not equivalent, independent variables
  - Analytical validity/variability are usually not considered
  - Often, groups of metabolites are synthesized or degraded by the same enzyme(s)
  - Consecutive reactions within a pathway/network depend on each other (flux analysis!)
  - How to incorporate this in biostatistics? Weighting? Derived parameters, ratios, etc.?
  - How to exploit this in (automated) plausibility checks?
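If "correction" is read as multiple-testing correction (an assumption; the slide does not name a method), one common choice is the Benjamini-Hochberg procedure for controlling the false discovery rate. The sketch below computes adjusted p-values with NumPy. Note that the procedure assumes independent or positively dependent tests, which is exactly what the slide questions for metabolites.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                          # ascending p-values
    ranked = p[order] * n / np.arange(1, n + 1)    # p * n / rank
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Hypothetical p-values for four metabolites.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```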
Summary I
• Metabolomics depicts the functional end-point of genetics and environment
• Targeted metabolomics data are analytically reproducible and allow immediate biochemical interpretation
• Proof-of-concept has been achieved in routine diagnostics of inborn errors of metabolism
• Many metabolic biomarkers are valid across species and enable translational research
• Comprehensive targeted metabolomics bridges the gap to open profiling approaches
Summary II: Success factors for biomarker development
Biomarker candidates → Validated biomarkers
• Validated quantitative assays
• Well-documented biobanking
• Patent strategy and experience
• Clinical & scientific experts
• Biochemical plausibility & understanding
• Solid multivariate biostatistics
• Diligent study design
• Selected partners
Acknowledgements
Bioinformatics: Daniel Andres, Olivier Lefèvre, Paolo Zaccaria, Florian Bichteler, Marc Breit, Manuel Gogl, Bernd Haas, Mattias Bair, Robert Eller, Hamza Ovacin, Gerd Lorünser, Yi Zao
Analytics: Stefanie Gstrein, Sascha Dammeier, Hai Pham Tuan, Cornelia Röhring, Therese Koal, Ali Alchalabi, Verena Forcher, Ines Unterwurzacher, Stefan Urban, Doreen Kirchberg, Ralf Bogumil, Patrizia Hofer, Lisa Körner, Peter Enoh
Statistics & Biochemistry: Ingrid Osprian, Marion Beier, Vera Neubauer, Oliver Lutz, Matthias Keller, Denise Sonntag, Hans-Peter Deigner, Ulrika Lundin
Admin, IT & BizDev: Brad Morie, Anton Grones, Ingrid Sandner, Doris Gigele, Georg Debus, Wolfgang Samsinger, Elgar Schnegg, Patricia Aschacher