Metabolomic Profiling in Drug Discovery: Understanding the Factors that Influence a Metabolomics Study and Strategies to Reduce Biochemical and Chemical Noise

Mark Sanders¹; Serhiy Hnatyshyn²; Don Robertson²; Michael Reily²; Thomas McClure¹; Michael Athanas³; Jessica Wang¹; Pengxiang Yang¹; David Peake¹

¹Thermo Fisher Scientific, San Jose, CA; ²Bristol Myers Squibb, Princeton, NJ; ³Vast Scientific, Boston, MA

Metabolomics in Drug Discovery

• Pattern recognition
  • “Good” profile versus “bad” profile
• Identification and quantitation of endogenous “markers”
  • Compound selection
  • Target effects – efficacy markers
  • Off-target effects – toxicity/liability markers
• Identification of markers provides mechanistic insights
  • Target validation
  • Mechanism of toxicity
• Early evaluation of potential clinical markers

Targeted Analysis

• Metabolite target analysis: analysis restricted to metabolites of an enzyme system that are known to be affected by a certain perturbation
• Metabolite profiling: analysis focused on a class of compounds associated with a particular pathway (e.g. nucleoside triphosphates, lipids, steroids)
• Only find what you are looking for

Untargeted Analysis

• A comprehensive analysis of all metabolites
• A measure of the fingerprint of biochemical perturbations
• Useful when you don’t know what to expect
• Hypothesis generation

Metabolomics Analysis

• Goals
  • Quantitative assessment of the biochemical makeup of the samples
  • Differential analysis between sample groups
  • Identify compounds responsible for changes
• Challenges
  • Complexity of a biological sample
  • Diversity of small molecule metabolites
  • Wide range of metabolite concentration
  • Multiple sources of variability
  • Incomplete information – majority of components seen by LC/MS are unknowns
  • Structure elucidation of unknowns is expensive

Need sophisticated data reduction tools and strategies to minimize “noise”

Sources of Noise in a Metabolomics Study

• Instrumental
  • Mass and retention time stability
  • Robust and stable detector response
  • Sufficient resolution to resolve isobaric interference
• Chemical (Data Processing)
  • Background from column/solvents
  • Multiple signals per compound
  • Setting the threshold
• Biological
  • Different response rates to a stimulus between individuals
  • Stress status
  • Feeding status
  • Other health factors
• Study Design
  • Proper controls and randomized sampling/analysis to minimize systematic errors
  • Sampling, sample preparation and storage
• Statistical Analysis
  • Limited sampling, overfitting data

Instrumentation

Q Exactive: Benchtop Quadrupole Orbitrap

• Quadrupole mass filter
  • Quadrupole: hyperbolic rods
  • Isolation down to 0.4 amu
• HCD collision cell
  • Analogous to LTQ Orbitrap Velos
• Precursor ion selection for SIM and MS/MS functionality
• Higher scan speed
• S-lens
  • Stacked Ring Ion Guide
  • Analogous to LTQ Velos
  • Shorter inject times
• Parallel Processing
  • Ions collected in C-trap while Orbitrap is scanning
• Advanced Signal Processing
  • Improved resolution
  • Faster acquisition speed

Q Exactive: Speed

[Figure: a chromatographic peak (1.75 to 1.95 min, relative abundance 0 to 100) acquired at two resolution settings.]

• Resolution setting 35,000: peak width (FWHM) ~ 1 sec, scans/peak = 21
• Resolution setting 70,000: peak width (FWHM) ~ 1 sec, scans/peak = 11

Q Exactive: Resolution Setting - 70,000

[Figure: mass spectrum of C17H27O3S (m/z 311.0 to 313.5) with peaks at 311.1689, 312.1715 and 313.1641. Expanded view (m/z 313.14 to 313.20): the 34S isotope peak at 313.1641 is resolved from the 13C2 isotope peak at 313.1741. Calculated spectrum at 35,000 resolution (m/z 313.10 to 313.25): a single merged peak at 313.1669.]
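The 34S versus 13C2 example can be checked with a little arithmetic. The sketch below estimates the nominal resolving power (m/Δm) needed to separate the peaks at m/z 313.1641 and 313.1741 and compares it with the resolution available at that m/z. It assumes, as a rough rule of thumb that is not stated on the slide, that the resolution setting is specified at m/z 200 and falls off with the square root of m/z. The function names are illustrative.

```python
import math


def required_resolving_power(mz_a: float, mz_b: float) -> float:
    """Nominal m/dm for two peaks separated by dm = |mz_a - mz_b|."""
    return ((mz_a + mz_b) / 2.0) / abs(mz_a - mz_b)


def orbitrap_resolution_at(mz: float, setting: float, ref_mz: float = 200.0) -> float:
    """Approximate resolution at `mz` for a setting defined at `ref_mz`.

    Assumption (not from the slides): resolution scales with sqrt(ref_mz / mz).
    """
    return setting * math.sqrt(ref_mz / mz)


needed = required_resolving_power(313.1641, 313.1741)   # 34S vs 13C2 peaks
at_35k = orbitrap_resolution_at(313.17, 35_000)
at_70k = orbitrap_resolution_at(313.17, 70_000)

print(f"needed:      {needed:,.0f}")
print(f"35k setting: {at_35k:,.0f} at m/z 313")
print(f"70k setting: {at_70k:,.0f} at m/z 313")
```

Under these assumptions the 35,000 setting delivers roughly 28,000 at m/z 313, short of the roughly 31,000 needed, while the 70,000 setting delivers roughly 56,000. That is consistent with the single merged peak at 313.1669 in the calculated 35,000 spectrum.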

Q Exactive: Mass and Response Stability
D5-hippuric acid, external calibration, resolution = 82,000

[Figure: chromatograms (XIC 185.0969 ± 5 ppm, 4.15 to 4.40 min) and mass spectra (m/z 185.5 to 186.0) from four injections at 7:51pm, 11:14pm, 3:32am and 8:08am. Retention time 4.27 to 4.28 min, FWHM = 1.86 sec, CV = 2.4%. Spectra show m/z 185.0968 to 185.0969 and 186.1002 to 186.1003. Labeled mass errors: -0.45, -0.72, -0.24, -0.72, -0.04, -0.31, 0.39 and -0.47 ppm. Last injection annotated "Ext.Cal + 65.13 hrs".]
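The mass errors and the ± 5 ppm extraction window on the stability slide are both parts-per-million calculations. The sketch below shows the arithmetic. It treats the XIC center, m/z 185.0969, as the theoretical value, which is an assumption, and the function names are illustrative. Because the slide labels are rounded to four decimals, the result will not reproduce the labeled ppm values exactly.

```python
def ppm_error(measured: float, theoretical: float) -> float:
    """Mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def xic_window(center_mz: float, tol_ppm: float) -> tuple[float, float]:
    """Lower and upper m/z bounds of an extracted ion chromatogram window."""
    half_width = center_mz * tol_ppm * 1e-6
    return center_mz - half_width, center_mz + half_width


THEORETICAL_MZ = 185.0969  # assumed: the XIC center given on the slide

error = ppm_error(185.0968, THEORETICAL_MZ)
low, high = xic_window(THEORETICAL_MZ, 5.0)

print(f"error: {error:.2f} ppm")
print(f"XIC window: {low:.4f} to {high:.4f}")
```

A 0.0001 change in m/z at this mass is about half a ppm, so sub-ppm errors over many hours require four-decimal stability without recalibration.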

Q Exactive: High Sensitivity Quantitation
Testosterone 10 pg/mL in Serum

[Figure: calibration curve, area ratio (0 to 4.8) versus concentration (0 to 550 pg/mL).]

Standard (pg/mL)    % Difference
10                   0.97
20                   7.45
50                  -5.78
100                 -0.29
250                 -5.35
500                  2.99

Data Processing

Anatomy of a UHPLC/Orbitrap Data Set

[Figure: full-scan mass spectrum (m/z 100 to 1000, [M+H]+ labeled, annotation "m/z window = 1 Da") with an expanded view of m/z 853.0 to 855.0 showing peaks at 852.9720, 853.4727, 853.9745 and 854.4817, assigned z = +2.]

• >1,000,000 data points
• ~100,000 extracted ion peaks
• Peak areas span ~7 orders of magnitude
• Much irrelevant data
• Much redundant data
• High quality data from the Orbitrap allows for more precise automated data processing

Need to be able to reduce the data to chemical entities

Charge state distribution (pie chart; legend +1, +2, +3, ≥+4): z = 1 (73%), z = 2 (12%), z = 3 (10%), other (5%)

Adduct              % Assignments
[M+H]+              100
[M+Na]+             12.1
[M-H2O+H]+          8.3
[2M+H]+             4.7
[M+NH4]+            3.8
[2M+Na]+            3.1
[M+K]+              2.7
[M-2(H2O)+H]+       2.5
[M+CH3CN+H]+        2.1
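The z = +2 assignment for the peaks at m/z 852.9720, 853.4727, 853.9745 and 854.4817 follows from the isotope spacing: neighboring isotope peaks sit about 1.00335/z apart. The sketch below is one simple way to automate that step. It is not the vendor software's algorithm, and the function name is illustrative.

```python
C13_C12_DIFF = 1.003355  # mass difference between 13C and 12C, in Da


def charge_from_isotope_spacing(isotope_mzs: list[float]) -> int:
    """Estimate charge from the mean spacing of consecutive isotope peaks."""
    peaks = sorted(isotope_mzs)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DIFF / mean_spacing)


doubly_charged = charge_from_isotope_spacing([852.9720, 853.4727, 853.9745, 854.4817])
singly_charged = charge_from_isotope_spacing([524.3703, 525.3730, 526.3756])

print(doubly_charged, singly_charged)
```

Once the charge is known, all isotope peaks of an ion can be collapsed into one entry, which is a first step in reducing ~100,000 extracted ion peaks to chemical entities.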

Background Subtraction

Analyte signals = Sample - Solvent blank

~98% of lower intensity signals are eliminated
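A minimal sketch of blank subtraction, assuming features have already been matched between the sample and the solvent blank and are keyed by (m/z, retention time). The `min_ratio` threshold and all intensities below are made up for illustration; real data would need tolerance-based matching rather than exact keys.

```python
Feature = tuple[float, float]  # (m/z, retention time in min)


def subtract_blank(
    sample: dict[Feature, float],
    blank: dict[Feature, float],
    min_ratio: float = 3.0,
) -> dict[Feature, float]:
    """Keep features whose sample intensity exceeds min_ratio x the blank.

    The returned intensity is sample minus blank. `min_ratio` is an assumed
    threshold, not a value from the slides.
    """
    kept = {}
    for feature, intensity in sample.items():
        background = blank.get(feature, 0.0)
        if intensity > min_ratio * background:
            kept[feature] = intensity - background
    return kept


# Hypothetical intensities: one analyte and two background ions.
sample = {(180.0652, 3.37): 4.0e7, (149.0233, 5.10): 2.0e5, (391.2843, 9.80): 8.0e4}
blank = {(149.0233, 5.10): 1.9e5, (391.2843, 9.80): 7.5e4}

analyte_signals = subtract_blank(sample, blank)
print(analyte_signals)
```

Using a ratio test before subtracting avoids keeping background ions that differ from the blank only by injection-to-injection scatter.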

Spectral Interpretation: Hippuric Acid (Rat Urine)

[Figure: mass spectrum (m/z 100 to 600) of hippuric acid with labeled ions at m/z 180.07 ([M+H]+), 202.05 ([M+Na]+), 359.12 ([2M+H]+), 413.04 and 576.13, and extracted ion chromatograms (2 to 5 min, peak at 3.37 min) for the ions listed below; intensity scales 4e7, 3e5, 1e6 and 9e5.]

• m/z 180.0652 [M+H]+
• m/z 413.0427 [2M+Fe-H]+
• m/z 576.1277 [3M+Ca-H]+
• m/z 591.0930 [3M+Fe-2H]+

ESI+: 12 related ions; ESI-: 24 related ions
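The ion assignments for hippuric acid can be sanity-checked by computing the expected m/z of each species from the neutral formula C9H9NO3. The sketch below uses standard monoisotopic masses rounded to six decimals. The table layout and function names are illustrative and are not how any particular software stores adducts.

```python
# Monoisotopic masses in Da (standard values, rounded to 6 decimals).
MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915,
        "Na": 22.989769, "Ca": 39.962591, "Fe": 55.934936}
ELECTRON = 0.000549


def formula_mass(formula: dict[str, int]) -> float:
    """Sum of monoisotopic atomic masses for an element-count mapping."""
    return sum(MASS[element] * count for element, count in formula.items())


M = formula_mass({"C": 9, "H": 9, "N": 1, "O": 3})  # hippuric acid, neutral

# Singly charged ions: (copies of M, atoms added, atoms removed).
IONS = {
    "[M+H]+": (1, {"H": 1}, {}),
    "[M+Na]+": (1, {"Na": 1}, {}),
    "[2M+H]+": (2, {"H": 1}, {}),
    "[2M+Fe-H]+": (2, {"Fe": 1}, {"H": 1}),
    "[3M+Ca-H]+": (3, {"Ca": 1}, {"H": 1}),
    "[3M+Fe-2H]+": (3, {"Fe": 1}, {"H": 2}),
}


def ion_mz(copies: int, added: dict[str, int], removed: dict[str, int]) -> float:
    """m/z of a +1 ion: total atomic mass minus one electron."""
    return copies * M + formula_mass(added) - formula_mass(removed) - ELECTRON


expected = {name: ion_mz(*spec) for name, spec in IONS.items()}
for name, mz in expected.items():
    print(f"{name:12s} {mz:.4f}")
```

The computed values agree with the labeled m/z values to within about 2 mDa, which supports treating the metal-containing ions as the same compound rather than as separate unknowns.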

Spectral Interpretation: Fe Isotope Pattern Detected (Rat Urine)

[Figure: the hippuric acid spectrum and extracted ion chromatograms from the previous slide, with an expanded view of m/z 590 to 594 comparing the measured isotope pattern with the theoretical pattern for C27H25N3O9Fe.]

Measured     Theoretical
589.0975     589.0981
591.0929     591.0935
592.0975     592.0965
593.1014     593.0987

Peaks are also labeled near m/z 590 (590.1034, 590.1013) and m/z 594 (594.0981, 594.1010).

Varying Response with Different Ion Species (Rat Urine)

Peak areas, M+H ions of Trp and Phe:

        Trp         Phe
        415,983     2,574,163
        420,085     2,614,732
        410,093     2,494,727
        427,342     2,479,608
        423,358     2,448,543
        416,844     2,439,600
Mean    418,951     2,508,562
SD      6,047       70,659
CV      1.4%        2.8%

Peak areas, ion species of Hippuric Acid:

        M+H            [2M+Fe-H]+    [3M+Ca]+      [3M+Fe-2H]+
        122,738,814    869,212       2,576,598     527,298
        119,451,097    824,794       2,471,863     499,852
        117,092,066    689,807       2,234,709     456,582
        115,057,559    623,152       2,167,836     432,552
        115,387,079    573,090       2,138,694     417,703
        117,957,232    560,476       2,157,101     409,896
Mean    117,947,308    690,089       2,291,134     457,314
SD      2,858,562      130,537       186,409       47,198
CV      2.4%           19%           8%            10%
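The summary rows under the hippuric acid peak areas are consistent with a mean, a sample standard deviation and their ratio (CV) per ion species. The sketch below recomputes them from the six replicate areas in the source. The row interpretation is inferred from the numbers, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Replicate peak areas for hippuric acid ion species (values from the slide).
HIPPURIC_ACID_AREAS = {
    "M+H": [122_738_814, 119_451_097, 117_092_066, 115_057_559, 115_387_079, 117_957_232],
    "[2M+Fe-H]+": [869_212, 824_794, 689_807, 623_152, 573_090, 560_476],
    "[3M+Ca]+": [2_576_598, 2_471_863, 2_234_709, 2_167_836, 2_138_694, 2_157_101],
    "[3M+Fe-2H]+": [527_298, 499_852, 456_582, 432_552, 417_703, 409_896],
}


def summarize(areas: list[int]) -> tuple[float, float, float]:
    """Return (mean, sample standard deviation, CV in percent)."""
    m = mean(areas)
    s = stdev(areas)  # sample SD (n - 1 in the denominator)
    return m, s, 100.0 * s / m


summary = {ion: summarize(areas) for ion, areas in HIPPURIC_ACID_AREAS.items()}
for ion, (m, s, cv) in summary.items():
    print(f"{ion:12s} mean={m:>13,.0f}  SD={s:>11,.0f}  CV={cv:.1f}%")

least_reproducible = max(summary, key=lambda ion: summary[ion][2])
```

The protonated molecule is far more reproducible than the metal adducts, which is why quantifying on the wrong ion species adds avoidable noise.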


Importance of Spectral Interpretation

[Figure: extracted ion chromatograms (2.22 to 2.40 min) for m/z 593.2815 and for the component ion 297.1443, in panels labeled Dosed and Control (intensity scales 5e7 and 5e5), with a mass spectrum (m/z 100 to 1000) showing 297.1443 [M+H]+, 593.2815 [2M+H]+ and 220.1176.]

Removing Noise from the Statistics

[Figure: two statistical plots, one built from m/z peaks and one built from components, with samples labeled Fasted, Fed, Female and Male. Annotations: no group separation; large intra-group variability.]

SIEVE Analysis Platform

• Statistically rigorous automated label-free LC/MS differential analysis platform
• Applied to: peptide, protein, small molecule data

Workflow: raw files (State 1, State 2, State …) → Align → Detect → Identify

Reports:
• Components
• Identification
• Relative Quantitation
• Statistical Analysis
• Trend information

SIEVE Workflow – Alignment

[Figure: wine samples; unaligned base peaks, aligned base peaks, and alignment scores.]

Component Detection

Adducts, fragments and multimers:

Ion        m/z         z    Intensity    Relative
[M+H]+     524.3703    1    4.2E+08      100%
[M+Na]+    546.3517    1    1.0E+08      24.6%
[M+K]+     562.3232    1    1.1E+06      0.3%

Mass differences labeled in the figure: 21.9816, 37.9554

Isotopic peaks of [M+H]+:

A+1    525.3730    1.2E+08    28.9%
A+2    526.3756    2.3E+07    5.5%
A+3    527.3784    3.0E+06    0.7%
A+4    528.3811    3.9E+05    0.1%

Isotopic peaks of [M+Na]+:

A+1    547.3535    2.9E+07    27.8%
A+2    548.3577    5.6E+06    5.4%
A+3    549.3595    9.0E+05    0.9%

Constituents are represented by base component
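One simple way to group adducts under a base component is to look for co-eluting peaks whose m/z differs from the base ion by a known adduct shift. The sketch below applies that idea to the [M+H]+ ion at m/z 524.3703. It is an illustration of the principle only, not the SIEVE algorithm, and the tolerance and function name are assumptions.

```python
# Expected m/z shifts relative to [M+H]+ (Da), from standard monoisotopic masses:
# Na - H = 21.981944, K - H = 37.955882.
ADDUCT_SHIFTS = {"[M+Na]+": 21.981944, "[M+K]+": 37.955882}


def annotate_adducts(
    base_mz: float, peak_mzs: list[float], tol_da: float = 0.005
) -> dict[float, str]:
    """Label peaks whose distance from the base ion matches an adduct shift.

    `tol_da` is an assumed absolute tolerance; peaks are assumed to co-elute.
    """
    labels = {}
    for mz in peak_mzs:
        for name, shift in ADDUCT_SHIFTS.items():
            if abs((mz - base_mz) - shift) <= tol_da:
                labels[mz] = name
    return labels


co_eluting = [525.3730, 546.3517, 547.3535, 562.3232]
labels = annotate_adducts(524.3703, co_eluting)
print(labels)
```

Isotope peaks such as 525.3730 and 547.3535 are left unlabeled here; they would be folded into their parent ion by a separate isotope-grouping step.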

Accurate Mass Identification

Component MW → search of a local database or the ChemSpider web service → list of candidates

MolWt       Name
290.079     L-Epicatechin
306.074     Epigallocatechin
314.01      D-glycoside of vanillin
380.1254    Vellokaempferol 3-5-dimethyl ether
382.1047    Velloquercetin 4'-methyl ether
426.0945    Epigallocatechin 3-O-(4-hydroxybenzoate)
436.1153    Epigallocatechin 3-O-cinnamate
450.0793    Quercetin 4'-galactoside
468.1051    Epigallocatechin 3-O-caffeate
472.1       Epigallocatechin 3-O-(3-O-methylgallate)
477.1266    Isorhamnetin 7-alpha-D-Glucosamine;Quercetin 3'-methyl ether 7-alpha-D-Glucosam
478.0742    …
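The candidate list comes from matching a component's molecular weight against database entries within a mass tolerance. The sketch below shows that lookup against a small in-memory list built from three of the entries above. The 5 ppm tolerance, the query masses and the function name are assumptions for illustration; it does not call ChemSpider.

```python
# A few (MolWt, name) entries taken from the candidate list in the source.
LOCAL_DB = [
    (290.079, "L-Epicatechin"),
    (306.074, "Epigallocatechin"),
    (436.1153, "Epigallocatechin 3-O-cinnamate"),
]


def search_by_mass(
    component_mw: float, database: list[tuple[float, str]], tol_ppm: float = 5.0
) -> list[str]:
    """Return names of entries within tol_ppm of the measured component MW."""
    return [
        name
        for mol_wt, name in database
        if abs(component_mw - mol_wt) / mol_wt * 1e6 <= tol_ppm
    ]


# Hypothetical measured component masses.
print(search_by_mass(290.0785, LOCAL_DB))
print(search_by_mass(300.0000, LOCAL_DB))
```

A tight tolerance keeps the candidate list short, but accurate mass alone cannot distinguish isomers, so candidates still need confirmation.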

Biology

Rat Fasting Study

• Study designed to monitor the effect of fasting on metabolic profiles

Group           Male    Fasting Time (1)
1: 1101-1105    5       Dark Cycle Control (no fast)
2: 2101-2105    5       2 hr Fast
3: 3101-3105    5       4 hr Fast
4: 4101-4105    5       8 hr Fast
5: 5101-5105    5       12 hr Fast
6: 6101-6105    5       16 hr Fast

(1) Rats fasted during a 6 p.m. to 6 a.m. dark cycle to capture peak feeding time

• Samples: 50 µL serum, precipitated with cold MeOH
• MS: Q Exactive @ 70K resolution, ESI+ and ESI-
• UHPLC: Accela 1250
• Column: Hypersil GOLD aQ 2.1 x 150 mm, 1.9 µm @ 600 µL/min, 50 ºC
• Buffers: A: 0.1% formic acid in H2O; B: 0.1% formic acid in 98:2 ACN:H2O

Pooled Quality Controls

[Figure: schematic of a pooled QC prepared from the treated and control samples (Sample 1, Sample 2, Sample 3, ... Sample 52), with a plot labeled "Component of Interest" comparing Pooled QC, Treated and Control.]

*Sangster et al., Analyst, 2006, 131, 1075-1078

Sample       Inj. #    IS            citrulline    Tyr           Phe           Trp           273.1479_1.07
pooled QC    3         20,903,851    969,474       18,350,003    19,904,399    20,685,704    10,918,321
pooled QC    14        22,076,315    1,041,539     20,227,547    20,984,429    22,968,636    9,599,500
pooled QC    25        22,088,562    1,182,143     20,853,789    21,040,901    23,310,086    9,010,457
pooled QC    36        22,052,324    1,205,426     20,390,553    21,477,887    23,583,964    8,213,523
pooled QC    47        22,042,181    1,153,795     21,417,740    22,061,286    23,215,235    6,456,432
pooled QC    53        22,778,779    1,244,100     21,862,115    21,822,323    23,765,745    3,499,156
CV                     3%            9%            6%            4%            5%            33%
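Repeated injections of a pooled QC make it possible to flag components that are not measured reproducibly across the run. The sketch below computes the CV for three of the pooled QC columns from the source and flags any component above a cutoff. The 30% cutoff is an assumption chosen for illustration, not a value from the slides, and the function names are hypothetical.

```python
from statistics import mean, stdev

# Pooled QC peak areas across six injections (values from the slide).
POOLED_QC_AREAS = {
    "IS": [20_903_851, 22_076_315, 22_088_562, 22_052_324, 22_042_181, 22_778_779],
    "citrulline": [969_474, 1_041_539, 1_182_143, 1_205_426, 1_153_795, 1_244_100],
    "273.1479_1.07": [10_918_321, 9_599_500, 9_010_457, 8_213_523, 6_456_432, 3_499_156],
}


def cv_percent(areas: list[int]) -> float:
    """Coefficient of variation in percent, using the sample SD."""
    return 100.0 * stdev(areas) / mean(areas)


def unstable_components(qc_areas: dict[str, list[int]], max_cv: float = 30.0) -> list[str]:
    """Names of components whose pooled QC CV exceeds max_cv (assumed cutoff)."""
    return [name for name, areas in qc_areas.items() if cv_percent(areas) > max_cv]


flagged = unstable_components(POOLED_QC_AREAS)
print(flagged)
```

The flagged component, 273.1479_1.07, falls steadily from the first to the last QC injection, so a difference between study groups for that component could reflect run order rather than biology.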

Finding the Differences
PCA, rat plasma negative mode

[Figure: PCA plot with groups labeled Control - Fed, 4hr, 12hr, 16hr and Pooled Controls.]

Examples of Metabolite Changes on Fasting

[Figure: four bar charts of peak area by group (QC, Blank, DC, 2h, 4h, 8h, 12h) for Methionine, Arachidonic Acid, Linoleoyl-lyso-PC (18:2) and Proline.]

Overall Method Robustness
Uric Acid: Positive and Negative Data

[Figure: bar charts of uric acid peak area by group (QC, Blank, DC, 2h, 4h, 8h, 12h) in negative ion and positive ion modes. Time between analyses: 30 hrs.]

Biological Variability: Sometimes Unavoidable

[Figure: three bar charts of peak area (0 to 6,000,000) by group (QC, Blank, DC, 2h, 4h, 8h, 12h).]

Study Findings

• Fasting has a profound impact on metabolomic profiles

• Most metabolic changes are modest in extent

• Fasting status may exacerbate or obscure drug-induced metabolic effects

• Fasting data help contextualize drug-induced changes in many metabolites

• As part of the study design, fasting is neither “right” nor “wrong”, but it is a significant variable in model design

Summary

• Metabolomics is very challenging. It is fraught with numerous sources of noise and the cost of going down the wrong path is high
• Instrumentation
  • Needs to be precise and robust – good quality in, good quality out
  • Q Exactive provides an ideal platform
    • Excellent mass accuracy with external calibration
    • Ultra high resolution without loss of sensitivity
    • High performance quantitation
    • Discovery and validation on the same platform
• Chemical Noise (Data Processing)
  • The right software and the right controls can make all the difference
  • Intelligent data reduction tools can significantly reduce noise
• Biological Noise
  • Needs to be understood through systematic studies
  • Metabolomic prescreening can identify biological outliers
  • Ensure homogeneity within the study

Comparison of Palm Oil Samples

[Figure: chromatograms (0 to 20 min, relative abundance 0 to 100) of Control and Adulterated palm oil, intensity scales 2.68E10 and 2.69E10. Labeled peaks in one trace: 1.53, 2.56, 2.86, 3.21, 3.28, 4.89, 4.98, 5.73, 7.87, 9.31, 11.29, 12.89, 13.56 and 15.27 min. Labeled peaks in the other trace: 1.52, 2.56, 2.85, 3.20, 4.90, 5.01, 5.66, 7.89, 9.34, 11.43, 13.37 and 14.87 min.]

[Figure: expanded view of the same Control and Adulterated chromatograms (0 to 7 min, relative abundance 0 to 50), with labeled peaks including 1.53/1.52, 2.56, 4.89/4.90, 4.98/5.01 and 5.73/5.66 min.]

Sieve for Differential Analysis

An easy-to-use wizard walks you through the process and parameters of a differential analysis and unknown identification

Thanks!

• Serhiy Hnatyshyn

• Michael Reily

• Don Robertson

• Jessica Wang

• Pengxiang Yang

• Michael Athanas

• Thomas McClure

• David Peake

• Kate Comstock

• Yingying Huang

• Patrick Bennett

• Markus Kellmann

• Catharina Crone

• Thomas Moehring

• Alexander Makarov

• Eugen Democ

• Frank Czemper

• Sebastian Kanngiesser

• Andreas Wieghaus