Chemical Impacts of the Microbiome Across Scales Reveal Novel Conjugated Bile Acids
¹,²Robert A. Quinn, ³Alison Vrbanac, ¹Alexey V. Melnik, ³Kathryn A. Patras, ¹Mitchell Christy,
¹Andrew T. Nelson, ¹Alexander Aksenov, ¹,³Anupriya Tripathi, ³Greg Humphrey, ¹Ricardo da
Silva, ⁴Robert Bussell, ⁵Taren Thron, ¹Mingxun Wang, ¹Fernando Vargas, ¹Julia M. Gauglitz,
¹Michael J. Meehan, ³Orit Poulsen, ⁶Brigid S. Boland, ⁶John T. Chang, ⁶William J. Sandborn,
³Meerana Lim, ⁷,⁸Neha Garg, ⁹Julie Lumeng, ¹⁰Barbara I. Kazmierczak, ¹⁰Ruchi Jain, ¹⁰Marie
Egan, ³Kyung E. Rhee, ³Gabriel G. Haddad, ¹Dionicio Siegel, ⁵Sarkis Mazmanian, ¹,²,³Victor
Nizet, ²,³,¹²Rob Knight and ¹,²Pieter C. Dorrestein
¹Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA
²Center for Microbiome Innovation, University of California San Diego, La Jolla, CA
³Department of Pediatrics, University of California San Diego, La Jolla, CA
⁴Department of Radiology, University of California San Diego, La Jolla, CA
⁵Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA
⁶Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla, CA
⁷School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA
⁸Emory-Children’s Cystic Fibrosis Center, Atlanta, GA
⁹Department of Pediatrics, University of Michigan, Ann Arbor, MI
¹⁰Department of Internal Medicine, Yale School of Medicine, New Haven, CT
¹¹Department of Pediatrics, Yale School of Medicine, New Haven, CT
¹²Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA

Contributions
PCD, RK and RQ designed the project.
RQ, AA, AM, FV, JG, NG, AT, MC, ATN, MM, GH, RdS, and RB generated data.
RQ, AV, AT and MC analyzed data.
RQ, BB, ML, OP, JC, ML, JL, KP, BK, RJ, ME, KR, GH, KR, GC, WS and RB collected
samples.
bioRxiv preprint doi: https://doi.org/10.1101/654756. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
PCD, RK, SM, VN and DS guided experimental design and analysis.
MW converted the data in GNPS, developed spectral search and molecular explorer.
TT, VN and SM raised animals and guided experimental design.
RQ and PD wrote the manuscript.

Abstract
A mosaic of cross-phyla chemical interactions occurs between all metazoans and their
microbiomes. In humans, the gut harbors the heaviest microbial load, but many organs,
particularly those with a mucosal surface, associate with highly adapted and evolved
microbial consortia [1]. The microbial residents within these organ systems are increasingly well
characterized, yielding a good understanding of human microbiome composition, but we have
yet to elucidate the full chemical impact the microbiome exerts on an animal and the breadth
of the chemical diversity it contributes [2]. A number of molecular families are known to be
shaped by the microbiome, including short-chain fatty acids, indoles, aromatic amino acid
metabolites, complex polysaccharides, and host lipids such as sphingolipids and bile acids [3–11].
These metabolites profoundly affect host physiology and are being explored for their roles
in both health and disease. Considering the diversity of the human microbiome, numbering
over 40,000 operational taxonomic units [12], a plethora of molecular diversity remains to be
discovered. Here, we use unique mass spectrometry informatics approaches and data
mapping onto a murine 3D-model [13–15] to provide an untargeted assessment of the chemical
diversity between germ-free (GF) and colonized mice (specific-pathogen free, SPF), and
report the finding of novel bile acids produced by the microbiome in both mice and humans
that have evaded characterization despite 170 years of research on bile acid chemistry [16].

Main
In total, 96 sample sites, covering 29 organs, producing 768 samples (excluding
controls, Fig. S1) were analyzed from four GF and four colonized mice by LC-MS/MS mass
spectrometry and 16S rRNA gene sequencing. The metabolome data was most strongly
influenced by organ source, but as expected, the microbiome was dictated by colonization
status (Fig. 1a,b). GF mice and sterile organs in SPF mice clustered tightly with background
sequence reads from blanks (reflecting their sterility), whereas colonized organs within the
SPF mice clustered apart from these samples (Fig. 1a,b). Mapping the principal coordinate
values of the two data types onto the murine 3-D model showed how the gut samples were
similar, but important differences were observed, including separation of the stool sample
from the upper GI tract in the metabolome but not in the microbiome, and similarity between
the esophageal and gut microbiomes. The strongest separation in the metabolome between
colonization states was present in the stool, cecum, other regions of the GI tract, and
samples from the surface of the animals including ears and feet (Fig. 1c). The liver also had
signatures suggestive of metabolomic differences between the GF and SPF mice, but these
were not significant compared to the within individual variation (Fig. 1, Fig. S2).
Molecular networking is a novel spectral alignment algorithm that enables identification
of unique molecules in mass spectrometry data and the relationships between related
spectra [14]. Applying molecular networking to this comprehensive murine dataset identified
7,913 unique spectra (representing putative molecules) of which 14.7% were exclusively
observed in colonized mice and 10.0% were exclusive to GF (Fig. 2). Although the overall
profiles exhibited the strongest difference in the GI tract, molecular networking showed that
all organs had some unique molecular signatures from the microbiome, ranging from 2% in
the bladder to 44% in stool (Fig. 2). As expected, the metabolome of the cecum, site of
microbial fermentation of food products, was profoundly affected by the microbiota, but other
GI sites had weaker signatures. Spectral library searching enabled annotation of 8.86% of
nodes in the molecular network (n=700 annotated nodes) [13,17], which included members of the
molecular families of plant products, such as soyasaponins and isoflavonoids (sourced from
the soybean (Glycine max, f. Fabaceae) component of mouse chow), host lipids and
microbial metabolic products (Fig. 2a). Many of the unique signatures attributed to the
microbiome were the result of metabolism of plant triterpenoids and flavonoids from food
(Supplemental Data, Fig. S3, S4). These effects were location specific, indicating that the
microbiome inhabits spatially distinct and varied niche space throughout an organism,
exerting location-dependent effects on host physiology through the metabolism of xenobiotics
and modification of host molecules.
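To make the idea of spectral alignment concrete, the following minimal Python sketch scores two MS/MS spectra with a plain binned cosine similarity. It is an illustration under stated assumptions, not the GNPS implementation: the function names, the 0.01 m/z bin width and the toy peak lists are hypothetical, and molecular networking in GNPS uses a modified cosine that also aligns fragment peaks shifted by the precursor mass difference.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumed value)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Plain cosine score between two spectra given as (m/z, intensity) pairs."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy spectra (hypothetical fragment lists, not data from this study).
spectrum_1 = [(319.24, 40.0), (337.25, 100.0), (355.26, 60.0)]
spectrum_2 = [(319.24, 35.0), (337.25, 100.0), (412.30, 20.0)]
score = cosine_similarity(spectrum_1, spectrum_2)
```

In a network built this way, two spectra would be connected by an edge when their score exceeds a chosen threshold; the threshold itself is a parameter of the analysis and is not specified here.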
The strong impacts from the microbiome in the gastrointestinal (GI) tract led to deeper
analysis of the molecular changes in this organ system. A random forests classification was
used to identify the most differentially abundant molecules between the GF and SPF GI
tracts. The metabolome of both the GF and SPF mice changed through the different sections
of the digestive system (Fig. 3a). While changes through the upper GI tract were subtle in GF
mice, SPF animals had progressive transitions in this region (Fig. 3a). A major transition
occurred between the ileum and cecum in both groups, but the specific molecules that were
changing were different between them (Fig. 3a). Many unique metabolites in SPF mice were
unknown compounds, but known molecules were also identified including bile acids and
soyasaponins (Fig. 3a, Supplementary Data, Fig. S3, S5). The Shannon diversity index of the
GF and SPF mouse metabolome was mirrored in the upper GI tract, both being low in the
esophagus and higher in the stomach and duodenum; however, upon transition to the cecum,
the diversity of the two groups of mice began to separate (Fig. 3c,d). The molecular diversity
in the cecum and colon of colonized mice was significantly higher than in GF mice
(Mann-Whitney U-test), but not in the stool samples (Fig. 3c).
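As a worked illustration of the diversity measure used here, the sketch below computes a Shannon index from a vector of metabolite feature abundances for one sample. The function name and the toy abundance vectors are hypothetical, and the use of the natural logarithm is an assumption, since the logarithm base is not stated in the text.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p_i * ln p_i) over features with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    h = 0.0
    for x in abundances:
        if x > 0:
            p = x / total
            h -= p * math.log(p)
    return h


# Toy samples (hypothetical values): an even profile has higher diversity
# than a profile dominated by a single feature.
even_profile = [25.0, 25.0, 25.0, 25.0]
skewed_profile = [97.0, 1.0, 1.0, 1.0]
h_even = shannon_index(even_profile)
h_skewed = shannon_index(skewed_profile)
```

Per-sample values computed this way could then be compared between GF and SPF mice at each sample location with a rank-based test such as the Mann-Whitney U-test.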
We also compared the changing microbial community through the GI tract in the
context of the changes observed in the molecular data. Similar to the metabolites,
microbiome transitions were observed traversing the GI tract (Fig. 3b). The corresponding
microbial diversity of the colonized animals showed a similar profile to the metabolome,
mostly stable through the upper parts of the system and then abruptly increasing at the
cecum, followed by a decrease in the colon and stool (Fig. 3d). However, an interesting
contrast was observed where a high diversity of the metabolome in the duodenum
corresponded to a lower microbial diversity. We hypothesize that this contrasting result was
due to the secretion of bile acids from the gallbladder at this location. Because these
molecules possess antimicrobial properties, their high abundance may explain the lower
microbial diversity in the upper GI tract [18], while simultaneously, microbial modification of the
molecules increases their molecular diversity. After the duodenum, changes in the diversity of
microbiome and metabolome were closely aligned, but colonized mice had greater molecular
diversity in the cecum and colon. This shows that microbial activity in these organs was
altering the molecules present, particularly bile acids, soyasaponins, flavonoids, and other
unknown compounds, which expanded the metabolomic diversity of the cecum
(supplementary results).
Molecular networking also enabled meta-mass shift chemical profiling [19] of the GF and
SPF GI tract, which is an analysis of chemical transformations based on parent mass shifts
between related nodes without the requirement of knowing the molecular structure. For
example, a parent mass difference of 18.015 Da between a node unique to colonized mice
and a related node corresponds to H2O, and a difference of 2.016 Da corresponds to H2.
In colonized animals, there was a strong signature for the loss of
water in the duodenum and jejunum and the loss of H2, acetyl and methyl groups in latter
parts of the GI tract (Fig. 3e,f). GF mice had notable mass gains corresponding to
monosaccharides in all regions of the GI tract, which were absent in SPF animals. Instead, a
mass gain of C4H8 was seen in the jejunum and ileum of SPF mice, which was associated
with the conjugated bile acid glycocholic acid (Fig. 3e,f). A significant portion of the
dehydrogenation and dehydroxylation mass shifts from the microbiome were associated with
bile acids, indicating that microbial enzymes acted on C-C double bonds of the cholic acid
backbone and removed hydroxyl groups, which is a known microbial transformation [3].
Deacetylations were also prevalent in SPF animals, though the metabolites upon which these
losses were occurring remain mostly unidentified. Overall, both GF and SPF mice had many
cases of mass loss between related molecules, but there were comparably fewer molecules
in the colonized mice that showed gain of a molecular group (Fig. 3f). This indicates that the
microbiome contributes more to the catabolic breakdown of molecules and less to anabolism;
however, one interesting anabolic reaction that was detected was the addition of C4H8 on
glycocholic acid, which we subsequently investigated further.
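The logic of annotating parent mass shifts between connected nodes can be sketched in a few lines of Python. This is a simplified illustration, not the published meta-mass shift workflow: the lookup table, the function name and the 0.01 Da tolerance are assumptions, the table uses monoisotopic masses (which differ slightly from the rounded values quoted in the text), and the m/z of 466.316 for protonated glycocholic acid is a calculated value rather than one reported here.

```python
# Monoisotopic masses (Da) of a few chemical groups; the selection is illustrative.
KNOWN_SHIFTS = {
    "H2": 2.01565,
    "CH2 (methyl transfer)": 14.01565,
    "H2O": 18.01056,
    "C2H2O (acetyl)": 42.01056,
    "C4H8": 56.06260,
    "C6H10O5 (hexose)": 162.05282,
}


def annotate_shift(parent_mass_a, parent_mass_b, tol=0.01):
    """Return (group, 'gain' or 'loss') for the mass shift from node a to node b.

    Returns None when the absolute mass difference matches no group in
    KNOWN_SHIFTS within the tolerance (tol, in Da, is an assumed value).
    """
    delta = parent_mass_b - parent_mass_a
    for group, mass in KNOWN_SHIFTS.items():
        if abs(abs(delta) - mass) <= tol:
            return group, ("gain" if delta > 0 else "loss")
    return None


# Glycocholic acid [M+H]+ (calculated, ~466.316) to the node at m/z 522.379.
c4h8_example = annotate_shift(466.316, 522.379)
# A hypothetical pair of nodes differing by the loss of water.
water_example = annotate_shift(500.0000, 481.9894)
```

Counting such annotations per GI tract region, separately for nodes unique to GF and to SPF mice, gives the kind of gain and loss tallies summarized in Fig. 3e,f.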
Glycine and taurine conjugated bile acids were detected in both GF and SPF mice. As
they moved through the GI tract, the conjugated amino acid was removed in SPF mice only,
representing a known microbial transformation (Fig. S5) [20]. In the bile acid molecular network
that contained taurocholic acid and glycocholic acid there were modified forms of these
compounds that were only present in colonized animals. These nodes were related to the
glycocholic acid through spectral similarity and to the sulfated form (Fig. 4a) and one of them
corresponded to the addition of C4H8 described above. Analysis of the MS/MS spectra of the
three nodes m/z 556.363, m/z 572.358 and m/z 522.379 showed maintenance of the core
cholic acid, but with a fragmentation pattern characteristic of the presence of the amino acids
phenylalanine, tyrosine and leucine through an amide bond at the conjugation site in place of
glycine or taurine (Fig. S6). In the extensive bile acid literature, representing 170 years of bile
acid structural analysis and greater than 42,000 publication records in PubMed, the only
known conjugations of murine (and human) bile acids were those of glycine and taurine [16].
Here, we have found a set of unique amino acid conjugations to cholic acid mediated by the
microbiome creating the novel bile acids phenylalanocholic acid, tyrosocholic acid and
leucocholic acid. These structures were validated with synthesized standards using NMR and
mass spectrometry methods (Supplemental methods and Fig. S7, S8, S9, S10, S11). These
uniquely conjugated bile acids were detected in the duodenum, jejunum and ileum of SPF
mice, with phenylalanocholic acid being the most abundant (Fig. 4). In comparison,
glycocholic acid was present in the latter parts of the GI tract (cecum and colon), whereas
taurocholic acid was most abundant in the upper parts of the GI tract (reduced through the
lower GI tract in SPF mice). The concentration of phenylalanocholic acid in mouse ileal
content from the four mice was 0.59 μM (s.d. 0.21) in the duodenum, 3.0 μM (s.d. 4.43) in the
jejunum and 5.25 μM (s.d. 2.42) in the ileum, with its highest concentration reaching 13.24
μM in a single jejunum sample (Fig. S12). These findings demonstrate that these novel amino
acid conjugates are abundant in the upper GI tract of mice on a normal soy-based diet and
require the microbiota for their production, but were subsequently absorbed, further modified,
or deconjugated again upon travel to the cecum.
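The three node masses reported above can be checked against the proposed structures with a short calculation. The sketch below is a consistency check under stated assumptions, not part of the authors' workflow: it assumes the nodes are protonated [M+H]+ ions, treats each conjugate as cholic acid joined to the amino acid through an amide bond (condensation with loss of one H2O), and uses hypothetical function and variable names.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

CHOLIC_ACID = {"C": 24, "H": 40, "O": 5}
WATER = {"H": 2, "O": 1}
AMINO_ACIDS = {
    "glycine": {"C": 2, "H": 5, "N": 1, "O": 2},
    "leucine": {"C": 6, "H": 13, "N": 1, "O": 2},
    "phenylalanine": {"C": 9, "H": 11, "N": 1, "O": 2},
    "tyrosine": {"C": 9, "H": 11, "N": 1, "O": 3},
}


def mono_mass(formula):
    """Monoisotopic mass of an elemental formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def conjugate_mz(amino_acid):
    """m/z of the [M+H]+ ion of cholic acid amide-conjugated to an amino acid."""
    neutral = (mono_mass(CHOLIC_ACID)
               + mono_mass(AMINO_ACIDS[amino_acid])
               - mono_mass(WATER))  # amide bond formation releases one H2O
    return neutral + PROTON


for name in ("phenylalanine", "tyrosine", "leucine"):
    print(f"{name}: {conjugate_mz(name):.3f}")

# Mass difference between the leucine and glycine conjugates.
leu_minus_gly = conjugate_mz("leucine") - conjugate_mz("glycine")
```

Under these assumptions the calculated values agree with the reported nodes at m/z 556.363, 572.358 and 522.379, and the difference between the leucine and glycine conjugates equals the mass of C4H8, consistent with the mass gain described for glycocholic acid.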
Because GNPS is a public repository of mass spectrometry data from a wide variety of
biological systems, we used an analysis feature called “single spectrum search” to search all
739 publicly available data sets for the presence of MS/MS spectra matching these
conjugated bile acids (April 27, 2018) [13]. Spectral matches corresponding to
phenylalanocholic acid, tyrosocholic acid and leucocholic acid were found in 19 other studies
comprising samples from the GI tract of both mice (with at least one conjugate found in 3.2 to
59.4% of all samples, Fig. S13) and humans (in 1.6 to 25.3% of all samples, Fig. S13). In a
crowd-sourced fecal microbiome and metabolome study at least one of these unique bile
acids was found in 1.6% of human fecal samples with tyrosocholic acid being the most
prevalent (n=490, the American Gut Project [21], Fig. 4b). They were found at higher frequency
in fecal samples collected without swabs, including studies of patients with inflammatory
bowel syndrome, cystic fibrosis (CF) and infants (Fig. 4b). Re-analysis of data from a
previously published study of the murine microbiome and liver cancer enabled a comparison
of the abundance of these molecules in mice fed a high-fat-diet (HFD) and treated with
antibiotics ([22], Fig. 4b). Supporting the role of the microbiome in their production, the
Phe/Tyr/Leu amino acid conjugates were decreased with antibiotic exposure, whereas
glycocholic acid, which is synthesized by host liver enzymes, was not. In contrast, these
microbial bile acids were more abundant in mice fed HFD, with no change observed in the
host conjugated glycocholic acid [22]. In a separate data set where atherosclerosis-prone mice
were similarly fed a HFD the novel conjugates were also increased over time, but not on
normal chow, and the host-conjugated taurocholic acid did not change significantly (Fig. S14).
Finally, exploration into the metadata associated with a public study of a pediatric CF patient
cohort showed that there was a higher prevalence of these compounds in CF patients
compared to healthy controls, particularly those with pancreatic insufficiency (Fig. 4b).
Insufficient production of pancreatic lipase in the CF gut results in the buildup of fat and a
microbial dysbiosis [23], which parallels the gut microbial ecosystem in mice fed HFD.
The first chemical characterization of a bile acid was in 1848 [24], the first correct
structure of a bile acid related molecule was elucidated in 1932 [25] and bile acid metabolism by
the microbiome has been known since the 1960s [26]. Since then, microbial alteration of bile
acids has been known to occur through four principal mechanisms: dehydroxylation,
dehydrogenation and epimerization of the cholesterol backbone, and deconjugation of the amino
acids taurine or glycine [3,27,28]. Here, using a simple experiment with colonized and sterile
mice, we have identified a fifth, previously undescribed mechanism of bile acid transformation
by the microbiome: conjugation of the cholesterol backbone with the
amino acids phenylalanine, leucine and tyrosine. Further research is required to determine
the microbial producers of these compounds and their role in gut microbial ecology,
especially considering the important findings that microbiome based bile acid metabolism can
affect C. difficile infections [29] or regulate liver cancer [30]. The findings reported here show that
all bile acid research to date has overlooked a significant component of the human bile acid
pool produced by the microbiome.
In conclusion, the chemistry of all major organs and organ systems is affected by the
presence of a microbiome. The strongest signatures come from the gut through the
modification of host bile acids and xenobiotics, particularly the breakdown of plant natural
products from food. Addition of chemical groups to host molecules was rarer, but those
that were detected were sourced from a unique alteration of host bile acids by the
microbiome that changes our understanding of human bile after 170 years of research [16]. As
the connections between us and our microbial symbionts become more and more obvious, a
combination of globally untargeted approaches and the development of tools that interlink
these data sets will enable us to identify novel molecules, leading to a better understanding of
the deep connection between our microbiota and our health.

Acknowledgements: The authors would like to acknowledge funding from the National
Institutes of Health under project 5U01AI124316-03, 1R03CA211211-01, 1R01HL116235
and R01HD084163. Additionally, B.S.B. was supported by UCSD KL2 (1KL2TR001444), J.L.
by grant 13EIA14660045 from the American Heart Association. We would like to
acknowledge Gail Ackerman for her contributions to curate the metadata. We further thank
Alan Hofmann and Lee Hagey for their insightful discussions on structural characterization of
bile acids.

Data Availability: All metabolomics data is available at GNPS (gnps.ucsd.edu) under the
MassIVE id numbers: MSV000079949 (GF and SPF mouse data). Additional sample data:
MSV000082480, MSV000082467, MSV000079134, MSV000082406. The sequencing data
for the GF and SPF mouse study is available on the Qiita microbiome data analysis platform
at Qiita.ucsd.edu under study ID 10801 and through the European Bioinformatics Institute
accession number ERP109688.

References
1. Consortium, H. Structure, function and diversity of the healthy human microbiome.
Nature 486, 207–14 (2012).
2. Dorrestein, P. C., Mazmanian, S. K. & Knight, R. Finding the Missing Links among
Metabolites, Microbes, and the Host. Immunity 40, 824–832 (2014).
3. Ridlon, J. M., Kang, D. J., Hylemon, P. B. & Bajaj, J. S. Bile acids and the gut
microbiome. Curr. Opin. Gastroenterol. 30, 332–8 (2014).
4. Gilbert, J. A. et al. Microbiome-wide association studies link dynamic microbial
consortia to disease. Nature 535, (2016).
5. Wikoff, W. R. et al. Metabolomics analysis reveals large effects of gut microflora on
mammalian blood metabolites. Proc. Natl. Acad. Sci. U. S. A. 106, 3698–703 (2009).
6. Marcobal, A. et al. Metabolome progression during early gut microbial colonization of
gnotobiotic mice. Sci. Rep. 5, 11589 (2015).
7. Miller, T. L. & Wolin, M. J. Pathways of acetate, propionate, and butyrate formation by
the human fecal microbial flora. Appl. Environ. Microbiol. 62, 1589–92 (1996).
8. Gillner, M., Bergman, J., Cambillau, C., Fernström, B. & Gustafsson, J. A. Interactions
of indoles with specific binding sites for 2,3,7,8-tetrachlorodibenzo-p-dioxin in rat liver.
Mol. Pharmacol. 28, 357–63 (1985).
9. Martin, F.-P. J. et al. A top-down systems biology view of microbiome-mammalian
metabolic interactions in a mouse model. Mol. Syst. Biol. 3, 112 (2007).
10. Moriya, T., Satomi, Y., Murata, S., Sawada, H. & Kobayashi, H. Effect of gut microbiota
on host whole metabolome. Metabolomics 13, 101 (2017).
11. Swann, J. R. et al. Systemic gut microbial modulation of bile acid metabolism in host
tissue compartments. Proc. Natl. Acad. Sci. U. S. A. 108 Suppl 1, 4523–30 (2011).
12. Structure, function and diversity of the healthy human microbiome. Nature 486, 207–14
(2012).
13. Wang, M. et al. Sharing and community curation of mass spectrometry data with Global
Natural Products Social Molecular Networking. Nat. Biotechnol. 34, 828–837 (2016).
14. Watrous, J. et al. Mass spectral molecular networking of living microbial colonies. Proc.
Natl. Acad. Sci. U. S. A. 109, 1743–52 (2012).
15. Protsyuk, I. et al. 3D molecular cartography using LC–MS facilitated by Optimus and ’ili
software. Nat. Protoc. 13, 134–154 (2017).
16. Hofmann, A. F. & Hagey, L. R. Key discoveries in bile acid chemistry and biology and
their clinical applications: history of the last eight decades. J. Lipid Res. 55, 1553–95
(2014).
17. Yang, J. Y. et al. Molecular networking as a dereplication strategy. J. Nat. Prod. 76,
1686–1699 (2013).
18. Hofmann, A. F. & Eckmann, L. How bile acids confer gut mucosal protection against
bacteria. Proc. Natl. Acad. Sci. U. S. A. 103, 4333–4 (2006).
19. Hartmann, A. C. et al. Meta-mass shift chemical profiling of metabolomes from coral
reefs. Proc. Natl. Acad. Sci. U. S. A. 114, (2017).
20. Wahlström, A., Sayin, S. I., Marschall, H.-U. & Bäckhed, F. Intestinal Crosstalk
between Bile Acids and Microbiota and Its Impact on Host Metabolism. Cell Metab. 24,
41–50 (2016).
21. McDonald, D. et al. American Gut: an Open Platform for Citizen Science Microbiome
Research. mSystems 3, e00031-18 (2018).
22. Shalapour, S. et al. Inflammation-induced IgA+ cells dismantle anti-liver cancer
immunity. Nature 551, 340–345 (2017).
23. Manor, O. et al. Metagenomic evidence for taxonomic dysbiosis and functional
imbalance in the gastrointestinal tracts of children with cystic fibrosis. Sci. Rep. 6,
22493 (2016).
24. Strecker, A. Untersuchung der Ochsengalle. Ann. der Chemie und Pharm. 65, 1–37
(1848).
25. Bernal, J. D. Crystal Structures of Vitamin D and Related Compounds. Nature 129,
277–278 (1932).
26. Gustafsson, B. E., Gustafsson, J. A. & Sjövall, J. Intestinal and fecal sterols in germfree
and conventional rats. Bile acids and steroids 172. Acta Chem. Scand. 20, 1827–35
(1966).
27. Midtvedt, T. Microbial bile acid transformation. Am. J. Clin. Nutr. 27, 1341–1347 (1974).
28. Gérard, P. Metabolism of Cholesterol and Bile Acids by the Gut Microbiota.
Pathogens 3, 14–24 (2013).
29. Buffie, C. G. et al. Precision microbiome reconstitution restores bile acid mediated
resistance to Clostridium difficile. Nature 517, 205–208 (2015).
30. Ma, C. et al. Gut microbiome-mediated bile acid metabolism regulates liver cancer via
NKT cells. Science 360, eaan5931 (2018).
31. Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis
Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI).
Metabolomics 3, 211–221 (2007).

Figures and Figure Legends
[Figure 1 image not reproduced. Recoverable labels: panels a) and b) show PCoA plots of the metabolome and microbiome data with axes PC1, PC2 and PC3; panel c) plots Bray-Curtis dissimilarity for GF-SPF versus Within comparisons in stool, cecum, ileum, colon, duodenum, stomach, ear, foot, jejunum (p<0.00001) and liver (p=0.077); panel d) shows cranial and caudal views of the metabolome and microbiome models colored by the 1st principal coordinate.]
Figure 1. a) Principal coordinate analysis (PCoA) of microbiome and mass spectrometry data
highlighted by sample source as GF or SPF. b) Same data highlighted by organ source. c)
Bray-Curtis dissimilarities of the metabolome data collected from murine organs. The
dissimilarities are calculated within individual mice of the same group (GF or SPF, “Within”) or
across the GF and SPF groups (GF-SPF). Organs with multiple samples are pooled, but only
samples collected from the exact same location are compared. d) 3-D model of murine organs
mapped with the mean 1st principal coordinate value from the four GF and four SPF mice.
High values across the 1st PC are shown in red and lower values are shown in blue. The PC1
values are from the data in panels a) and b). (Er=ear, Br=brain, Ad=adrenal gland,
Es=esophagus, Tr=trachea, Stm=stomach, Kd=kidney, Mo=mouth, D=duodenum, Ov=ovary,
Co=colon, Stl=stool, Hd=hand, Lg=lung, Lv=liver, J=jejunum, Ce=cecum, Bl=bladder,
Ut=uterus, Cx=cervix, Vg=vagina, Ft=feet)
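For readers unfamiliar with the dissimilarity measure in panel c), the sketch below computes the Bray-Curtis dissimilarity between two samples described by abundance vectors over the same set of features. The function name and the toy vectors are hypothetical; in practice the inputs would be per-sample feature intensity tables.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i) for non-negative abundances."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two empty samples are treated as identical (an assumed convention).
    return numerator / denominator if denominator else 0.0


# Toy abundance vectors (hypothetical values) for one sample location.
gf_sample = [6.0, 7.0, 4.0]
spf_sample = [10.0, 0.0, 6.0]
dissimilarity = bray_curtis(gf_sample, spf_sample)
```

The value ranges from 0 for identical profiles to 1 for profiles that share no features, which is why larger GF-SPF values than Within values indicate an effect of colonization.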
Figure 2. a) Molecular network of LC-MS/MS data with nodes colored by source as GF, SPF,
shared, or detected in blanks. Molecular families with metabolites annotated by spectral
matching in GNPS are listed by a number corresponding to the molecular family. These are
level 2 or 3 annotations according to the Metabolomics Standards Initiative [31]. b)
Percentage of total nodes from each organ sourced from GF only, SPF only or shared and
the total number of unique nodes from each murine class per organ.
[Figure 2 image not reproduced. Panel a) node color key: Shared, SPF, GF, Blanks. Numbered molecular families: 1. Soyasaponins; 2. Taurocholic acid; 3. Lysophospholipids; 4. Eicosanoids; 5. Tryptophan; 6. Peptides; 7. Ampicillin standard; 8. Phthalates; 9. Phenylalanine; 10. Phosphoethanolamines; 11. Cholic acid; 12. Ceramides; 13. Unknown microbial; 14. Palmitoylcarnitine; 15. Flavonoids; 16. Soyasapogenols; 17. Epoxy-keto-decanoic acids; 18. Docosenamide; 19. Urobilin; 20. Deoxycholic acid; 21. 12-OAHSA; 22. Riboflavin; 23. Ketodeoxycholic acid; 24. Indoles; 25. Deoxyguanosine. Panel b) y-axes: Percent of Unique Spectra (0% to 100%) and Number of Unique Spectra (0 to 2500); x-axis organs: Skin, Ear, Foot, Mouth, Blood, Brain, Adrenal, Thymus, Heart, Trachea, Lung, Liver, Gall, Kidney, Bladder, Pancreas, Spleen, Esophagus, Stomach, Duodenum, Jejunum, Ileum, Cecum, Colon, Stool, Vagina, Cervix, Ovary, Uterus.]
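The percentages of nodes that are GF only, SPF only or shared can be tallied from a simple presence table. The sketch below is illustrative only: the function name, the node identifiers and the input layout (node mapped to the set of mouse groups in which it was detected) are assumptions, and nodes detected in blanks are assumed to have been removed beforehand.

```python
def summarize_node_sources(node_groups):
    """Return the percentage of nodes that are GF only, SPF only, or shared.

    node_groups maps a node identifier to the set of groups ("GF", "SPF")
    in which that node's spectrum was detected.
    """
    counts = {"GF only": 0, "SPF only": 0, "shared": 0}
    for groups in node_groups.values():
        if groups == {"GF"}:
            counts["GF only"] += 1
        elif groups == {"SPF"}:
            counts["SPF only"] += 1
        elif groups == {"GF", "SPF"}:
            counts["shared"] += 1
    total = sum(counts.values())
    return {label: (100.0 * n / total if total else 0.0)
            for label, n in counts.items()}


# Toy presence table for one organ (hypothetical node identifiers).
toy_nodes = {
    "node_1": {"GF"},
    "node_2": {"SPF"},
    "node_3": {"GF", "SPF"},
    "node_4": {"SPF"},
}
percentages = summarize_node_sources(toy_nodes)
```

Running the same tally per organ yields the kind of stacked percentages shown in panel b).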
Figure 3. a) Mean normalized abundance of the top 30 most differentially abundant
metabolites between GF and SPF mice. The metabolites are colored according to molecular
family, where bile acids are green and blue, respectively, soyasaponins are pink and
unknown molecules are brown/yellow. Colors corresponding to taurocholic acid (green) and
deoxymuricholic acid (teal) are highlighted for reference. b) Microbiome of the murine GI tract
in SPF mice. Taxa of relevance are color coded according to the legend. c) Mean and 95%
confidence interval of the Shannon-Wiener diversity of the metabolomic data in each GI tract
sample for GF and SPF mice. Statistical significance between metabolome diversity in the
same sample location between GF and SPF mice was tested with the Mann-Whitney U-test
(*=p<0.05). d) Mean Faith’s phylogenetic diversity (with 95% confidence interval) of the
microbiome through the SPF GI tract. e) Results of meta-mass shift chemical profiling [19]
showing the relative abundance of the parent mass differences between unique nodes in
either GF or SPF mice to the total. Each mass difference corresponds to the node-to-node
gain or loss of a particular chemical group. f) Counts of the number of mass shifts of the
parent mass differences between nodes showing where the most abundant molecular
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100% Uknown Unknown Unknown Unknown Unknown Unknown Unknown Unknown Uknown SoyasaponinSoyasaponin I Na+ Soyasaponin I Soyasaponin 1 MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid MN-Taurocholic Acid
MN-Bile Acid MN-Bile Acid MN-Bile Acid MN-Bile Acid MN-Bile Acid MN-Bile Acid
Deoxymuricholic Acid
Muricholic Acid
4
5
6
3osEcFG 2osEbFG 1osEaFG 2hcamotSfFG 1hcamotSeFG 4osEdFG 2ouDiFG 1ouDhFG 3hcamotSgFG 5ouDlFG 4ouDkFG 3ouDjFG 2elIoFG 2elInFG 1elInFG 6ouDmFG 6elIsFG 5elIrFG 4elIqFG 3elIpFG 3ujeJvFG 2ujeJuFG 1ujeJtFG 6ujeJyFG 5ujeJxFG 4ujeJwFG 3muceCbzFG 2muceCazFG 1muceCzFG 4muceCczFG 6muceCezFG 5muceCdzFG 3noloChzFG 2noloCgzFG 1noloCfzFG 6noloCkzFG 5noloCjzFG 4noloCizFG 2osEbFPS 1osEaFPS lootSlzFG 1hcamotSeFPS 4osEdFPS 3osEcFPS 3hcamotSgFPS 2hcamotSfFPS 3ouDjFPS 2ouDiFPS 1ouDhFPS 6ouDmFPS 5ouDlFPS 4ouDkFPS 3ujeJoFPS 2ujeJoFPS 1ujeJnFPS 4ujeJqFPS 4ujeJpFPS 3ujeJpFPS 6ujeJrFPS 5ujeJrFPS 5ujeJqFPS 3elIvFPS 2elIuFPS 1elItFPS 6ujeJsFPS 6elIyFPS 5elIxFPS 4elIwFPS 3muceCbzFPS 2muceCazFPS 1muceCzFPS 4muceCczFPS 6muceCezFPS 5muceCdzFPS 2noloCgzFPS 1noloCfzFPS 4noloCizFPS 3noloChzFPS lootSlzFPS 6noloCkzFPS 5noloCjzFPSSample
Shannon
3
Eso Sto Duo Jej Ile Cec Col Fec
Shan
non
Inde
x
EsophagusStomachDuodenumJejunumIleumCecumColonStool
*
* *****
*
**
*
Eso Sto Duo Jej Ile Col FCec Eso Sto Duo Jej Ile Col FCec
Nor
mal
ized
Abu
ndan
ceGF SPFa)
c) d)
b)
Rel
ativ
e A
bund
ance
GF SPF
o_Bacteroidales; f_S24-7 k_Bacteria; p_Firmicutesc_Bacilli; o_Bacillalesf_Lactobacillaceae; g_Lactobacilluso_Clostridiales, f_Lachnospiraceaeo_Clostridiales; f_Ruminococcaceaeg_Akkermansia; s_muciniphilaOther
SPF
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
Stom
ach
Duo
denu
m
Jeju
num
Ile
um
Cec
um
Col
on
Stoo
l
Stom
ach
Duo
denu
m
Jeju
num
Ile
um
Cec
um
Col
on
Stoo
l
Stom
ach
Duo
denu
m
Jeju
num
Ile
um
Cec
um
Col
on
Stoo
l
Stom
ach
Duo
denu
m
Jeju
num
Ile
um
Cec
um
Col
on
Stoo
l Rela
tive
Abun
danc
e of
Uni
que
Mas
s Lo
ss
2x Aglycone OH C5H8O4-Sugar C6H10O5-Glycone C6H10O4-Sugar CO2 CH2O2/NO2 C2H6 2 Unsaturations C4H8 SO3 CH4 C2 C C2H2 NH3 CH2O O C2H4 H2O Methyl Acetyl H2
0
100
200
300
400
500
600
700
800
900
Sto
mac
h D
uode
num
Je
junu
m
Ileum
C
ecum
C
olon
S
tool
Sto
mac
h D
uode
num
Je
junu
m
Ileum
ce
cum
C
olon
S
tool
Sto
mac
h D
uode
num
Je
junu
m
Ileum
ce
cum
C
olon
S
tool
Sto
mac
h D
uode
num
Je
junu
m
Ileum
ce
cum
C
olon
S
tool
Spec
tral
Cou
nt o
f Uni
que
Mas
s Lo
ss 2x Aglycone
OH C5H8O4-Sugar C6H10O5-Glycone C6H10O4-Sugar CO2 CH2O2/NO2 C2H6 2 Unsaturations C4H8 SO3 CH4 C2 C C2H2 NH3 CH2O O C2H4 H2O Methyl Acetyl H2
GF SPF GF SPF GF SPF GF SPFMass Loss Mass Gain
SPF
Faith
’s P
hylo
gene
tic D
iver
sity
EsophagusStomachDuodenumJejunumIleumCecumColonStool
10
12
14
8
6
5
Mass Loss Mass Gaine) f)2x Aglycone OH C5H8O4-Sugar
C6H10O4-Sugar CO2 CH2O2
C2H6 2 Desaturations C4H8 SO3 CH4 C2 C C2H2 NH3 CH2O O C2H4 H2O Methyl Acetyl H2
C6H10O5-Glycone
2x Aglycone OH C5H8O4-Sugar
C6H10O4-Sugar CO2 CH2O2
C2H6 2 Desaturations C4H8 SO3 CH4 C2 C C2H2 NH3 CH2O O C2H4 H2O Methyl Acetyl H2
C6H10O5-Glycone
MN-Bile Acid
MN-Bile Acid
Bile Acids
Soyasaponins
Unknowns
Taurocholicacid
Deoxymuricholicacid
.CC-BY-NC-ND 4.0 International licenseis made available under aThe copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It. https://doi.org/10.1101/654756doi: bioRxiv preprint
13
transitions are detected in the murine gut. 347
348
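The mass-shift idea in panels e and f (each parent mass difference between two connected nodes is read as the gain or loss of a chemical group) can be illustrated with a minimal sketch. This is not the meta-mass shift profiling implementation of reference 19: the function names, the 0.01 Da tolerance, the short group table and the example node pairs are all assumptions made here for illustration. Only the 556.363 / 572.358 pair and its 15.995 difference come from the Figure 4 inset.

```python
from collections import Counter

# Monoisotopic masses (Da) of a few chemical groups named in panels e and f.
# The table is deliberately short; a real analysis would use a longer list.
GROUP_MASSES = {
    "H2": 2.01565,
    "CH2 (methyl)": 14.01565,
    "O": 15.99491,
    "NH3": 17.02655,
    "H2O": 18.01056,
    "C2H2O (acetyl)": 42.01056,
    "CO2": 43.98983,
    "SO3": 79.95681,
}


def annotate_shift(delta, tolerance=0.01):
    """Return the group whose mass is closest to |delta|, or None if no
    group lies within the tolerance (assumed value, in Da)."""
    name, mass = min(GROUP_MASSES.items(), key=lambda kv: abs(kv[1] - abs(delta)))
    return name if abs(mass - abs(delta)) <= tolerance else None


def count_mass_shifts(edges, tolerance=0.01):
    """Count annotated mass shifts over (parent_mass_1, parent_mass_2) pairs
    taken from connected nodes of a molecular network."""
    counts = Counter()
    for m1, m2 in edges:
        label = annotate_shift(m2 - m1, tolerance)
        if label is not None:
            counts[label] += 1
    return counts


# Hypothetical node pairs, except the first (from the Figure 4 inset).
example_edges = [
    (556.363, 572.358),  # difference 15.995, consistent with one oxygen
    (480.278, 498.289),  # illustrative pair, difference 18.011
    (498.289, 516.300),  # illustrative pair, difference 18.011
    (500.000, 507.500),  # illustrative pair with no matching group
]
shift_counts = count_mass_shifts(example_edges)
```

Dividing each count by the total number of annotated shifts in a sample would give a relative abundance of the kind plotted in panel e, while the raw counts correspond to panel f.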
Figure 4. a) Structures, molecular network, 3D-molecular cartography and abundance
through the GI tract of novel microbiome-associated bile acids in this murine study. Structures
of the previously known conjugates glycocholic acid, tauroursocholic acid and taurocholic acid
are shown for comparison to structures of the newly discovered amino acid conjugates. The
molecular network of these bile acids is shown with mapping to the GF and SPF mice
according to the color legend. An inset highlighting the parent masses and mass differences
between the newly discovered molecules is shown for clarity. 3D-molecular cartography
maps the mean abundance and standard deviations of the mean of the newly discovered
conjugates onto a 3D-rendered model of the murine GI tract; the relative abundances of
the molecules through the GI tract samples, compared to the host-produced glycocholic acid
and taurocholic acid, are also shown. b) Bar plots of the percent of samples positive for the
novel bile acids from publicly available datasets on GNPS. Percent of patients in whom novel
bile acids were detected in two human studies of cystic fibrosis patients compared to
non-CF controls. Comparison of the abundance of novel conjugates in a previously published
controlled murine study 22 that compared animals fed a high fat diet (HFD) or normal chow
(NC) and animals treated (Ab) or not treated (No Ab) with antibiotics. AGP = American Gut
Project 21, IBD = Inflammatory Bowel Disease, CF = cystic fibrosis, PI = pancreatic
insufficient, PS = pancreatic sufficient.

[Figure 4 panels. Structures in panel a, with formula and mass as printed in the figure:
Glycocholic acid: C26H43NO6, exact mass 465.309
Taurocholic acid: C26H45NO7S, exact mass 515.292
Tauroursocholic acid: C26H46NO7S, exact mass 498.297
Phenylalanocholic acid: C33H49NO6, molecular weight 555.76
Tyrosocholic acid: C33H49NO7, molecular weight 571.35
Leucocholic acid: C30H51NO6, molecular weight 521.74
Duodenum bile acid cluster inset, node colors GF / Both / SPF. Numbered nodes: 1. Glycocholic Acid; 2. Phenylalanocholic Acid; 3. Tyrosocholic Acid; 4. Leucocholic Acid. Parent masses: 466.316, 482.329, 517.284, 522.379, 546.309, 556.363, 572.358. Network clusters labeled Amino Acid Conjugates, Acetyl-Tauroursocholic acids and Acetyl-Taurocholic acids.
3D cartography sites from caudal to cranial: Stl, Co, I, J, D, Stm; color scale: Ion Intensity, 1e2 to 1e5.
Abundance plot: Normalized Abundance x 1e5 (log scale, 1 to 10000) across Stomach, Gall, Duodenum, Jejunum, Ileum, Cecum, Colon and Stool samples for Phenylalanocholic acid, Tyrosocholic acid, Leucocholic acid, SPF Glycocholic Acid, SPF Taurocholic Acid, GF Taurocholic Acid and GF Glycocholic Acid.
Panel b: Percent of Samples for Glycocholic acid, Phenylalanocholic acid, Tyrosocholic acid, Leucocholic acid and any microbial conjugate; groups AGP, IBD, CF, Murine, CF-PI, CF-PS, non-CF infants, HFD, NC, No Ab, Ab; sample sizes printed in the figure: n = 25, 8, 10, 155, 173, 490, 310.]
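The parent masses shown in the Figure 4 network inset can be checked against the elemental formulas printed beside the structures. The sketch below assumes the network nodes are singly protonated ions, [M+H]+, which is an assumption of this example and is not stated in the caption. The helper names and the rounded isotope masses are likewise illustrative.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
PROTON = 1.00727646  # mass of a proton (Da)

# Formulas as printed next to the structures in Figure 4.
FORMULAS = {
    "Glycocholic acid": "C26H43NO6",
    "Phenylalanocholic acid": "C33H49NO6",
    "Tyrosocholic acid": "C33H49NO7",
    "Leucocholic acid": "C30H51NO6",
}


def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C26H43NO6'
    (no parentheses, charges or isotope labels)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_protonated(formula):
    """m/z of the singly protonated ion [M+H]+ (assumed adduct)."""
    return monoisotopic_mass(formula) + PROTON


parent_mz = {name: round(mz_protonated(f), 3) for name, f in FORMULAS.items()}
# Mass added to glycocholic acid by each new amino acid conjugate.
shift_from_glycine = {
    name: round(parent_mz[name] - parent_mz["Glycocholic acid"], 3)
    for name in parent_mz
    if name != "Glycocholic acid"
}
```

Under this assumption the computed values agree with the inset to within a few millidaltons: 466.316, 556.363, 572.358 and 522.379 for the four conjugates, with differences from glycocholic acid close to the 90.047, 106.042 and 56.063 printed between the nodes.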