Dr Packer Agora

Methods. Author manuscript; available in PMC 2018 Oct 17.

Published in final edited form as:

Published online 2018 Apr 26. doi: 10.1016/j.ymeth.2018.04.024

NIHMSID: NIHMS974188

The publisher's final edited version of this article is available at Methods

Apr 08, 2015 Dr Packer - Shame (Dr Packer Rework) Dope N Disco. Unsubscribe from Dope N Disco? Cancel Unsubscribe. Subscribe Subscribed Unsubscribe 11K. Roger Packer, M.D., is Senior Vice President for the Center of Neuroscience and Behavioral Medicine at Children's National Health System.He is also the Gilbert Distinguished Professor of Neurofibromatosis, and Director at the Gilbert Neurofibromatosis and Brain Tumor Institutes.

See other articles in PMC that cite the published article.

Associated Data

Supplementary Materials

NIHMS974188-supplement-S1.pptx (5.8M)

Abstract

Multi-omic data and genome-scale microbial metabolic models have allowed us to examine microbial communities, communityfunction, and interactions in ways that were not available to us historically. Now, one of our biggest challenges is determininghow to integrate data and maximize data potential. Our study demonstrates one way in which to test a hypothesis by combiningmulti-omic data and community metabolic models. Specifically, we assess hydrogen sulfide production in colorectal cancer based onstool, mucosa, and tissue samples collected on and off the tumor site within the same individuals. 16S rRNA microbial communityand abundance data were used to select and inform the metabolic models. We then used MICOM, an open source platform, to track themetabolic flux of hydrogen sulfide through a defined microbial community that either represented on-tumor or off-tumor samplecommunities. We also performed targeted and untargeted metabolomics, and used the former to quantitatively evaluate our modelpredictions. A deeper look at the models identified several unexpected but feasible reactions, microbes, and microbialinteractions involved in hydrogen sulfide production for which our 16S and metabolomic data could not account. These results willguide future in vitro, in vivo, and in silico tests to establish why hydrogensulfide production is increased in tumor tissue.

1. Introduction

Integration of multi-omic data has become increasingly important as studies of the microbiome have advanced and as ourunderstanding of microbial community dynamics continues to grow. In the gut, for example, it is now appreciated that microbialinteractions and community function may play a more important role than single microbes in some disease processes includingClostridium difficile infection [], Type 2 diabetes[], and atherosclerosis []. As such, the tools and techniques for studying microbial metabolites and interactions are rapidly developing, andthe integration of multi–omic data is increasingly feasible on many platforms. Data integration using statistical correlationsbetween microbiome and metabolome [–] or metatranscriptome and meta-genome [,] has given us unique insights into gut microbial ecology and disease processes. However,correlation does not imply causation and cannot uncover underlying biological mechanisms. Mechanistic modeling, on the other hand,offers a way to examine data through the lens of biological interaction; and microbes, metabolites, or reactions of interest in themodels can be queried and further analyzed []. In this study, we characterizean approach for integrating multi-omic data that uses the principles of metabolic model reconstruction and modeling in order toexamine a specific mechanistic hypothesis that relates microbial hydrogen sulfide production and colorectal cancer (CRC).

CRC is the one of the leading causes cancer death in the United States [10]. It has also been linked to alterations in the gut microbiome and metabolome []. The microbially-produced metabolites implicated in CRC development or progression includehydrogen sulfide [], reactive oxygen species [], N-nitroso compounds [,], polyamines [], andsecondary bile acids [,]. Adding tothis complex landscape, it appears that the effects of some metabolites are context dependent. Butyrate, a short chain fatty acid(SCFA), is reportedly both an inducer [] and an inhibitor [] of CRC tumor development. Hydrogen sulfide, which has been implicated in our work andothers’ [,–], is a genotoxic and cytotoxic agent at physiologic concentrations [,], but it hasalso been reported as an anti-inflammatory and potentially anti-tumor agent [–].

Quantifying the concentration of hydrogen sulfide in the gut is inherently difficult given its volatile nature []. In contrast, quantifying the relative abundance of hydrogen sulfide-producingbacteria is far less challenging. Sulfidogenic bacteria such as Bilophila wadsworthia, Fusobacteriumspecies, and sulfur-reducers such as Desulfovibrio species have all been identified within the gut and linked tocolon inflammation [], adenomas [,], and colon tumors[–]. However, thesemicrobial associations are not sufficient for determining microbial metabolite production. Examining the effects of single microbes onCRC fails to capture microbial interactions or community effects that may alter the rate of hydrogen sulfide production and subsequentpathogenesis. Investigations on the role of microbial metabolites in CRC requires an approach capable of capturing metaboliteproduction within the context of different microbial communities.

Microbial metabolic models and the tools built around these models provide an opportunity to examine microbial communityinteractions [–] and testthe relationship between microbial hydrogen sulfide production and CRC. Genome-scale metabolic models (GEMs) are reconstructedmetabolic networks based on complete annotated genome sequences []. GEMs canbe used to calculate reaction fluxes for a specific target function—e.g., growth—by optimizing metabolite flow througha metabolic network []. Recently developed tools allow users to assesspairwise (e.g. MMINTE []) or community interactions (e.g. MICOM[,43]) between microbial GEMs.Unlike correlations which identify pairs of microbes and metabolites that are concurrently increased or decreased, community metabolicmodeling tools like MICOM capture more complex microbial interactions such as cross-feeding and resource competition. Additionally,the flux of specific metabolites (i.e. hydrogen sulfide) or specific reactions (i.e. cysteine degradation) can be tracked in thesemodels.

In this study, we obtained 16S microbial community data from colon tissue, mucosa, and stool samples in individuals diagnosedwith colorectal cancer. Samples were collected at the tumor site and at adjacent normal (hereafter referred to as“adjacent”) and distant normal (hereafter referred to as “normal”) colon sites. 16S data was then usedto select and inform genome-scale metabolic models that represented “tumor,” “adjacent,” and“normal” microbial communities. Metabolite abundances were inferred by tracking metabolic flux in MICOM. Models werealso queried to identify specific microbes and reactions contributing to these fluxes. Targeted and untargeted metabolomics on colontissue were performed to validate model predictions. The synthesis of modeling and experimental results here provided an improvedmeans to maximize data potential, test a specific hypothesis about hydrogen sulfide production, and gain unexpected insights intomicrobial community behavior in CRC. Specifically, our models predict increased hydrogen sulfide flux in CRC tumors tissue attributedto Fusobacterium

2. Methods

2.1. Human subject enrollment

This study was approved the Mayo Clinic Institutional Review Board (IRB# 14-007237 and IRB# 622-00). Adults(> 18 years old) who were identified as being surgical candidates for colorectal cancer (n = 106) were voluntarilyenrolled at Mayo Clinic in Rochester, Minnesota (Table 1; for tumor location by cancerstage: Supp. Table 1). Exclusion criteria included radiation therapy orchemotherapy in the 2 weeks prior to enrollment. (For a complete list of patient medications at the time of surgery see Supp. Table 2.) Patients with lower anterior resections or a divertingileostomy were additionally prescribed a MiraLAX bowel preparation. Immediately prior to surgery, all patients received tap wateror MiraLAX enemas. Partial or total colectomies were performed on each patient and, as feasible, colon tissue, mucosal scrapes,and stool samples (fecal material in contact with colon tissue) were obtained from the tumor site, adjacent normal area(“adjacent”), and distant normal area (“normal”). Mucosal scrapes and a portion of each tissuesample (~1 cm² of full thickness colon tissue) was immediately snap frozen in TE buffer (10 mmol Tris, 1 mmol EDTA, pH8). A separate sample of tumor tissue was preserved in formalin and submitted to pathology. Stool collected from within the colonat surgery was frozen without a buffer at −80 °C until DNA extraction.

Table 1

Demographics of individuals with colorectal cancer.

Subgroups
Sex(n,%)	Males 57 (53.89%)
Sex(n,%)	Females 49 (46.2%)
Age (mean,range)	65.3 (23–90)
BMI (mean,range)	27.7 (16–54.8)
Any history of smoking?	Yes 58 (54.7%)
Any history of smoking?	No 48 (45.3%)
Cancer stage (n, %)	1, 21 (19.8%)
	2, 27 (25.5%)
	3, 40 (37.7%)
	4, 6 (5.7%)
	Unknown,12 (11.3%)
Colon tumor location(n, %)	right side 36(34%)
	left side 58 (54.7%)
	transverse 11 (10.4%)
	multiple 1 (0.9%)

nucleatum, Clostridium perfringens, and Filifactor alocis.

2.2. DNA extraction and 16S rRNA gene sequencing

Stool DNA extraction [], quantification, and amplification wasperformed as described previously []. Colon tissue was thawed, weighed,and homogenized prior to DNA extraction. Mucosal scrape samples were centrifuged; the supernatant was removed, and the pellet wasused for DNA extraction. Colon tissue and mucosal scrape DNA extraction was performed using the Mo Bio PowerSoil DNA isolation kit(Mo Bio Laboratories, A Qiagen Company, Carlsbad, CA, USA). Library preparation on all samples was performed at the Mayo ClinicMicrobiome Lab (Rochester, MN) and sequencing was completed at the Mayo Clinic Medical Genomics Facility using a MiSeq instrument(2 × 300, 600 cycles, Illumina Inc.)

2.3. DNA sequence processing and analysis

Sequencing data was processed as previously described [,]. In brief, the IM-TORNADO bioinformatics pipeline was used to assign paired sequences at a97% identity match to operational taxonomic units (OTUs) [].Taxonomy assignment was made using the Ribosomal Database Project (RDP) version 9 []. Samples ranged from 0 to 514, 568 reads after processing. All samples with < 500 reads were excludedfrom analyses. No-template control samples were included with each group of samples that underwent extraction. Positive andno-template controls were also included during PCR amplifications. OTUs identified as contaminants present in no-template controlsamples were removed from analyses. Analyses including alpha- and beta-diversity and Kruskal-Wallis significance testing(group_significance.py in QIIME) were performed in QIIME 1.9.1 [] and R3.4.1. In order to achieve adequate statistical power, all samples, regardless of tumor location and stage were combined in ouranalyses. 16S sequencing data can be found at the NCBI Sequence Read Archive under BioProject accession PRJNA445346. Metagenomicsequencing data can be found at the NCBI Sequence Read Archive under BioProject accession PRJNA397219.

2.4. Real-time PCR for detection of Bacteroides fragilis toxin gene

Real-time PCR was performed on a QuantStudio 6 Flex Real-Time PCR System (Applied Biosystems, Waltham, MA, USA), and stoolDNA was tested for the presence or absence of the Bacteroides fragilis toxin (bft) genes. For detection of thebft gene, PCR amplification was performed using primers BFT-F (5'-GGATAAGCGTACTAAAATACAGCTG GAT-3'), BFT-R(5'-CTGCGAACTCATCTCCCAGTATAAA-3') and the probe (5'-FAM-CAGACGGACATTCTC-NFQ-MGB-3') []. Toxigenic B. fragilis (ATCC 43858, ATCC, Manassa, VA, USA) was used as apositive control. PCR was performed using 10 μl of TaqMan Universal Master Mix II (2X), no UNG (Applied Biosystems,Waltham, MA, USA) combined with 900 nM of each primer and 250 nM of the probe. Sterile water was added to adjust the finalreaction volume to 20 μl. Cycling parameters were 95 °C for 10 min then 50 cycles of 95 °C for 15 s and 60°C for 20 s.

2.5. Metabolomics sample preparation and analysis

2.5.1. Tissue sample preparation

Colon tissue and mucosal scrape tissue were removed from TE buffer, rinsed in ice cold PBS, pulverized with a frozenmortar and pestle and stored at −80 °C until submission for targeted and untargeted analysis.

2.5.2. Amino acid panel on UPLC-MS

Serine, homoserine, lanthionine and cystathionine – all amino acids produced concurrently with and proxies forhydrogen sulfide – were measured by LC-MS, similar to our amino acid plus metabolites panel as referenced[]. Tissue homogenate samples were spiked with internal standards(10 μl of serine, homserine, lanthionine, and cystathionine each at a concentration of 2 μg/ml), thendeproteinized with cold me-thanol followed by centrifugation at 10,000g for 15 min. The supernatant wasimmediately derivatized with 6-aminoquinolyl-N-hydro-xysuccinimidyl carbamate according to Waters’ MassTrak kit. An11-point calibration curve underwent similar derivatization procedure after the addition of internal standards. Bothderivatized standards and samples were analyzed on a triple quadrupole mass spectrometer (Thermo TSQ Quantum Ultra) coupledwith an Ultra Pressure Liquid Chromatography (Waters Acquity) system. Data acquisition was done using select ion monitor(SRM). The following transitions (m/z) were monitored: 276.25 > 171.04 for serine, 290.2> 171.04 for homoserine, 275.2 > 171.04 for lanthionine and 282.3 > 171.04 for cystathionine. Concentrations of theanalytes of each unknown were calculated against each respective calibration curve.

2.5.3. Short chain fatty acids (SCFA) panel on GC–MS

SCFA were quantitated via GC–MS as previously published with a few modifications []. Briefly, 50 μl tissue homogenate was added to a tube containing internalstandard (2 μl of 2-ethylbutyric acid at a concentration of 9 μg/ml) in HCl. One milliliter of dichloromethane(DCM) was used to extract SCFA from the mixture. The extract was derivatized withN-Methyl-N-tert-butyldimethylsilyltrifluoroacetamide (MTBSTFA) prior to analysis on GC–MS.Concentrations of acetic acid (m/z 117.0), propionic acid(m/z 131.1), isobutyric acid (m/z 145.1), butyric acid(m/z 145.1), isovaleric acid (m/z 159.1), valeric acid(m/z 159.1), isocaproic acid (m/z 173.2) and hexanoicacid (m/z 173.2) were measured against 11-point calibration curves that underwent the samederivatization.

2.5.4. Data analysis of targeted metabolomics

Kruskal-wallis or ANOVA significance testing followed by Dunn’s post hoc tests were performed in R 3.4.1 toassess metabolite differences between tumor, adjacent, and normal tissue. All samples, regardless of tumor location and stagewere combined in our analyses.

2.5.5. Qualitative large-scale profiling on LC-MS – untargeted metabolomics

Tissue homogenates (50 μl) were deproteinated with cold acetonitrile: methanol (1:6 ratio), kept on ice withintermittent vortexing and centrifuged at 18000×g for 30 min at 4 °C.¹³C₆-phenylalanine (3 μl at 250 ng/μl) was added as internal standard to samples.The supernatant was divided into 4 aliquots and dried down using a stream of nitrogen gas for analysis on a QuadrupleTime-of-Flight Mass Spectrometer (Agilent Technologies 6550 Q-TOF) coupled with an Ultra High Pressure Liquid Chromatograph(1290 Infinity UHPLC Agilent Technologies). Profiling data was acquired under both positive and negative electrosprayionization conditions over a mass range m/z of 100–1700 at a resolution of 10,000(separate runs) in scan mode. Metabolite separation was achieved using two columns of differing polarity, a hydrophilicinteraction column (HILIC, ethylene-bridged hybrid 2.1 x 150 mm, 1.7 μm; Waters) and a reversed-phase C18 column(high-strength silica 2.1 × 150 mm, 1.8 μm; Waters) with gradient described previously [–]. Run time was 18 min forHILIC and 27 min for C18 column using a flow rate of 400 μl/min. A total of four runs per sample were performed togive maximum coverage of metabolites. Samples were injected in duplicate, wherever necessary, and a pooled quality control(QC) sample, made up of all of the samples from each study was injected several times during a run. A separate plasma qualitycontrol (QC) sample was analyzed with pooled QC to account for analytical and instrumental variability. Dried samples werestored at −20 °C until analysis. Samples were reconstituted in running buffer and analyzed within 48 h ofreconstitution. Auto-MS/MS data was also acquired with pooled QC sample to aid in unknown compound identification usingfragmentation pattern.

2.5.6. Data analysis of untargeted metabolomics

Data alignment, filtering, univariate, multivariate statistical and differential analysis was performed using MassProfiler Professional (Agilent Inc, Santa Clara, CA, USA). Metabolites detected in at least ≥80% of one of twogroups were selected for differential expression analyses. Metabolite peak intensities and differential regulation ofmetabolites between groups were determined as described previously [,]. Each sample was normalized to the internal standard and log 2 transformed.Unpaired t-test with multiple testing correction p < 0.05 was used to find metabolites significantlydifferent between the two groups. Default settings were used with the exception of signal-to-noise ratio threshold (3), masslimit (0.0025 units), and time limit (9 s). Putative identification of each metabolite was done based on accurate mass(m/z) against METLIN database using a detection window of ≤ 7 ppm. Theputatively identified metabolites were annotated as Chemical Abstracts Service (CAS), Kyoto Encyclopedia of Genes and Genomes(KEGG), Human Metabolome Project (HMP) database, and LIPID MAPS identifiers [].

Significance testing of identified and unidentified compounds by sample type (tumor, adjacent, normal) was performedusing one-way ANOVA followed by TukeyHSD post hoc tests. P-value computations were asymptotic and the Benjamini-Hochbergcorrection was used for multiple testing. All samples, regardless of tumor location and stage were combined in ouranalyses.

2.6. Community metabolic modeling

To model the metabolism of the bacterial communities detected in the tissue samples, we used the MICOM tool, whichconstructs a steady-state community model based on a set of individual genome-scale metabolic models of bacteria and theirrelative abundances. Notably, a steady-state model may not provide the most effective representation of the dynamic gut community.However, intra-individual gut microbial composition and diversity is, in fact, relatively stable over the long-term (years)[]; thus, we used the model assumptions and proceeded with ouranalyses as follows: We used the AGORA set of individual metabolic models [] for bacteria. We matched the 16S rRNA gene sequence of each OTU of our dataset to the 16S rRNA genesidentified in the genomes used to construct the AGORA models. In the cases where the 16S rRNA genes were not identified in thegenomes, we either looked up the corresponding sequence in Genbank, or, if possible, re-assembled the genomes from publiclydeposited data using SPAdes 3.11.0 [] through Unicycler 0.4.1[]. For matching the OTU sequences to the AGORA 16S sequences, weused VSEARCH 2.4.3 [] with minimum identity of 97%, andparameters maxaccepts 0 and maxrejects 0, selecting the model sequence with highest identity. In case of multiple top matches, thefirst model in database order is selected. Then the model is assigned the relative abundance for the corresponding OTU in aparticular sample. For multiple OTUs mapping to a single model, their relative abundances were added. We also calculated theabundance fraction explained by the set of models. For each sample, we then proceed to run the MICOM tool inputting the models touse and their relative abundances. As media condition, we used a complete medium with the exchange flux for O2 set to zero(EX_o2_m = 0), to replicate anaerobic conditions. If present, we also knocked out the following reactions from anEscherichia coli model (Escherichia_coli_str_K_12_substr_MG1655): CYSDS, FXXRDO and SULR. MICOM finallyoptimized the fluxes using its “linear lagrangian” method, with the GNU Linear Programming Kit (GLPK) as thesolver backend, obtaining a table of microbe growth rates and reaction fluxes as output for analysis. Predicted hydrogen sulfideflux through the community metabolic models was quantified in tumor, adjacent, and normal tissue. Any samples that fell outside ofthe 95% confidence intervals of hydrogen sulfide flux were removed as outliers.

Pearson correlations were used to assess the direct relationship between predicted flux of hydrogen sulfide throughcommunity metabolic models and amino acid concentrations from metabolomics. Models were then queried to identify microbes andreactions responsible for hydrogen sulfide flux.

3. Results

3.1. Microbial composition and diversity

From 16S sequencing, we obtained an average of 41,271 and 5792 paired reads per sample before and after quality controlfiltering respectively. The microbial composition and diversity of colon tissue, stool, and mucosa were examined in relation tosample type: tumor, adjacent, or normal. Alpha-diversity, or within sample microbial diversity, was significantly lower in tumortissue samples as compared to adjacent and normal tissue samples (Fig. 1, Supp. Fig. 1). No distinct clustering was noted by sample type based on weighted(Fig. 2) or unweighted (Supp. Fig.2) principal coordinate analysis (PCoA) of beta-diversity. Despite this, 18 microbes were identified as differentiallyabundant based on sample type (Supp. Table 3). Notably, ahydrogen-sulfide-producing OTU identified as a Fusobacterium species was the only OTU significantly enriched inthe tumor samples. Fusobacterium along with Bacteroides fragilis were also identified assignificantly enriched in early stage (stage 1 and 2) tumor tissue samples as compared to late stage (stage 3 and 4) samples(Supp. Table 4). We next wanted to determine if the B.fragilis we were observing in our samples was toxigenic or not, as toxigenic B. fragilis has awell-established link to CRC [,]. qPCR for the bft gene on stool DNA samples identified toxigenic B. fragilis in 9 out of the98 (9.2%) individuals we tested (Supp. Table 5).

Alpha-diversity of stool, colon tissue, and mucosa samples collected on and off the tumor site (plotted mean with SEM; *p< 0.05, ***p < 0.0001; Kruskal-Wallis with Dunn’s post hoc tests; N = 107).

Beta-diversity (weighted UniFrac) of stool, colon tissue, and mucosa samples collected on and off the tumor site (N = 107individuals sampled).

3.2. Metabolite profiles

To examine the metabolic environment in the gut, we performed targeted and untargeted metabolomics. Despite relatively fewmicrobial compositional differences between tumor, adjacent, and normal samples, analyses of untargeted metabolomic profilesidentified clear metabolic differences in tumor tissue samples as compared to adjacent or normal tissue samples (Fig. 3). Over 124 identified compounds (Supp. Table6) and over 111 unidentified compounds (Supp. Table 7) werefound to be significantly enriched in tumor tissue.

Principal Component Analysis (PCA) of untargeted metabolomic profiles of colon tissue samples (N = 50 individuals). Stooland mucosa samples were not included in this analysis.

To test more specifically for hydrogen sulfide production, we quantified 4 amino acids (serine, homoserine, lanthionine,and cystathionine) concurrently produced with hydrogen sulfide (Fig. 4). These amino acidsare less volatile and easier to quantify than gaseous hydrogen sulfide. Of these metabolites, lanthionine and cystathionine arethe most specific proxies for hydrogen sulfide as serine and homoserine are also produced in other non-hydrogen-sulfide producingpathways. Lanthionine, cystathioine, and homoserine were all significantly enriched in tumor tissue (Fig. 5), suggesting increased hydrogen sulfide production on the tumor. We also quantified short chain fatty acid(SCFA) levels in colon tissue but did not find any significant differences in SCFAs on or off tumor (Supp. Fig. 3).

Hydrogen sulfide production pathways. This study quantified concentrations of the metabolites in red boxes as proxies for hydrogensulfide production. No standard was available for homolanthionine. Reprinted with permission of Caymen Chemical. (Forinterpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Amino acid concentrations on and off tumor tissue. These amino acids serve as proxies for hydrogen sulfide production (plotted meanwith SEM; **p < 0.01, ***p < 0.0001; Kruskal-Wallis with Dunn’s post hoctests; N = 44). Stool and mucosa samples were not included in this analysis.

3.3. Community metabolic modeling

To evaluate how microbial community metabolic interactions could be shaping the metabolic environment –particularly in relation to hydrogen sulfide production, we used 16S microbial composition and abundance data to constructgenome-scale metabolic models that represented the communities associated with tumor, normal, and adjacent tissue samples. Afterperforming flux balance analysis on the communities using MICOM, we then examined the results for hydrogen sulfide flux. Tumortissue samples exhibited significantly greater predicted hydrogen sulfide flux than normal but not adjacent samples (Fig. 6; with outliers – Supp. Fig.4). These results aligned with the metabolomics results – significantly increased concentrations of the aminoacid proxies for hydrogen sulfide on tumor tissue. To confirm this, we examined direct correlations between hydrogen sulfide flux(from the models) and amino acid concentrations in colon tissue (from metabolomics). Unexpectedly, we found that they were notcorrelated (Supp. Fig. 5). When we examined correlations between hydrogensulfide flux and amino acid abundance in Fusobacterium alone the correlations were significant (Supp. Fig. 6). We then queried the models to identify the predicted microbes andreactions involved in hydrogen sulfide production. The models predicted that Fusobacterium nucleatum,Clostridium perfringens, and Filifactor alocis (formerly Fusobacteriumalocis []) were significant hydrogen sulfide producers.Moreover, B. fragilis was observed to uptake and utilize this microbially-produced hydrogen sulfide in themodels. Additionally, the dominant reaction predicted for hydrogen sulfide production was the cysteine degradation pathwaycatalyzed by L-cysteine desulfhydrase (L-cysteine + H₂O − pyruvate + hydrogen sulfide +ammonium), a pathway that did not involve any of the amino acid proxies we use to measure hydrogen sulfide production viametabolomics. While many microbial species contain this pathway, the models attributed the majority of hydrogen sulfide flux toF. nucleatum, C. perfringens, and F. alocis.

Predicted hydrogen sulfide flux from colon tissue and mucosa microbial community metabolic models based on 16S relative abundances(*p < 0.05; Wilcoxon rank sum; n = 36 individuals, 241 samples). Three outliers that fell outside of the95% confidence intervals were removed from this figure. See Supp. Fig.4 for the figure with outliers present. Stool samples were not included in this analysis.

4. Discussion

Our results provide several key insights into the interplay between microbes and metabolites in colorectal cancer. First, thecombination of 16S and metabolomic data revealed that metabolite profiles but not microbial composition differed dramatically betweentumor, adjacent, and normal samples. This suggests that metabolites—or community function—may be more important toevaluate than individual microbes or microbial composition in CRC pathogenesis. While 2 microbes, a hydrogen sulfide producingFusobacterium species and B. fragilis, were significantly enriched in early stage tumor samples,several hundred metabolites were enriched in these same samples including amino acid proxies for hydrogen sulfide production(homoserine, lanthionine, cystathionine). These results suggest a role for hydrogen sulfide in CRC; although, the nature of that roleis unclear based on these data alone. On one hand, hydrogen sulfide produced by F. nucleatum or other microbialspecies may be promoting tumorigenesis by damaging host cells and DNA. Indeed, F. nucelatum has well establishedmechanistic links to CRC tumor development [,–], albeit, unrelated to hydrogen sulfide production. On theother hand, metabolomics profiles identify no source agent (host, microbe, microbial species). The community microbial metabolicmodels were able to predict specific microbial agents (F. nucleatum, C. perfringens, F. alocis) in relation to theincreased hydrogen sulfide flux. However, hydrogen sulfide production could also be attributed to host pathways that are not accountedfor in the models. In the host, hydrogen sulfide is generated as an anti-inflammatory agent; thus, it is feasible that this metaboliteis elevated in tumor samples due to host tissue reaction to tumor-induced inflammation [,]. Further work is necessaryto assess the source and role of hydrogen sulfide in relation to defined microbial communities and host tissue on and off tumor sites.Other factors that could potentially shape host or microbial community composition, expression, and interactions include tumorlocation and stage, patient medications (ranging from proton-pump inhibitors to antidepressants to statins and allergy medications)[,], smoking history[], BMI [], andsurgical preparation (a subset of patients received MiraLAX preparation) [].We were unable to control for all of these variables due to lack of power; however, despite the variation in the patient populationand tumor status, we detected consistent and significant differences in microbial and metabolomic profiles between tumor, adjacent,and normal colon. Additionally, this study design allowed each patient to serve as a control for him/herself as tumor, normal, andadjacent tissue within the same individual were all subject to the same factors (smoking, BMI, medications, surgical preparation).

A second key insight we gained from this work was the value of using community metabolic models integrated with multi-omicdata. The models allowed us to predict hydrogen sulfide flux through defined microbial communities and allowed us to examine thereactions, microbes, and microbial interactions underlying this flux. We made several important observations based on the models.First, the models predicted a feasible microbial basis for the increased hydrogen sulfide production in tumor tissue—aconclusion that was impossible to reach with 16S or metabolomic data alone. Although further in vitro and invivo testing is required to confirm these predictions; ultimately, effective in silico modeling mayprovide us an alternate means of prediction when such testing is limited either by resources or by ethical considerations in humansubjects. Second, metabolomic data appeared to serve as a strong validation of model predictions with both models and metabolomicsidentifying increased hydrogen sulfide production on tumor tissue. However, when we examined the direct relationship betweenmetabolite profiles and hydrogen sulfide flux, we found no significant correlations until we narrowed our examination toFusobacterium alone. This highlights the need to synthesize community metabolic modeling with 16S data forinterpreting metabolomics results. This approach lead us to examine hydrogen sulfide production more closely in the models. Weidentified 2 other microbes (C. perfringens and F. alocis) and a reaction (cysteine degradationcatalyzed by L-cysteine desulfhydrase) responsible for a large portion of the hydrogen sulfide production in the metabolic models.Both C. perfringens and F. alocis have demonstrated hydrogen sulfide production experimentally[–]. Clostridial specieshave additionally been associated with CRC in previous studies [,]. However, neither of these microbes was differentially abundant in 16S tumor samples, but themicrobes could be functioning in different ways on and off the tumor site without varying in abundance, again underscoring theimportance of evaluating microbial metabolic function over single microbes or microbial composition. The L-cysteine desulfhydrasereaction was also an important discovery because it revealed that our metabolomic data potentially underestimated hydrogen sulfideproduction as we did not quantify any proxies for hydrogen sulfide in this pathway. A third important observation we made based on themodels was a unique metabolic interaction involving hydrogen sulfide utilization by B. fragilis. The models predictedthat if hydrogen-sulfide producing bacteria were present, B. fragilis and other Bacteroides specieswould consume that hydrogen sulfide; an interaction, to our knowledge, heretofore unidentified and unstudied. While multiple studieshave demonstrated cross-feeding between Bacteroides species and other members of the gut microbialcommunity—including sulfate-reducing bacteria [,]—no studies, to our knowledge, have demonstrated cross-feeding or metabolic exchange inrelation to F. nucleatum, C. perfringins, F. alocis or hydrogen sulfide. In fact,the only microbes that have been identified as hydrogen sulfide utilizers appear to be environmental microbes [79–81]. As such, the validity of this predictedinteraction, while intriguing, requires further investigation. Interestingly, both F. nucleatum and toxigenicB. fragilis are reported to be independent promoters of CRC tumorigenesis [,–]. It is unknown what the effects of these interacting microbes may have together ontumorigenesis. Both B. fragilis and Fusobacterium were also identified in our 16S data and both weresignificantly enriched in early stage CRC. We further confirmed that the toxigenic strain of B. fragilis was presentin 9% of the 98 individuals tested via qPCR. Co-culture experiments tracking hydrogen sulfide production and utilizationbetween B. fragilis and Fusobacterium and in vivo testing of these microbes aloneand together in gnotobiotic models could provide further insight into if and how these 2 oncogenic microbes interact.

As to why there was no correlation between models and metabolomics, there are a few possibilities:

We were comparing apples to oranges and ideally should have measured and correlated hydrogen sulfide concentrationsagainst hydrogen sulfide flux.
The models are inaccurate. Models are only as good as their genome annotations, and many genomes can includeinaccurate or missing annotations resulting in added or deleted reactions that fail to represent biological reality.Additionally, model assumptions, like the steady state assumption in MICOM, may fail to capture the changing dynamics of thecolorectal cancer microbial community. Our assumption that hydrogen sulfide flux accurately predicts hydrogen sulfideproduction may also be flawed. Finally, these models do not include biological elements like host, diet, environment, ortranscriptional regulation, all of which provide important context in the metabolic interactions of the gut. The models mayhave coincidentally predicted increased hydrogen sulfide flux in tumor tissue due to a heavy reliance on the cysteinedegradation pathway when perhaps this pathway is not as active as predicted in vivo.
The metabolomics are inaccurate. Perhaps the models accurately predicted high levels of hydrogen sulfide flux throughthe cysteine degradation pathway and our metabolomics data underestimates hydrogen sulfide levels by not including a proxy inthis pathway. The metabolomics may have coincidentally predicted high levels of hydrogen sulfide production in tumor tissuedue to increased host-produced anti-inflammatory hydrogen sulfide rather than microbially-produced hydrogen sulfide.

Each of these scenarios can be isolated and tested in vitro and in vivo to determine wherebiological “truth” actually lies. Additionally, to determine if we can produce a better metabolomic-model correlation,we could select and evaluate other microbial metabolites that can be directly quantified in both metabolomics and models (withoutproxies), and that cannot be attributed to host production.

Our study demonstrates one way in which metabolic models can be synthesized with multi–omic data in a complementaryapproach to hypothesis testing. As the ability to model complex biological interactions improves and is validated through laboratoryexperimentation, future modeling efforts may serve as even better predictors for metabolic interactions and disease conditions. Modelsand the tools built around these models are rapidly growing and improving and may eventually account for elements of host[,83], diet, and transcriptionalregulation. Many community modeling approaches like MICOM assess flux, and here we use flux to approximate metabolite production.Further developments in modeling may eventually allow us to predict metabolite production more directly by incorporating varying ratesof enzyme activity. For example, in the following reaction, enzyme X catalyzes the reaction from metabolite A to B and enzyme Ycatalyzes the reaction from B to C.

If X has a faster reaction rate than Y, then metabolite B will accumulate in the system. If Y has a faster reaction rate thanX, then metabolite C will accumulate in the system. Incorporating and tracking these reaction rates could allow future models to infermetabolite profiles more effectively. Advances like this will improve our ability to maximize multi-omic data potential in combinationwith metabolic modeling.

5. Conclusions

Here, we used experimental data (16S) to inform genome-scale community metabolic models; models to test a hypothesis abouthydrogen sulfide production and CRC; experimental data (metabolomics) to assess model output; and modeling data to guide future workon specific metabolic reactions and microbial metabolic interactions. We draw 5 main conclusions from our work:

Hydrogen sulfide production is increased in CRC tumor samples, and this hydrogen sulfide has several potential sourcesincluding host or microbes predicted by both 16S and modeling data.
Microbial metabolic function or community interactions may be more important to evaluate than single microbes ormicrobial composition in CRC pathogenesis.
Metabolic modeling is a rapidly developing and inexpensive way to establish early predictions about community functionand interactions that can guide future work.
A word of warning: Modeling can fail to represent biological reality. Assess accuracy of models in your respectivesystems and be aware that models and reactions can be altered (i.e. added, removed) to improve accuracy.
Synthesis of multi-omic data and microbial community metabolic models allows maximization of data potential forhypothesis testing and hypothesis generation.

Supplementary Material

Acknowledgments

First and foremost, we thank the patients who volunteered for this study. Each measured step of our research is performed with thegreater goal of improving our ability to prevent, diagnose, and treat colorectal cancer, and that research would not be possiblewithout our patients. We also gratefully acknowledge all of the many other individuals who made this research possible – fromstudy coordinators to students, to surgeons, to program directors, to pathology assistants and laboratory technicians. We alsoacknowledge the reviewers for their helpful comments that improved this manuscript. Funding was provided by the NIH (R01CA179243; N.C.and V.L.H. and R01CA170357; L.B.), the Mayo Clinic Center for Cell Signaling in Gastroenterology (NIDDK P30DK084567), the Mayo ClinicMetabolomics Resource Core Pilot and Feasibility Award (U24DK100469), the Fred C. Andersen Foundation (H.N. and N.C.), and the MayoClinic Center for Individualized Medicine. O.R.A thanks the financial support coming from the National Institute of Genomic Medicine(INMEGEN) to develop the computational tool used for the microbiome analysis (MICOM).

Abbreviations

CRC	Colorectal Cancer
GEMs	Genome Scale Metabolic Models
GC–MS	Gas Chromatography–Mass Spectrometry
LC-MS	Liquid Chromatography–Mass Spectrometry
UPLC-MS	Ultra Pressure Liquid Chromatography–Mass Spectrometry
SCFA	Short Chain Fatty Acid

Appendix A. Supplementary data

Supplementary data associated with this article can be found, inthe online version, at http://dx.doi.org/10.1016/j.ymeth.2018.04.024.

References

1. Jenior ML, Leslie JL, Young VB, Schloss PD. Clostridium difficile colonizes alternative nutrient niches during infection across distinct murine gutmicrobiomes. mSystems. 2017;2(4):e00063–17.[PMC free article] [PubMed] [Google Scholar]

2. Sung J, Kim S, Cabatbat JJT, Jang S, Jin YS, Jung GY, Chia N, Kim PJ. Global metabolic interaction network of the human gut microbiota for context-specific community-scaleanalysis. Nat Commun. 2017;8:15393.[PMC free article] [PubMed] [Google Scholar]

3. Koeth RA, Wang Z, Levison BS, Buffa JA, Org E, Sheehy BT, Britt EB, Fu X, Wu Y, Li L, Smith JD, DiDonato JA, Chen J, Li H, Wu GD, Lewis JD, Warrier M, Brown JM, Krauss RM, Tang WHW, Bushman FD, Lusis AJ, Hazen SL. Intestinal microbiota metabolism of l-carnitine, a nutrient in red meat, promotes atherosclerosis. Nat Med. 2013;19(5):576–585.[PMC free article] [PubMed] [Google Scholar]

4. Gomez A, Petrzelkova K, Yeoman CJ, Vlckova K, Mrázek J, Koppova I, Carbonero F, Ulanov A, Modry D, Todd A, Torralba M, Nelson KE, Gaskins HR, Wilson B, Stumpf RM, White BA, Leigh SR. Gut microbiome composition and metabolomic profiles of wild western lowland gorillas (Gorilla gorilla gorilla)reflect host ecology. Mol Ecol. 2015;24(10):2551–2565. [PubMed] [Google Scholar]

5. Theriot CM, Koenigsknecht MJ, Carlson PE, Hatton GE, Nelson AM, Li B, Huffnagle GB, Li J, Young VB. Antibiotic-induced shifts in the mouse gut mi-crobiome and metabolome increase susceptibility to Clostridiumdifficile infection. Nat Commun. 2014;5:3114.[PMC free article] [PubMed] [Google Scholar]

6. Antharam VC, McEwen DC, Garrett TJ, Dossey AT, Li EC, Kozlov AN, Mesbah Z, Wang GP. An integrated metabolomic and microbiome analysis identified specific gut microbiota associated with fecalcholesterol and coprostanol in clostridium difficile infection. PLOS ONE. 2016;11(2):e0148824.[PMC free article] [PubMed] [Google Scholar]

7. Franzosa EA, Morgan XC, Segata N, Waldron L, Reyes J, Earl AM, Giannoukos G, Boylan MR, Ciulla D, Gevers D, Izard J, Garrett WS, Chan AT, Huttenhower C. Relating the metatranscriptome and metagenome of the human gut. Proc Nat Acad Sci. 2014;111(22):E2329–E2338.[PMC free article] [PubMed] [Google Scholar]

8. Kamke J, Kittelmann S, Soni P, Li Y, Tavendale M, Ganesh S, Janssen PH, Shi W, Froula J, Rubin EM, Attwood GT. Rumen metagenome and metatranscriptome analyses of low methane yield sheep reveals a Sharpea-enriched microbiomecharacterised by lactic acid formation and utilisation. Microbiome. 2016;4(1):56.[PMC free article] [PubMed] [Google Scholar]

9. Sung J, Hale V, Merkel AC, Kim PJ, Chia N. Metabolic modeling with Big Data and the gut microbiome. Appl Transl Genomics. 2016;10:10–15.[PMC free article] [PubMed] [Google Scholar]

10. American Cancer Society. Colorectal cancer facts & figures 2014–2016. 2014http://www.cancer.org/acs/groups/content/documents/document/acspc-042280.pdf.

11. Louis P, Hold GL, Flint HJ. The gut microbiota, bacterial metabolites and colorectal cancer. Nat Rev Micro. 2014;12(10):661–672. [PubMed] [Google Scholar]

12. Attene-Ramos MS, Nava GM, Muellner MG, Wagner ED, Plewa MJ, Gaskins HR. DNA damage and toxicogenomic analyses of hydrogen sulfide in human intestinal epithelial FHs 74 Intcells. Environ Mol Mutagen. 2010;51(4):304–314. [PubMed] [Google Scholar]

13. Attene-Ramos MS, Wagner ED, Gaskins HR, Plewa MJ. Hydrogen Sulfide Induces Direct Radical-Associated DNA Damage. Mol Cancer Res. 2007;5(5):455–459. [PubMed] [Google Scholar]