ABSTRACT
Microbes are critical in carbon and nutrient cycling in freshwater ecosystems. Members of the Verrucomicrobia are ubiquitous in such systems, yet their roles and ecophysiology are not well understood. In this study, we recovered 19 Verrucomicrobia draft genomes by sequencing 184 time-series metagenomes from a eutrophic lake and a humic bog that differ in carbon source and nutrient availabilities. These genomes span four of the seven previously defined Verrucomicrobia subdivisions, and greatly expand the known genomic diversity of this freshwater lineage. Genome analysis revealed their role as (poly)saccharide-degraders in freshwater, uncovered interesting genomic features for this life style, and suggested their adaptation to nutrient availabilities in their environments. Between the two lakes, Verrucomicrobia populations differ significantly in glycoside hydrolase gene abundance and functional profiles, reflecting the autochthonous and terrestrially-derived allochthonous carbon sources of the two ecosystems, respectively. Several bog populations exhibited nitrogen cost minimization in their proteomes and genomes, which is likely an adaptation to long-term nitrogen limitation in the bog. Interestingly, a number of genomes recovered from the bog contained gene clusters that potentially encode a novel porin-multiheme cytochrome c complex and might be involved in extracellular electron transfer in the anoxic humic-rich environment. Notably, most epilimnion genomes have large numbers of Planctomycete-specific cytochrome c-containing genes, which exhibited distribution patterns nearly opposite to those of glycoside hydrolase genes, probably associated with the different environmental oxygen availability and carbohydrate complexity between lakes/layers. Overall, the recovered genomes are a major step towards understanding the role, ecophysiology and distribution of Verrucomicrobia in freshwater.
INTRODUCTION
Microbes play important roles in mediating carbon (C) and nutrient cycling in freshwater ecosystems. Among them are the ubiquitous Verrucomicrobia, which exhibit a cosmopolitan distribution in freshwater lakes. For example, they were present in 90% of 81 studied lakes (Zwart et al 2003). Verrucomicrobia abundances often range from <1% to 6% of the total microbial community (Eiler and Bertilsson 2004, Newton et al 2011, Parveen et al 2013), and contributed up to 19% in a humic lake (Arnds et al 2010). Yet, in comparison to other freshwater bacterial groups, such as members of the Actinobacteria, Cyanobacteria and Proteobacteria phyla, Verrucomicrobia have received relatively little attention, and their functions and ecophysiology in freshwater are not well understood.
As a phylum, Verrucomicrobia (V) was first proposed relatively recently, in 1997 (Hedlund et al 1997). Together with Planctomycetes (P), Chlamydiae (C), and sister phyla such as Lentisphaerae, they comprise the PVC superphylum. In addition to being cosmopolitan in freshwater, Verrucomicrobia have been found in oceans (Yoon et al 2007, Yoon et al 2008), soil (Sangwan et al 2005), wetlands (Qiu et al 2014), rhizosphere (da Rocha et al 2010), and animal guts (Derrien et al 2004, Wertz et al 2012), as free-living organisms or symbionts of eukaryotes. Verrucomicrobia isolates are metabolically diverse, including aerobes, facultative anaerobes, and obligate anaerobes, and they are mostly heterotrophs, using various mono-, oligo-, and poly-saccharides for growth (Chin et al 2001, Derrien et al 2004, Hedlund et al 1996, Hedlund et al 1997, Otsuka et al 2013a, Otsuka et al 2013b, Qiu et al 2014, Sangwan et al 2004, Scheuermayer et al 2006, Yoon et al 2007). Not long ago an autotrophic verrucomicrobial methanotroph (Methylacidiphilum fumariolicum SolV) was discovered in acidic thermophilic environments as the only non-proteobacterial aerobic methanotroph to date (Pol et al 2007).
In marine environments, Verrucomicrobia are also ubiquitous (Freitas et al 2012) and suggested to have a key role as polysaccharide degraders (Cardman et al 2014, Martinez-Garcia et al 2012a). Genomic insights gained through sequencing single cells (Martinez-Garcia et al 2012a) or extracting Verrucomicrobia bins from metagenomes (Herlemann et al 2013) have revealed high abundances of glycoside hydrolase genes, providing more evidence for their critical roles in C cycling in marine environments.
In freshwater, Verrucomicrobia have been suggested to degrade glycolate (Paver and Kent 2010) and polysaccharides (Martinez-Garcia et al 2012a). The abundance of some phylum members was favored by high nutrient availabilities (Haukka et al 2006, Lindström et al 2004), algal blooms (Kolmonen et al 2004), low pH, high temperature, high hydraulic retention time (Lindström et al 2005), and more labile DOC (Arnds et al 2010). To date, there are very few freshwater Verrucomicrobia isolates, including Verrucomicrobium spinosum (Schlesner 1987) and several Prosthecobacter spp. (Hedlund et al 1997). Physiological studies showed that they are aerobes, primarily using carbohydrates for growth, but not amino acids or alcohols, and only rarely organic acids. However, these few cultured isolates only represent a single clade within subdivision 1. By contrast, 16S rRNA gene based studies discovered a much wider phylogenetic range of freshwater Verrucomicrobia, including subdivisions 1, 2, 3, 4, 5, and 6 (Arnds et al 2010, Eiler and Bertilsson 2004, Martinez-Garcia et al 2012a, Parveen et al 2013, Zwart et al 1998). Due to the very few cultured representatives and few available genomes from this freshwater lineage, the ecological functions of the vast uncultured freshwater Verrucomicrobia are largely unknown.
In this study, we sequenced a total of 184 metagenomes in a time-series study of two lakes with contrasting characteristics, particularly differing in C source, nutrient availabilities, and pH. We recovered a total of 19 Verrucomicrobia draft genomes spanning subdivisions 1, 2, 3, and 4 of the seven previously defined Verrucomicrobia subdivisions. We inferred their metabolisms, revealed their adaptation to C and nutrient conditions, and uncovered some interesting and novel features, including a novel putative porin-multiheme cytochrome system that may be involved in extracellular electron transfer. The insights gained advance our understanding of the ecophysiology, roles in C cycling, and ecological niches of this ubiquitous freshwater bacterial group.
MATERIALS AND METHODS
Study sites
Samples for metagenome sequencing were collected from two temperate lakes in Wisconsin, USA, Lake Mendota and Trout Bog Lake, during ice-off periods of each year (May to November). Mendota is an urban eutrophic lake with most of its C being autochthonous (in-lake produced), whereas Trout Bog is a small, acidic and nutrient-poor dystrophic lake with mostly terrestrially-derived (allochthonous) C. General lake characteristics are summarized in Table 1.
Sampling
For Mendota, we collected depth-integrated water samples from the surface 12 m (mostly consisting of the epilimnion layer) at 94 time points from 2008 to 2012, and samples were referred to as “ME” (Garcia et al 2016). For Trout Bog, we collected the integrated hypolimnion layer at 45 time points from 2007 to 2009 and the integrated epilimnion layer at 45 time points from 2007 to 2009, and samples were referred to as “TH” and “TE”, respectively (Bendall et al 2016). All samples were filtered through 0.22 μm polyethersulfone filters and stored at −80°C until extraction. DNA was extracted from the filters using the FastDNA kit (MP Biomedicals) according to the manufacturer’s instructions, with some minor modifications as described previously (Shade et al 2008).
Metagenome sequencing, assembly, and draft genome recovery
Details of metagenome sequencing, assembly, and binning were described in Bendall et al. (2016) and Hamilton et al (preprint). Briefly, shotgun Illumina HiSeq 2500 metagenome libraries were constructed for each of the DNA samples. Three combined assemblies were generated by co-assembling reads from all metagenomes within the ME, TE, and TH groups, respectively. Binning was conducted on the three combined assemblies to recover “metagenome-assembled genomes” (MAGs) based on the combination of contig tetranucleotide frequency and differential coverage patterns across time points using MetaBAT (Kang et al 2015). Subsequent manual curation of MAGs was conducted to remove contigs that did not correlate well with the median temporal abundance pattern of all contigs within a MAG, as described in Bendall et al. (2016).
Genome annotation and completeness estimation
MAGs were submitted to the DOE Joint Genome Institute’s Integrated Microbial Genome (IMG) database for gene prediction and function annotation (Markowitz et al 2013). The IMG Taxon Object IDs for Verrucomicrobia MAGs are listed in Table 2. The completeness of each MAG was estimated based on the number of recovered single-copy essential genes compared to the total of 105 single-copy essential genes expected for complete Verrucomicrobia genomes (Albertsen et al 2013). MAGs with an estimated completeness lower than 50% were not included in this study.
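The completeness estimate and the 50% inclusion cutoff described above can be illustrated with a minimal Python sketch. The function names and the marker identifiers passed in are hypothetical; only the total of 105 expected single-copy essential genes and the 50% threshold come from the text, and treating a MAG at exactly 50% as retained is an assumption.

```python
# Sketch of the completeness estimate: recovered single-copy essential
# genes divided by the 105 expected for a complete genome.
EXPECTED_MARKERS = 105


def estimate_completeness(recovered_markers, expected=EXPECTED_MARKERS):
    """Return completeness (%) from the set of distinct marker genes found in a MAG."""
    # A set is used so that a marker recovered more than once is counted once.
    return 100.0 * len(set(recovered_markers)) / expected


def passes_cutoff(completeness, minimum=50.0):
    """MAGs below the minimum completeness are excluded (boundary handling assumed)."""
    return completeness >= minimum


# Hypothetical MAG with 84 distinct marker genes recovered.
example_markers = {f"marker_{i}" for i in range(84)}
example_completeness = estimate_completeness(example_markers)
```

A MAG with 84 of the 105 markers would be retained, whereas one with 50 markers (about 48%) would be excluded.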
Taxonomic and phylogenetic analysis
A total of 19 MAGs were classified to the Verrucomicrobia phylum based on taxonomic assignment by PhyloSift using 37 conserved phylogenetic marker genes (Darling et al 2014), as described in Bendall et al. (2016). A phylogenetic tree was reconstructed from the 19 Verrucomicrobia MAGs and 24 reference genomes using an alignment concatenated from individual protein alignments of five conserved essential single-copy genes (represented by TIGR01391, TIGR01011, TIGR00663, TIGR00460, and TIGR00362) that were recovered in all Verrucomicrobia MAGs. Individual alignments were first generated with MUSCLE (Edgar 2004), concatenated, and trimmed to exclude columns that contain gaps for more than 30% of all sequences. A maximum likelihood phylogenetic tree was constructed using PhyML 3.0 (Guindon et al 2010), with the LG substitution model and the gamma distribution parameter estimated by PhyML. Bootstrap values were calculated based on 100 replicates. Kiritimatiella glycovorans L21-Fru-AB was used as an outgroup in the phylogenetic tree. This bacterium was initially designated as the first (and so far the only) cultured representative of Verrucomicrobia subdivision 5. However, this subdivision was later proposed as a novel sister phylum associated with Verrucomicrobia (Spring et al 2016), making it an ideal outgroup for this analysis.
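The concatenation and gap-trimming step can be sketched as follows. This is an illustration under stated assumptions rather than the script used in the study: alignments are represented as plain dictionaries mapping a genome name to its aligned sequence, "-" is assumed to be the only gap character, and the helper names are hypothetical.

```python
# Sketch: concatenate per-gene protein alignments, then drop columns in
# which more than 30% of the sequences contain a gap.

def concatenate_alignments(alignments):
    """Join per-gene alignments (each a dict of name -> aligned sequence) end to end."""
    names = list(alignments[0])
    for aln in alignments:
        if set(aln) != set(names):
            raise ValueError("every alignment must contain the same genomes")
    return {name: "".join(aln[name] for aln in alignments) for name in names}


def trim_gappy_columns(alignment, max_gap_fraction=0.30):
    """Keep only columns whose gap fraction does not exceed max_gap_fraction."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(length)
        if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy example with two hypothetical gene alignments for four genomes.
gene1 = {"a": "AC", "b": "AC", "c": "A-", "d": "AC"}
gene2 = {"a": "-T", "b": "-T", "c": "GT", "d": "GT"}
trimmed = trim_gappy_columns(concatenate_alignments([gene1, gene2]))
```

In the toy example, the third concatenated column has gaps in two of four sequences (50%) and is removed, while the second column (25% gaps) is retained.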
Estimate of metabolic potential
IMG provides functional annotation based on KO (KEGG orthology) term, COG (cluster of orthologous group), pfam, and TIGRfam. To estimate metabolic potential, we primarily used KO terms due to their direct link to KEGG pathways. COG, pfam, and TIGRfam were also used when KO terms are not available for a function. Pathways are primarily reconstructed according to KEGG modules, and MetaCyc pathway is used if a KEGG module is not available for a pathway. As these MAGs are incomplete genomes, a fraction of genes in a pathway may be missing due to genome incompleteness. Therefore, we estimated the completeness of a pathway as the fraction of recovered enzymes in that pathway (e.g. a pathway is 100% complete if all enzymes in that pathway are encoded by genes recovered in a MAG). As some genes are shared by multiple pathways, signature genes specific for a pathway were used to indicate the presence of a pathway. If signature genes for a pathway were missing in all MAGs, that pathway was likely absent in all genomes. Based on this, we established criteria for estimating pathway completeness in each MAG. If a signature gene in a pathway was present, we report the percentage of genes in the pathway that we found. If a signature gene was absent in a MAG, but present in at least one third of all MAGs (i.e. >=7), we still report the pathway completeness for that MAG in order to account for genome incompleteness. Otherwise, we considered the pathway to be absent (i.e. completeness is 0%).
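The decision rule for pathway completeness can be written out as a short Python sketch. The data layout (a dictionary of MAG name to its set of annotated gene identifiers) and all identifiers are hypothetical, and the sketch assumes that the presence of any one signature gene counts as the signature being present.

```python
# Sketch of the pathway completeness criteria:
#  1. signature gene present in the MAG -> report % of pathway genes found;
#  2. signature absent in the MAG but present in at least one third of all
#     MAGs -> still report % of pathway genes found;
#  3. otherwise -> the pathway is considered absent (0%).

def pathway_completeness(mag_genes, pathway_genes, signature_genes):
    """Return {MAG name: pathway completeness in %} following the rules above."""
    n_with_signature = sum(
        1 for genes in mag_genes.values() if genes & signature_genes
    )
    # "At least one third of all MAGs": for 19 MAGs this means >= 7.
    widespread = 3 * n_with_signature >= len(mag_genes)

    result = {}
    for mag, genes in mag_genes.items():
        has_signature = bool(genes & signature_genes)
        if has_signature or widespread:
            result[mag] = 100.0 * len(genes & pathway_genes) / len(pathway_genes)
        else:
            result[mag] = 0.0
    return result


# Toy example with hypothetical gene identifiers.
pathway = {"geneA", "geneB", "geneC", "geneD"}
signature = {"geneA"}
mags = {
    "MAG1": {"geneA", "geneB"},
    "MAG2": {"geneB", "geneC"},
    "MAG3": {"geneX"},
    "MAG4": {"geneY"},
}
completeness = pathway_completeness(mags, pathway, signature)
```

Here only one of four MAGs carries the signature gene, so MAG2 is scored as 0% even though it encodes two non-signature pathway genes.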
Glycoside hydrolase identification
Glycoside hydrolase (GH) genes were identified using the dbCAN annotation tool (http://csbl.bmb.uga.edu/dbCAN/annotate.php) (Yin et al 2012) using HMMER search against hidden Markov models (HMMs) built for all GHs, with an E-value cutoff of 1e-7, except GH109, for which we found that the HMM used by dbCAN is not specific for this GH. To identify verrucomicrobial GH109, BLASTP was performed using GH109 sequences from verrucomicrobial Akkermansia muciniphila ATCC BAA-835 listed in the CAZy database (http://www.cazy.org), with E-value cutoff of 1e-6 and query sequence coverage cutoff of 50%.
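As an illustration of the GH109 BLASTP filtering, the sketch below applies the two cutoffs to already-parsed hits. The `Hit` record and its field names are hypothetical, query coverage is assumed to be the aligned span of the query divided by the query length, and treating hits exactly at a cutoff as passing is also an assumption.

```python
# Sketch: keep BLASTP hits with E-value <= 1e-6 and query coverage >= 50%.
from dataclasses import dataclass


@dataclass
class Hit:
    query: str      # reference GH109 sequence used as query
    subject: str    # protein from a MAG
    evalue: float
    q_start: int    # 1-based start of the alignment on the query
    q_end: int      # 1-based end of the alignment on the query
    q_len: int      # full length of the query sequence


def query_coverage(hit):
    """Fraction of the query sequence covered by the alignment."""
    return (hit.q_end - hit.q_start + 1) / hit.q_len


def filter_hits(hits, max_evalue=1e-6, min_coverage=0.5):
    """Return the hits passing both the E-value and query coverage cutoffs."""
    return [
        h for h in hits
        if h.evalue <= max_evalue and query_coverage(h) >= min_coverage
    ]


# Hypothetical hits: only the first passes both cutoffs.
hits = [
    Hit("GH109_ref", "protein_1", 1e-30, 1, 300, 400),
    Hit("GH109_ref", "protein_2", 1e-30, 1, 100, 400),
    Hit("GH109_ref", "protein_3", 1e-3, 1, 400, 400),
]
kept = filter_hits(hits)
```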
Other bioinformatic analyses
Protein cellular location was predicted using CELLO v.2.5 (http://cello.life.nctu.edu.tw) (Yu et al 2006) and PSORTb v.3.0 (http://www.psort.org/psortb) (Yu et al 2010). The beta-barrel structure of outer membrane proteins was predicted by PRED-TMBB (http://bioinformatics.biol.uoa.gr//PRED-TMBB) (Bagos et al 2004).
RESULTS AND DISCUSSION
Comparison of the two lakes
The two studied lakes exhibited contrasting characteristics (Table 1). The most notable differences are in primary C source and nutrient availability. Mendota is an urban eutrophic lake with most of its C being autochthonous (in-lake produced through photosynthesis). By contrast, Trout Bog is a nutrient-poor dystrophic lake, surrounded by boreal forests and sphagnum mats, thus receiving large amounts of terrestrially-derived allochthonous C that is rich in humic and fulvic acids. Compared to Mendota, Trout Bog features higher DOC levels, but is more limited in nutrient availability, with much higher DOC:TN and DOC:TP ratios (Table 1). Nutrient limitation in Trout Bog is even more extreme than revealed by these ratios because much of the N and P is tied up in complex dissolved organic matter. In addition, Trout Bog has lower oxygenic photosynthesis due to decreased photosynthetically active radiation (PAR) as a result of absorption by DOC (Read and Rose 2013). Together with the high consumption of dissolved oxygen by heterotrophic respiration, oxygen levels decrease quickly with depth in the water column in Trout Bog. Dissolved oxygen levels are below detection in the hypolimnion nearly year-round (Shade et al 2008). Due to these contrasts, we expected to observe differences in bacterial C and nutrient use, as well as differences reflecting the electron acceptor conditions between these two lakes. Hence, the retrieval of numerous Verrucomicrobia draft genomes in the two lakes not only allows the revelation of their general functions in freshwater, but also provides an opportunity to study their ecophysiological adaptation associated with the local environmental differences.
Verrucomicrobia draft genome retrieval and their distribution patterns
Using the binning facilitated by tetranucleotide frequency and relative abundance patterns over time, a total of 19 Verrucomicrobia MAGs were obtained, including eight from ME, three from TE, and eight from TH (Table 2). The 19 MAGs exhibited a clustering of their tetranucleotide frequency largely based on the two lakes (Figure S1), suggesting distinct overall genomic signatures associated with each system.
Genome completeness of the 19 MAGs ranged from 53% to 96%, as determined based on the 105 single-copy essential genes expected for complete Verrucomicrobia genomes (Albertsen et al 2013). We performed phylogenetic analysis of these MAGs using a concatenated alignment of their conserved genes, and found that they span a wide phylogenetic spectrum and fall within subdivisions 1, 2, 3, and 4 of the seven previously defined Verrucomicrobia subdivisions (Arnds et al 2010, Pol et al 2007, Schlesner et al 2006) (Figure 1), with three MAGs remaining unclassified within Verrucomicrobia. In fact, the phylogenetic diversity of Verrucomicrobia in these two lakes is higher than that represented by these 19 MAGs. For example, an un-binned contig from the Mendota metagenome is likely from subdivision 6 representing the freshwater clade LD19 (Zwart et al 2003), as its genes share an average amino acid identity of 89% with previously recovered MAGs from subdivision 6 (Hugerth et al 2015). Notably, this contig contains a gene encoding proteorhodopsin, a light-driven proton pump, consistent with a previously recovered rhodopsin gene from a freshwater Verrucomicrobia single cell genome (Martinez-Garcia et al 2012b). However, no rhodopsin gene was observed in any of the 19 higher-quality Verrucomicrobia MAGs that were analyzed extensively in this study. Nevertheless, we restricted our analysis to the 19 MAGs based on their high quality, while at the same time acknowledging that they do not cover all the genomic diversity of Verrucomicrobia populations in our study systems.
Presently available freshwater Verrucomicrobia isolates are restricted to subdivision 1. The recovered MAGs allow the inference of metabolisms and ecology of a considerable diversity within uncultured freshwater Verrucomicrobia. Notably, all MAGs from subdivision 3 were recovered from TH, and all MAGs from subdivision 1, except TH2746, were from the epilimnion (either ME or TE), indicating differences in phylogenetic distribution between lakes and between layers within a lake.
We used normalized fold coverage of MAGs within individual metagenomes to comparatively infer relative population abundance (see detailed coverage depth estimation in Supplementary Text). Briefly, we counted the number of reads mapping with a minimum identity of 95% and calculated a relative abundance for each MAG based on coverage depth per contig and several normalization steps. Thus, we assume that each MAG represents a distinct population within the lake-layer from which it was recovered (Bendall et al 2016, Garcia et al 2016). This estimate does not directly indicate the actual relative abundance of these populations within the total community per se; rather it allows us to compare population abundance levels from different lakes and sampling occasions within the set of 19 MAGs. This analysis indicates that Verrucomicrobia populations in Trout Bog were proportionally more abundant and persistent over time compared to those in Mendota (Figure 2). Verrucomicrobia populations in Mendota were only transiently abundant, bloomed once to a few times during the sampling season and diminished to extremely low levels for the remainder of the year.
Saccharolytic life style and adaptation to different C sources
Verrucomicrobia isolates from different environments are known to grow on various mono-, oligo-, and poly-saccharides, but are unable to grow on amino acids, alcohols, or most organic acids (Chin et al 2001, Derrien et al 2004, Hedlund et al 1996, Hedlund et al 1997, Otsuka et al 2013a, Otsuka et al 2013b, Qiu et al 2014, Sangwan et al 2004, Scheuermayer et al 2006, Shieh and Jean 1998, Yoon et al 2007). Culture-independent research suggests that marine Verrucomicrobia are candidate polysaccharide degraders with large numbers of genes involved in polysaccharide utilization (Cardman et al 2014, Herlemann et al 2013, Martinez-Garcia et al 2012a).
In the 19 Verrucomicrobia MAGs, we observed rich arrays of glycoside hydrolase (GH) genes, representing a total of 78 different GH families acting on diverse polysaccharides (Figure S2). As these genomes have different degrees of completeness, to compare among them, we normalized GH occurrence frequencies by the total number of genes in each MAG to estimate the percentage of genes annotated as GHs (i.e. GH coding density), which ranged from 0.4% to 4.9% for these MAGs (Figure 3a). In general, GH coding density was higher in Trout Bog MAGs than in Mendota MAGs. Notably, six TH MAGs had extremely high (~4%) GH coding densities (Figure 3a), with each MAG harboring 119-239 GH genes, representing 36-59 different GH families (Figures 4 & S2). Although GH coding density in most ME genomes in subdivisions 1 and 2 was relatively low (0.4-1.6%), it was still higher than in many other bacterial groups (Martinez-Garcia et al 2012a).
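The GH coding density normalization described above amounts to a simple ratio, sketched below. The function name and the example gene counts are hypothetical and are not values from Table 2.

```python
# Sketch: GH coding density = percentage of all predicted genes in a MAG
# that are annotated as glycoside hydrolases.

def gh_coding_density(n_gh_genes, n_total_genes):
    """Return the percentage of genes annotated as GHs."""
    if n_total_genes <= 0:
        raise ValueError("total gene count must be positive")
    return 100.0 * n_gh_genes / n_total_genes


# Hypothetical MAG with 50 GH genes among 2,500 predicted genes.
example_density = gh_coding_density(50, 2500)
```

Normalizing by the number of genes actually recovered in each MAG, rather than comparing raw GH counts, is what makes MAGs of different completeness comparable.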
The GH abundance and diversity within a genome may determine the width of the substrate spectrum and/or the complexity of carbohydrates used by that organism. For example, there are 20 GH genes in the Rubritalea marina genome, and this marine verrucomicrobial aerobe only uses a limited spectrum of carbohydrate monomers and dimers, but not the majority of (poly)saccharides tested (Scheuermayer et al 2006). By contrast, 164 GH genes are present in the Opitutus terrae genome, and this soil verrucomicrobial anaerobe can thus grow on a wider range of mono-, di- and poly-saccharides (Chin et al 2001). Therefore, it is plausible that the GH-rich Trout Bog Verrucomicrobia populations may be able to use a wider range of more complex polysaccharides than the Mendota populations.
The 10 most abundant GH families in these Verrucomicrobia MAGs include GH2, 29, 78, 95, and 106 (Figure 4). These specific GHs were absent or at very low abundances in marine Verrucomicrobia genomes (Herlemann et al 2013, Martinez-Garcia et al 2012a), suggesting a general difference in carbohydrate substrate use between freshwater and marine Verrucomicrobia. Hierarchical clustering of MAGs based on overall GH abundance profiles indicated a grouping pattern largely separated by lake (Figure S3). Prominently over-represented GHs in most Trout Bog MAGs include GH29, 78, 95, and 106, all of which mainly function as α-L-fucosidases or α-L-rhamnosidases, as well as GH2, a β-galactosidase that also acts on other β-linked dimers. By contrast, over-represented GHs in the Mendota MAGs are GH13, 20, 33, 57, and 77. Among them, GH13 and 57 are α-amylases, and GH20 comprises β-hexosaminidases, clearly exhibiting different substrate spectra from GHs over-represented in the Trout Bog MAGs. Therefore, the patterns in GH functional profiles may suggest varied carbohydrate substrate preferences and ecological niches occupied by Verrucomicrobia, probably reflecting the different carbohydrate composition derived from different sources between Mendota and Trout Bog.
A particularly interesting contrast of GH genes was observed between ME3880 and TH2746. These two populations were phylogenetically close relatives in subdivision 1 (Figure 1), and had an average nucleotide identity of 74% among the 654 bi-directional best hits between the two genomes. Nevertheless, their estimated genome sizes differ substantially (Table 2). Notably, TH2746 possesses a total of 239 GH genes, whereas ME3880 has only 17 GH genes, despite the two MAGs having comparably high levels of completeness. Based on the GH diversity and abundance profile (Figures 3a, 4, and S3), TH2746 is an outlier of subdivision 1, and instead shares more similarity to other TH genomes in subdivision 3. Therefore, the genome of this subdivision 1 Verrucomicrobia population might have expanded as an adaptation to the carbohydrate substrate composition in Trout Bog.
Overall, GH diversity and abundance profile may reflect the DOC availability, chemical variety and complexity, and may suggest microbial adaptation to different C sources in the two ecosystems. We speculate that the rich arrays of GH genes, and presumably broader substrate spectra of Trout Bog populations, partly contribute to their higher abundance and persistence over the sampling season (Figure 2), as they are less likely impacted by fluctuations of individual carbohydrates; whereas Mendota populations with fewer GHs and presumably more specific substrate spectra rely on autochthonous C and therefore exhibit a bloom-and-bust abundance pattern (Figure 2) that might be associated with algal blooms, as previously suggested (Kolmonen et al 2004). On the other hand, bogs also experience seasonal algal blooms (Kent et al 2004, Kent et al 2007) that introduce brief pulses of autochthonous C to these otherwise allochthonous-driven systems. Clearly, much remains to be learned about the routes through which C is metabolized by bacteria in such lakes, and comparative genomics is a novel way to use the organisms to tell us about C flow through the ecosystem.
Other genome features of the saccharide-degrading life style
Seven Verrucomicrobia MAGs spanning subdivisions 1, 2, 3, and 4 possess genes needed to construct bacterial microcompartments (BMCs), which are quite rare among studied bacterial lineages. Such BMC genes in Planctomycetes are involved in the degradation of plant and algal cell wall sugars, and are required for growth on L-fucose, L-rhamnose and fucoidans (Erbilgin et al 2014). Genes involved in L-fucose and L-rhamnose degradation cluster with BMC shell protein-coding genes in the seven Verrucomicrobia MAGs (Figure 5a). This is consistent with the high abundance of α-L-fucosidase or α-L-rhamnosidase GH genes in these MAGs (Figure 4), suggesting the importance of fucose- and rhamnose-containing polysaccharides for these Verrucomicrobia populations.
TonB-dependent receptor (TBDR) genes were found in Verrucomicrobia MAGs, and are present at over 20 copies in TE1800 and TH2519. TBDRs are located on the outer cellular membrane of Gram-negative bacteria, usually mediating the transport of iron-siderophore complexes and vitamin B12 across the outer membrane through an active process. More recently, TBDRs were suggested to be involved in carbohydrate transport across the outer membrane by some bacteria that consume complex carbohydrates, and in their carbohydrate utilization (CUT) loci, TBDR genes usually cluster with genes encoding inner membrane transporters, GHs and regulators for efficient carbohydrate transportation and utilization (Blanvillain et al 2007). Such novel CUT loci are present in TE1800 and TH2519, with TBDR genes clustering with genes encoding inner membrane sugar transporters, monosaccharide utilization enzymes, and GHs involved in the degradation of pectin, xylan, and fucose-containing polymers (Figure 5b). Notably, most GHs in the CUT loci are predicted to be extracellular or outer membrane proteins (Figure 5b), catalyzing extracellular hydrolysis reactions to release mono- and oligo-saccharides, which are transported across the outer membrane by TBDR proteins. Therefore, such CUT loci may allow these verrucomicrobial populations to coordinately and effectively scavenge the hydrolysis products before they diffuse away.
Genes encoding inner membrane carbohydrate transporters are abundant in Verrucomicrobia MAGs (Figure S4). The Embden-Meyerhof pathway for glucose degradation, as well as pathways for degrading a variety of other sugar monomers, including galactose, rhamnose, fucose, xylose, and mannose, were recovered (complete or partly-complete) in most MAGs (Figure 6). As these sugars are abundant carbohydrate monomers in plankton and plant cell walls, the presence of these pathways together with GH genes suggests that these Verrucomicrobia populations may use plankton- and plant-derived saccharides. Machinery for pyruvate degradation to acetyl-CoA and the TCA cycle are also present in most MAGs. These results are largely consistent with their hypothesized role in carbohydrate degradation and previous studies on Verrucomicrobia isolates.
Notably, a large number of genes encoding proteins belonging to a sulfatase family (pfam00884) are present in the majority of MAGs (Figure 3b), similar to the high representation of these genes in marine Verrucomicrobia genomes (Herlemann et al 2013, Martinez-Garcia et al 2012a). Sulfatases hydrolyze sulfate esters, which are abundant in sulfated polysaccharides. In general, sulfated polysaccharides are abundant in marine algae and plants (mainly in seaweeds) (Jiao et al 2011), but have also been found in some freshwater cyanobacteria (Filali Mouhim et al 1993) and plant species (Dantas-Santos et al 2012). Sulfatase genes in our Verrucomicrobia MAGs were often located in the same neighborhood as genes encoding extracellular proteins with a putative pectin lyase activity, proteins with a carbohydrate-binding module (pfam13385), GHs, and proteins with PSCyt domains (Figure 3c and discussed later). Their genome context lends support for the participation of these genes in C and sulfur cycling by degrading sulfated polysaccharides, which can serve as an abundant source of sulfur for cell biosynthesis as well as C for energy and growth.
Overall, the high abundance of GH, sulfatase, and carbohydrate transporter genes, metabolic pathways for degrading diverse carbohydrate monomers, and other genome features adapted to the saccharolytic life style suggest that Verrucomicrobia are primarily (poly)saccharide-degraders in freshwater, rather than degraders of the algal exudate glycolate as previously suggested (Paver and Kent 2010) (See details of glycolate utilization by Verrucomicrobia MAGs in Supplementary Text and Figure S5).
Nitrogen (N) metabolism and adaptation to different N availabilities
Most Verrucomicrobia MAGs in our study do not appear to reduce nitrate or other nitrogenous compounds, and they seem to take up and use ammonia (Figure 6), and occasionally amino acids (Figure S4), as an N-source. Further, some Trout Bog populations may have additional avenues to generate ammonia, including genetic machineries for assimilatory nitrate reduction in TH2746, nitrogenase genes for nitrogen fixation and urease genes in some of the Trout Bog MAGs (Figure 6), probably as adaptations to N-limited conditions in Trout Bog.
Although Mendota is a eutrophic lake, N can become temporarily limiting during the high-biomass period when N is consumed by large amounts of phytoplankton and bacterioplankton (Beversdorf et al 2013). For some bacteria, when N is temporarily limited while C is in excess, cells convert and store the extra C as biopolymers. For example, the verrucomicrobial methanotroph M. fumariolicum SolV accumulated a large amount of glycogen (up to 36% of the total dry weight of cells) when the culture was N-limited (Khadem et al 2012). Similar to this verrucomicrobial methanotroph, genes for glycogen biosynthesis are present in most MAGs from Mendota and Trout Bog (Figure 6). Indeed, a glycogen synthesis pathway is also present in most Verrucomicrobia genomes in the public database (data not shown), suggesting that glycogen accumulation might be a common feature for this phylum to cope with the changing pools of C and N in the environment and facilitate their survival when either is temporarily limited.
To cope with long-term sustained N-limitation, such as that which occurs in Trout Bog, some bacteria may have evolved features to minimize N cost in amino acid and nucleotide biosynthesis, thus having reduced proteome N-contents and low genome G+C contents (Bragg and Hyder 2004, Grzymski and Dussaq 2012). Grzymski and Dussaq (2012) used the numbers of N and C atoms per amino-acid residue side chain (ARSC) for each predicted protein to indicate proteome N-contents. Based on ARSC, we found that most, although not all, Trout Bog proteomes have lower N-contents than most Mendota proteomes, with TE1800, TH2519 and TH4093 having extremely low N-contents (Figure S6a, c). Interestingly, proteome C-content is inversely correlated with N-content (correlation coefficient r = −0.83), exhibiting an opposite trend in terms of N and C atom usage between the two lakes (Figure S6b, d, e), probably reflecting the extremely high DOC and low available N in Trout Bog as compared to Mendota. The genome G+C contents ranged from 42% to 68%, and are positively correlated to proteome N-contents (r = 0.84) (Figure S6f), consistent with the expectation that low G+C genomes are favored under long-term N limitation (Bragg and Hyder 2004). Therefore, genome G+C and proteome N contents both suggest that some of the Verrucomicrobia populations in Trout Bog have adapted to long-term N-limitation.
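To make the ARSC metric concrete, the sketch below counts N and C atoms in amino acid side chains and computes a Pearson correlation coefficient. It is an illustration under stated assumptions, not the pipeline used in the study: the side-chain atom counts follow standard amino acid structures, non-standard residue codes are skipped, averaging per-protein values to obtain a proteome-level value is an assumption, and the function names are hypothetical.

```python
# Sketch: N-ARSC and C-ARSC (atoms per amino-acid residue side chain)
# and a Pearson correlation between two lists of values.
import math

# Nitrogen atoms in each side chain (amino acids not listed have none).
N_SIDE = {"R": 3, "H": 2, "K": 1, "N": 1, "Q": 1, "W": 1}

# Carbon atoms in each side chain.
C_SIDE = {
    "A": 1, "R": 4, "N": 2, "D": 2, "C": 1, "Q": 3, "E": 3, "G": 0,
    "H": 4, "I": 4, "L": 4, "K": 4, "M": 3, "F": 7, "P": 3, "S": 1,
    "T": 2, "W": 9, "Y": 7, "V": 3,
}


def arsc(protein, table):
    """Mean number of atoms (N or C, per `table`) per residue side chain."""
    residues = [aa for aa in protein.upper() if aa in C_SIDE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(table.get(aa, 0) for aa in residues) / len(residues)


def proteome_arsc(proteins, table):
    """Mean of per-protein ARSC values (aggregation choice is an assumption)."""
    values = [arsc(p, table) for p in proteins]
    return sum(values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

For instance, a peptide of one arginine and one lysine has an N-ARSC of (3 + 1) / 2 = 2.0, whereas glycine and alanine contribute no side-chain N at all.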
Phosphorus (P) metabolism and other metabolic features
Verrucomicrobia populations represented by these MAGs may be able to survive under low P conditions, as suggested by the presence of genes responding to P limitation, such as the two-component regulator (phoRB), alkaline phosphatase (phoA), phosphonoacetate hydrolase (phnA), and high-affinity phosphate-specific transporter system (pstABC) (Figure 6). P acquisition and metabolism and other metabolic aspects, such as acetate metabolism, sulfur metabolism, oxygen tolerance, and the presence of the alternative complex III and cytochrome c oxidase genes in the oxidative phosphorylation pathway, are discussed in detail in the Supplementary Text (Figure S7).
Anaerobic respiration and a putative porin-multiheme cytochrome c system
Respiration using alternative electron acceptors is important for overall lake metabolism in the DOC-rich humic Trout Bog, as the oxygen levels decrease quickly with depth in the water column. We therefore searched for genes involved in anaerobic respiration, and found that genes for the dissimilatory reduction of nitrate, nitrite, sulfate, sulfite, DMSO, and TMAO are largely absent in all MAGs (Supplementary Text, Figure S7). Compared to those anaerobic processes, genes for dissimilatory metal reduction are less well understood. In more extensively studied cultured iron [Fe(III)] reducers, outer surface c-type cytochromes (cytc), such as OmcE and OmcS in Geobacter sulfurreducens, are involved in Fe(III) reduction at the cell outer surface (Mehta et al 2005). Further, a periplasmic multiheme cytochrome c (MHC, e.g. MtrA in Shewanella oneidensis and OmaB/OmaC in G. sulfurreducens) can be embedded into a porin (e.g. MtrB in S. oneidensis and OmbB/OmbC in G. sulfurreducens), forming a porin-MHC complex as an extracellular electron transfer (EET) conduit to reduce extracellular Fe(III) (Liu et al 2014, Shi et al 2014). Such outer surface cytc and porin-MHC systems involved in Fe(III) reduction were also suggested to be important in reducing the quinone groups in humic substances (HS) at the cell surface (Bucking et al 2012, Shyu et al 2002, Voordeckers et al 2010). The reduced HS can be re-oxidized by Fe(III) or oxygen; thus, HS can serve as electron shuttles to facilitate Fe(III) reduction (Lovley et al 1996, Lovley and Blunt-Harris 1999) or as regenerable electron acceptors at the anoxic-oxic interface or over redox cycles (Klupfel et al 2014).
Outer surface cytc or porin-MHC systems homologous to the ones in G. sulfurreducens and S. oneidensis are not present in Verrucomicrobia MAGs. Instead, we identified a novel porin-coding gene clustering with MHC genes in six MAGs (Figure 5c). These porins were predicted to have at least 20 transmembrane motifs, and their adjacent cytc were predicted to be periplasmic proteins with eight conserved heme-binding sites. In several cases, a gene encoding an extracellular MHC is also located in the same gene cluster. As their gene organization is analogous to the porin-MHC gene clusters in G. sulfurreducens and S. oneidensis, we hypothesize that these genes in Verrucomicrobia may encode a novel porin-MHC complex involved in EET.
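Heme-binding sites in c-type cytochromes are commonly recognized by the canonical CXXCH sequence motif. The minimal Python sketch below counts such motifs in a protein sequence as a rough screen for candidate MHCs. It is an illustration only: the text above does not state which tool was used to predict the heme-binding sites, the function names and the `min_hemes` threshold are hypothetical, and non-canonical heme-binding motifs are not considered.

```python
import re

# Canonical c-type heme-binding motif: Cys, any two residues, Cys, His.
# The lookahead makes the search report overlapping occurrences as well.
HEME_MOTIF = re.compile(r"(?=C..CH)")


def count_heme_motifs(protein):
    """Count CXXCH motifs (overlapping matches included) in a protein."""
    return len(HEME_MOTIF.findall(protein.upper()))


def is_candidate_mhc(protein, min_hemes=2):
    """Flag a protein as a candidate multiheme cytochrome c.

    min_hemes is a hypothetical cutoff chosen for this sketch.
    """
    return count_heme_motifs(protein) >= min_hemes
```

A sequence passing such a screen would still need localization prediction and inspection of its genomic context (for example, an adjacent porin-coding gene) before being considered part of a putative porin-MHC cluster.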
As these porin-MHC gene clusters are novel, we further confirmed that they are indeed from Verrucomicrobia. The contigs containing them were classified as Verrucomicrobia based on the consensus of the best BLASTP hits for genes on these contigs. Notably, the porin-MHC gene cluster was only observed in MAGs recovered from the HS-rich Trout Bog, especially from the anoxic hypolimnion environment. Searching the NCBI and IMG databases for porin-MHC gene clusters homologous to those in Trout Bog, we identified homologs in genomes within the Verrucomicrobia phylum, including Opitutus terrae PB90-1 isolated from rice paddy soil, Opitutus sp. GAS368 isolated from forest soil, “Candidatus Udaeobacter copiosus” recovered from prairie soil, Opititae-40 and Opititae-129 recovered from freshwater sediment, and Verrucomicrobia bacterium IMCC26134 recovered from freshwater; some of these source environments are also rich in HS. Therefore, based on the occurrence pattern of porin-MHC among Verrucomicrobia genomes, we hypothesize that such porin-MHCs might participate in EET to HS in anoxic HS-rich environments, and that HS may further shuttle electrons to poorly soluble metal oxides or be regenerated at the anoxic-oxic interface, thereby diverting more C flux to respiration instead of fermentation and methanogenesis, which could impact the overall energy metabolism and greenhouse gas emission in the bog environment.
Occurrence of Planctomycete-specific cytochrome c and domains
One of the interesting features of Verrucomicrobia and its sister phyla in the PVC superphylum is the presence of a number of novel protein domains in some of their member genomes (Kamneva et al 2012, Studholme et al 2004). These domains were initially identified in the marine planctomycete Rhodopirellula baltica (Studholme et al 2004) and were therefore referred to as “Planctomycete-specific”, although some of them were later identified in other PVC members (Kamneva et al 2012). In our Verrucomicrobia MAGs, most genes containing Planctomycete-specific cytochrome c domains (PSCyt1 to PSCyt3) also contain other Planctomycete-specific domains (PSD1 through PSD5) with various combinations and arrangements (Figure 7 and S8a). Further, PSCyt2-containing and PSCyt3-containing genes are usually next to two different families of unknown genes, respectively (Figure S8b). Such conserved domain architectures and gene organizations, as well as their high occurrence frequencies in some of the Verrucomicrobia MAGs, are intriguing, yet nothing is known about their functions. However, some of the PSCyt-containing genes also contain protein domains identifiable as carbohydrate-binding modules (CBMs), suggesting a role in carbohydrate metabolism (see detailed discussion in Supplementary Text).
The coding density of PSCyt-containing genes indicates that they tend to be more abundant in the epilimnion (either ME or TE) genomes (Figure 3c) and exhibit an inverse correlation with the GH coding density (r = −0.62). Interestingly, sulfatase-coding genes are often in the neighborhood of PSCyt-containing genes in ME and TE genomes, whereas sulfatase-coding genes often neighbor GH genes in TH genomes. The genomic context suggests that PSCyt-containing gene functions somewhat mirror those of GHs (although their reaction mechanisms likely differ fundamentally). However, the products of these PSCyt-containing genes were predicted to be periplasmic or cytoplasmic proteins rather than extracellular or outer membrane proteins. Hence, if they are indeed involved in carbohydrate degradation, they likely act on mono- or oligomers that can be transported into the cell. Further, the distribution patterns of GH versus PSCyt-containing genes between the epilimnion and hypolimnion may reflect the differences in oxygen availability and carbohydrate substrate complexity between the two layers, suggesting some niche differentiation within Verrucomicrobia in freshwater systems. Therefore, a plausible hypothesis is that these bacterial populations have evolved different genetic machineries to use carbohydrates of different complexities under different oxygen and C availabilities.
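The correlation coefficients reported in this section can be reproduced from per-genome values with a standard Pearson calculation. The minimal Python sketch below is an illustration under assumptions: the text does not state which correlation statistic or software was used, and the function name `pearson_r` is hypothetical. Each input list would hold one value per MAG (for example, PSCyt and GH coding densities), in the same MAG order.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two paired observations are required")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

A value near −1 indicates that genomes with a high density of one gene class tend to have a low density of the other, which is the pattern described above for PSCyt-containing and GH genes.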
SUMMARY
Verrucomicrobia MAGs recovered from the two contrasting lakes greatly expanded the known genomic diversity of freshwater Verrucomicrobia and revealed the ecophysiology and some interesting adaptive features of this ubiquitous yet less understood freshwater lineage. The overrepresentation of GH, sulfatase, and carbohydrate transporter genes, the genetic potential to use various sugars, and the microcompartments for fucose and rhamnose degradation suggest that they are primarily (poly)saccharide degraders in freshwater. Most of the MAGs encode machineries to cope with the changing availability of N and P and can survive nutrient limitation. Despite these generalities, these Verrucomicrobia differ significantly between lakes in the abundance and functional profiles of their GH genes, which may reflect the different C sources of the two lakes. Further, several Trout Bog MAGs exhibit N cost minimization in their proteomes and genomes, likely an adaptation to long-term sustained N limitation. Interestingly, a number of MAGs in Trout Bog possess gene clusters potentially encoding a novel porin-multiheme cytochrome c complex, which might be involved in extracellular electron transfer in the anoxic humic-rich environment. Intriguingly, large numbers of Planctomycete-specific cytochrome c-containing genes are present in MAGs from the epilimnion, exhibiting nearly opposite distribution patterns to those of GH genes. Future studies are needed to elucidate the functions of these novel and fascinating genomic features.
CONFLICT OF INTEREST
The authors declare no conflict of interest.
ACKNOWLEDGEMENTS
We thank the North Temperate Lakes Microbial Observatory 2007-2012 field crews, UW-Trout Lake Station, the UW Center for Limnology, and the Global Lakes Ecological Observatory Network for field and logistical support. We give special thanks to past McMahon lab graduate students Ashley Shade, Ryan Newton, Emily Read, and Lucas Beversdorf. We acknowledge efforts by many McMahon Lab undergrads and technicians related to sample collection and DNA extraction, particularly Georgia Wolfe. KDM acknowledges funding from the United States National Science Foundation Microbial Observatories program (MCB-0702395), the Long Term Ecological Research program (NTL-LTER DEB-1440297) and an INSPIRE award (DEB-1344254). This material is also based upon work that is supported by the National Institute of Food and Agriculture, U.S. Department of Agriculture (Hatch Project 1002996). Finally, we personally thank the individual program directors and leadership at the National Science Foundation for their commitment to continued support of long-term ecological research.