There is widespread interest in how geochemistry affects the genomic makeup of microbial communities, but the possible impacts of oxidation-reduction (redox) conditions on the chemical composition of biomacromolecules remain largely unexplored. Here we document systematic changes in the carbon oxidation state, a metric derived from the chemical formulas of biomacromolecular sequences, using published metagenomic and metatranscriptomic datasets from 18 studies representing different marine and terrestrial environments. We find that the carbon oxidation states of DNA, as well as proteins inferred from coding sequences, follow geochemical redox gradients associated with mixing and cooling of hot spring fluids in Yellowstone National Park (USA) and submarine hydrothermal fluids. Thermodynamic calculations provide independent predictions for the environmental shaping of the gene and protein composition of microbial communities in these systems. On the other hand, the carbon oxidation state of DNA is negatively correlated with oxygen concentration in marine oxygen minimum zones. In this case, a thermodynamic model is not viable, but the low carbon oxidation state of DNA near the ocean surface reflects a low GC content, which can be attributed to genome reduction in organisms adapted to low-nutrient conditions. We also present evidence for a depth-dependent increase of oxidation state at the species level, which might be associated with alteration of DNA through horizontal gene transfer and/or selective degradation of relatively reduced (AT-rich) extracellular DNA by heterotrophic bacteria. Sediments exhibit even more complex behavior, where carbon oxidation state minimizes near the sulfate-methane transition zone and rises again at depth; markedly higher oxidation states are also associated with older freshwater-dominated sediments in the Baltic Sea that are enriched in iron oxides and have low organic carbon. 
This geobiochemical study of carbon oxidation state reveals a new aspect of environmental information in metagenomic sequences, and provides a reference frame for future studies that may use ancient DNA sequences as a paleoredox indicator.
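The abstract describes carbon oxidation state as a metric derived from chemical formulas but does not give the formula. As a minimal sketch, assuming the commonly used definition of average carbon oxidation state for a species with formula C_c H_h N_n O_o S_s and net charge z, namely (-h + 3n + 2o + 2s + z) / c, the calculation might look like the following. The function name and arguments are hypothetical and are not taken from the study.

```python
def carbon_oxidation_state(c, h, n=0, o=0, s=0, charge=0):
    """Average oxidation state of carbon for a formula C_c H_h N_n O_o S_s.

    Assumed definition (not stated in the abstract):
        Z_C = (-h + 3n + 2o + 2s + charge) / c
    where H counts as +1, N as -3, O as -2 and S as -2.
    """
    if c <= 0:
        raise ValueError("the formula must contain at least one carbon atom")
    return (-h + 3 * n + 2 * o + 2 * s + charge) / c


# Simple reference molecules spanning the full range of carbon oxidation states
methane = carbon_oxidation_state(c=1, h=4)             # CH4, fully reduced
carbon_dioxide = carbon_oxidation_state(c=1, h=0, o=2)  # CO2, fully oxidized
glycine = carbon_oxidation_state(c=2, h=5, n=1, o=2)    # C2H5NO2
alanine = carbon_oxidation_state(c=3, h=7, n=1, o=2)    # C3H7NO2

print(methane, carbon_dioxide, glycine, alanine)
```

For a DNA or protein sequence, the same function could be applied to the summed elemental counts of its residues. Those per-residue formulas are not given in the abstract and would have to be supplied separately.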
PubMed ID: 30804909
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences for almost 280 000 formally described species (as of January 2014). These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt. GenBank is part of the International Nucleotide Sequence Database Collaboration, and daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
An automated analysis platform for metagenomes providing quantitative insights into microbial populations based on sequence data. The server primarily provides upload, quality control, automated annotation and analysis for prokaryotic metagenomic shotgun samples.
A set of software tools (Reaper, Tally and Sequence Imp) designed to streamline the analysis of next-generation sequencing data. Although designed with small RNA sequence analysis in mind, the tools can be used to address issues facing next-generation sequencing in general.
A software application for finding fragmented genes in short reads. It may also be applied to predict prokaryotic genes in incomplete assemblies or complete genomes.