The deep sea is a massive, largely oligotrophic ecosystem, stretched over nearly 65% of the planet's surface. Deep-sea planktonic communities are almost completely dependent upon organic carbon sinking from the productive surface, forming a vital component of global biogeochemical cycles. However, despite their importance, viruses from the deep ocean remain largely unknown. Here, we describe the first complete genomes of deep-sea viruses assembled from metagenomic fosmid libraries. "Candidatus Pelagibacter" (SAR11) phage HTVC010P and Puniceispirillum phage HMO-2011 are considered the most abundant cultured marine viruses known to date. Remarkably, some of the viruses described here recruited as many reads from deep waters as these viruses do in the photic zone, and, considering the gigantic scale of the bathypelagic habitat, these genomes provide information about what could be some of the most abundant viruses in the world at large. Their role in the viral shunt in the global ocean could be very significant. Despite the challenges encountered in inferring the identity of their hosts, we identified one virus predicted to infect members of the globally distributed SAR11 cluster. We also identified a number of putative proviruses from diverse taxa, including deltaproteobacteria, bacteroidetes, SAR11, and gammaproteobacteria. Moreover, our findings also indicate that lysogeny is the preferred mode of existence for deep-sea viruses inhabiting an energy-limited environment, in sharp contrast to the predominantly lytic lifestyle of their photic-zone counterparts. Some of the viruses show a widespread distribution, supporting the tenet "everything is everywhere" for the deep-ocean virome.
PubMed ID: 27460793
Database of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record gives users a single place to find links to the diverse data types generated for that project. It is a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. Submissions are supported by a web-based Submission Portal. The database facilitates the organization and classification of project data submitted to NCBI, EBI, and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives, and serves as a central portal for informing users of data availability. BioProject records link to the corresponding data stored in archival repositories. The BioProject resource is a redesigned, expanded replacement of the NCBI Genome Project resource. The redesign adds tracking of several data elements, including more precise information about a project's scope, material, and objectives. Genome Project identifiers are retained in BioProject as the ID value for a record, and an accession number has been added. Database content is exchanged with other members of the International Nucleotide Sequence Database Collaboration (INSDC). BioProject is accessible via FTP.
Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.
A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. Pfam has two components: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. A supplement to them may be generated automatically using the ADDA database; these automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries that are related by similarity of sequence, structure, or profile HMM).
A free package of software programs for inferring phylogenies (evolutionary trees). The source code is distributed in C, and precompiled executables are available for Windows (95/98/NT/2000/Me/XP/Vista), Mac OS X, and Linux systems. Older executables are also available for Mac OS 8 or 9 systems.
Multiple alignment program for amino acid or nucleotide sequences. It can align up to 500 sequences or a maximum file size of 1 MB. The first version of MAFFT used an algorithm based on progressive alignment, in which sequences were clustered with the help of the Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher-accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.
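The role of the Fast Fourier Transform mentioned in the MAFFT description can be illustrated with a small sketch. It is not MAFFT's code: the function names `fft`, `correlation_by_lag`, and `best_lag` are hypothetical, and the four-channel A/C/G/T indicator encoding is a simplifying assumption made for this example. The idea shown is that the number of matching positions between two nucleotide sequences at every relative offset can be computed at once through the cross-correlation theorem, rather than by sliding one sequence along the other.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT. len(values) must be a power of two.

    With invert=True the unscaled inverse transform is returned;
    the caller divides by the length.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_by_lag(seq_a, seq_b):
    """Return {lag: matches}, counting positions i with seq_a[i + lag] == seq_b[i]."""
    # Zero-pad to a power of two large enough to avoid circular wrap-around.
    size = 1
    while size < len(seq_a) + len(seq_b):
        size *= 2
    total = [0.0] * size
    for base in "ACGT":  # assumed encoding: one 0/1 indicator channel per base
        a = [1.0 if ch == base else 0.0 for ch in seq_a] + [0.0] * (size - len(seq_a))
        b = [1.0 if ch == base else 0.0 for ch in seq_b] + [0.0] * (size - len(seq_b))
        fa, fb = fft(a), fft(b)
        # Cross-correlation theorem: corr = IFFT(FFT(a) * conj(FFT(b)))
        product = [x * y.conjugate() for x, y in zip(fa, fb)]
        for i, value in enumerate(fft(product, invert=True)):
            total[i] += value.real / size
    # Negative lags are stored at the end of the circular result.
    return {
        lag: round(total[lag % size])
        for lag in range(-(len(seq_b) - 1), len(seq_a))
    }


def best_lag(seq_a, seq_b):
    """Return the lag with the highest number of matching positions."""
    scores = correlation_by_lag(seq_a, seq_b)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # The second sequence occurs inside the first, starting at index 4.
    print(best_lag("TTTTACGTACGGA", "ACGTACGG"))
```

A peak in the correlation marks an offset at which the two sequences are likely to share a homologous region. A real aligner would still need to locate and score the matching segments at that offset.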
A web-based tool used to search translated nucleotide databases using a translated nucleotide query.