Molecular fossils illuminate the evolution of retroviruses following a macroevolutionary transition from land to water.

PLoS pathogens | 2021

The ancestor of cetaceans underwent a macroevolutionary transition from land to water early in the Eocene Period >50 million years ago. However, little is known about how diverse retroviruses evolved during this shift from terrestrial to aquatic environments. Did retroviruses transition into water accompanying their hosts? Did retroviruses infect cetaceans through cross-species transmission after cetaceans invaded the aquatic environments? Endogenous retroviruses (ERVs) provide important molecular fossils for tracing the evolution of retroviruses during this macroevolutionary transition. Here, we use a phylogenomic approach to study the origin and evolution of ERVs in cetaceans. We identify a total of 8,724 ERVs within the genomes of 25 cetaceans, and phylogenetic analyses suggest these ERVs cluster into 315 independent lineages, each of which represents one or more independent endogenization events. We find that cetacean ERVs originated through two possible routes. 298 ERV lineages may derive from retrovirus endogenization that occurred before or during the transition from land to water of cetaceans, and most of these cetacean ERVs were reaching evolutionary dead-ends. 17 ERV lineages are likely to arise from independent retrovirus endogenization events that occurred after the split of mysticetes and odontocetes, indicating that diverse retroviruses infected cetaceans through cross-species transmission from non-cetacean mammals after the transition to aquatic life of cetaceans. Both integration time and synteny analyses support the recent or ongoing activity of multiple retroviral lineages in cetaceans, some of which proliferated into hundreds of copies within the host genomes. Although ERVs only recorded a proportion of past retroviral infections, our findings illuminate the complex evolution of retroviruses during one of the most marked macroevolutionary transitions in vertebrate history.

PubMed ID: 34252162

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Open Reading Frame Finder (tool)

RRID:SCR_016643

Software tool that searches a DNA sequence for open reading frames (ORFs). The program returns the range of each ORF, along with its protein translation. Used to search newly sequenced DNA for potential protein-encoding segments and to verify predicted proteins. Limited to a subrange of the query sequence up to 50 kb long.
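To make the description above concrete, here is a minimal sketch of an ORF scan in Python. It is not the Open Reading Frame Finder implementation: the function name `find_orfs`, the `min_codons` parameter, the restriction to the forward strand, and the use of ATG as the only start codon are all assumptions made for illustration. It returns the 1-based range of each ORF (stop codon included) together with its protein translation under the standard genetic code.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_orfs(dna, min_codons=2):
    """Scan the three forward frames for ATG...stop ORFs (illustrative only).

    Returns a list of (start, end, protein) tuples, where start and end are
    1-based inclusive coordinates and end includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG of the ORF currently open
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                n_codons = (i - start) // 3  # coding codons, stop excluded
                if n_codons >= min_codons:
                    protein = "".join(
                        CODON_TABLE.get(dna[j:j + 3], "X")  # X = ambiguous codon
                        for j in range(start, i, 3)
                    )
                    orfs.append((start + 1, i + 3, protein))
                start = None
    return orfs
```

For example, `find_orfs("CCATGGCCTAAGG")` reports a single ORF spanning positions 3 to 11 that translates to `MA`. A fuller version would also scan the reverse complement and allow alternative start codons and genetic codes.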

BLASTN (tool)

RRID:SCR_001598

Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.

MAFFT (tool)

RRID:SCR_011811

Multiple sequence alignment program for amino acid or nucleotide sequences. Can align up to 500 sequences or a maximum file size of 1 MB. The first version of MAFFT used an algorithm based on progressive alignment, in which sequences were clustered with the help of the Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher-accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.

MUSCLE (tool)

RRID:SCR_011812

Multiple sequence alignment method with high accuracy, high throughput, and reduced time and space complexity. Data analysis service for multiple sequence comparison by log-expectation.

TBLASTN (tool)

RRID:SCR_011822

Tool to search translated nucleotide databases using a protein query.

PAML (tool)

RRID:SCR_014932

Package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. PAML estimates parameters and tests hypotheses to study the evolutionary process from a phylogenetic tree.

FastTree (tool)

RRID:SCR_015501

Source code that infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. It uses the Jukes-Cantor or generalized time-reversible (GTR) models of nucleotide evolution and the JTT, WAG, or LG models of amino acid evolution.
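As a small illustration of the simplest nucleotide model named above, the sketch below computes the Jukes-Cantor corrected distance between two aligned sequences, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. This is only the textbook formula, not FastTree's code; the function name `jukes_cantor_distance` and the choice to skip gap or ambiguous positions are assumptions made for this example.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor distance between two aligned nucleotide sequences.

    Sites where either sequence has a gap or ambiguous base are ignored
    (an assumption of this sketch, not a statement about any tool).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")

    # p is the observed proportion of mismatched sites.
    p = sum(a != b for a, b in pairs) / len(pairs)

    # The correction is undefined once p reaches 3/4 (saturation).
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

With one mismatch in four sites, as in `jukes_cantor_distance("ACGT", "ACGA")`, the observed proportion is 0.25 and the corrected distance is about 0.304 substitutions per site, slightly larger because the model accounts for multiple substitutions at the same site.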

TimeTree (tool)

RRID:SCR_021162

Public knowledge base for information on the evolutionary timescale of life. Data from thousands of published studies are assembled into a searchable tree of life scaled to time.
