Publication

Dense sampling of bird diversity increases power of comparative genomics.

Nature | 2020

Whole-genome sequencing projects are increasingly populating the tree of life and characterizing biodiversity [1-4]. Sparse taxon sampling has previously been proposed to confound phylogenetic inference [5], and captures only a fraction of the genomic diversity. Here we report a substantial step towards the dense representation of avian phylogenetic and molecular diversity, by analysing 363 genomes from 92.4% of bird families, including 267 newly sequenced genomes produced for phase II of the Bird 10,000 Genomes (B10K) Project. We use this comparative genome dataset in combination with a pipeline that leverages a reference-free whole-genome alignment to identify orthologous regions in greater numbers than has previously been possible and to recognize genomic novelties in particular bird lineages. The densely sampled alignment provides a single-base-pair map of selection, has more than doubled the fraction of bases that are confidently predicted to be under conservation and reveals extensive patterns of weak selection in predominantly non-coding DNA. Our results demonstrate that increasing the diversity of genomes used in comparative studies can reveal more shared and lineage-specific variation, and improve the investigation of genomic characteristics. We anticipate that this genomic resource will offer new perspectives on evolutionary processes in cross-species comparative analyses and assist in efforts to conserve species.

PubMed ID: 33177665

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

Agency: NHLBI NIH HHS, United States; Id: U01 HL137183
Agency: Howard Hughes Medical Institute, United States
Agency: NHGRI NIH HHS, United States; Id: T32 HG008345
Agency: NHGRI NIH HHS, United States; Id: R01 HG010053
Agency: NHGRI NIH HHS, United States; Id: R01 HG010485
Agency: NHGRI NIH HHS, United States; Id: U54 HG007990
Agency: NIGMS NIH HHS, United States; Id: R35 GM133412

Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

RefSeq (tool)

RRID:SCR_003496

Collection of curated, non-redundant genomic DNA, transcript RNA, and protein sequences produced by NCBI. Provides a reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. Accessed through the Nucleotide and Protein databases.


NONCODE (tool)

RRID:SCR_007822

Collection of non-coding RNAs (excluding tRNAs and rRNAs) as an integrated knowledge database. Used to retrieve text information such as class, name, location, related publication, and the mechanism through which an ncRNA exerts its function, and to view figures showing its location in the genome or in a specific DNA fragment and the regulatory elements flanking the ncRNA gene sequences.


ALLPATHS-LG (tool)

RRID:SCR_010742

Whole-genome shotgun assembler that can generate high-quality genome assemblies from short reads (~100 bp), such as those produced by the new generation of sequencers.


SOAPdenovo (tool)

RRID:SCR_010752

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 24, 2023. Software tool for de novo assembly of human genomes with massively parallel short-read sequencing. Short-read assembly method that can build a de novo draft assembly for human-sized genomes. Software package for assembling short oligonucleotides into contigs and scaffolds.


RepeatMasker (tool)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines, including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).


BUSCO (tool)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.


RepeatModeler (tool)

RRID:SCR_015027

Sequence analysis software that performs repeat family identification and creates models for sequence data. RepeatModeler utilizes RepeatScout and RECON to identify repeat element boundaries and family relationships.


GeneWise (tool)

RRID:SCR_015054

Gene alignment tool from the EBI which predicts gene structure using similar protein sequences. See also the associated GenomeWise tool.


Conservation (tool)

RRID:SCR_016064

Software for scoring protein sequence conservation using the Jensen-Shannon divergence. It can be used to predict catalytic sites and residues near bound ligands.


ExaML (tool)

RRID:SCR_016087

Source code for large-scale phylogenetic analyses on whole-transcriptome and whole-genome alignments using supercomputers.


UniProt (tool)

RRID:SCR_002380

Collection of protein sequence and functional information. Resource for protein sequence and annotation data. Consortium for preservation of the UniProt databases: UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef), UniProt Archive (UniParc), and UniProt Proteomes. Collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics and the Protein Information Resource. Swiss-Prot is a curated subset of UniProtKB.


UCSC Genome Browser (tool)

RRID:SCR_005780

Portal to interactively visualize genomic data. Provides reference sequences and working draft assemblies for a collection of genomes, and access to the ENCODE and Neanderthal projects. Includes a collection of vertebrate and model organism assemblies and annotations, along with a suite of tools for viewing, analyzing and downloading data.


NCBI (tool)

RRID:SCR_006472

A portal to biomedical and genomic information. NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information for the better understanding of molecular processes affecting human health and disease.


