The White-eared Night-Heron (Gorsachius magnificus, G. magnificus) is a critically endangered heron that is very poorly known and only found in southern China and northern Vietnam, with an estimated population of 250 to 999 mature individuals. However, the lack of a reference genome has hindered the implementation of conservation management efforts. In this study, we present the first high-quality chromosome-scale reference genome, which was assembled by integrating PacBio long-reads sequencing, Illumina paired-end sequencing, and Hi-C technology. The genome has a total length of 1.176 Gb, with a scaffold N50 of 84.77 Mb and a contig N50 of 18.46 Mb. Utilizing Hi-C data, we anchored 99.89% of the scaffold sequences onto 29 pairs of chromosomes. Additionally, we identified 18,062 protein-coding genes in the genome, with 95.00% of which were functionally annotated. Notably, BUSCO assessment confirmed the presence of 97.2% of highly conserved Aves genes within the genome. This chromosome-level genome assembly and annotation will be valuable for future investigating the G. magnificus's evolutionary adaptation and conservation.
Pubmed ID: 38228677 RIS Download
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.
View all literature mentionsSoftware tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam ( profile HMM library ) and RepBase ( consensus sequence library ).
View all literature mentionsSoftware for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.
View all literature mentionsSoftware tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.
View all literature mentionsSequence analysis software that performs repeat family identification and creates models for sequence data. RepeatModeler utilizes RepeatScout and RECON to identify repeat element boundaries and family relationships.
View all literature mentions