Chromosome-scale Genome Assembly of the Yellow Nutsedge (Cyperus esculentus).

Genome biology and evolution | 2023

The yellow nutsedge (Cyperus esculentus L. 1753) is an unconventional oil plant with oil-rich tubers, and a potential alternative for traditional oil crops. Here, we report the first high-quality and chromosome-level genome assembly of the yellow nutsedge generated by combining PacBio HiFi long reads, NovaSeq short reads, and Hi-C data. The final genome size is 225.6 Mb with an N50 of 4.3 Mb. More than 222.9 Mb scaffolds were anchored to 54 pseudochromosomes with a BUSCO score of 96.0%. We identified 76.5 Mb (33.9%) repetitive sequences across the genome. A total of 23,613 protein-coding genes were predicted in this genome, of which 22,847 (96.8%) were functionally annotated. A whole-genome duplication event was found after the divergence of Carex littledalei and Rhynchospora breviuscula, indicating the rich genetic resources of this species for adaptive evolution. Several significantly enriched GO terms were related to the invasiveness of the yellow nutsedge, which may explain its plastic adaptability. In addition, several enriched Kyoto Encyclopedia of Genes and Genomes pathways and expanded gene families were closely related to substances in tubers, partially explaining the genomic basis of characteristics of this oil-rich tuber.

PubMed ID: 36807517

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


BLASTN (tool)

RRID:SCR_001598

Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.

PRINTS (tool)

RRID:SCR_003412

Compendium of protein fingerprints. Diagnostic fingerprint database.

Pfam (tool)

RRID:SCR_004726

A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high quality, manually curated families that may automatically generate a supplement using the ADDA database. These automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries which are related by similarity of sequence, structure or profile-HMM).

MAKER (tool)

RRID:SCR_005309

Portable and easily configurable genome annotation pipeline. It allows smaller eukaryotic and prokaryotic genome projects to independently annotate their genomes and to create genome databases. MAKER identifies repeats, aligns ESTs and proteins to the genome, produces ab initio gene predictions, and automatically synthesizes these data into gene annotations with evidence-based quality values.

ProDom (tool)

RRID:SCR_006969

Comprehensive set of protein domain families automatically generated from UniProt Knowledge Database. Automated clustering of homologous domains generated from global comparison of all available protein sequences.

SNAP (tool)

RRID:SCR_007936

A sequence analysis tool providing a simple but detailed analysis of human genes and their variations. For each gene, a gene-gene relationship network can be generated based on protein-protein interaction data and metabolic pathway connections, and extended through phylogenetic relations. SNAP provides tools for designing sequence primers and evaluating the RNA splicing effects of single SNPs, whether known from the databases or defined by the user. Primers can be designed for the amplification or sequencing of cDNA, genomic DNA, introns only, or exons only.

Augustus (tool)

RRID:SCR_008417

Software for gene prediction in eukaryotic genomic sequences. Serves as a basis for further steps in the analysis of sequenced and assembled eukaryotic genomes.

GMAP (tool)

RRID:SCR_008992

THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. A software program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.

MAFFT (tool)

RRID:SCR_011811

Software package for multiple alignment of amino acid or nucleotide sequences. Can align up to 500 sequences or a maximum file size of 1 MB. The first version of MAFFT used an algorithm based on progressive alignment, in which sequences were clustered with the help of the Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.

TBLASTN (tool)

RRID:SCR_011822

Tool to search translated nucleotide databases using a protein query.

KEGG (tool)

RRID:SCR_012773

Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

RepeatMasker (tool)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).

PAML (tool)

RRID:SCR_014932

Package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. PAML estimates parameters and tests hypotheses to study the evolutionary process from a phylogenetic tree.

BUSCO (tool)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.

GeneWise (tool)

RRID:SCR_015054

Gene alignment tool from the EBI which predicts gene structure using similar protein sequences. See also the associated GenomeWise tool.

Semi-Manual Alignment to Reference Templates (tool)

RRID:SCR_019265

Software tool that extends WholeBrain framework in R for segmenting and registering experimental images to Allen Mouse Common Coordinate Framework (CCF). Streamlines processing of large volumetric LSFM datasets and solves issues with non-uniform morphing across anterior-posterior axis with interactive “choice game.” Accounts for duplicate cell counts in adjacent z images and presents new ways to easily parse apart and interactively visualize final mapped datasets.
