Publication

A high-quality genome assembly highlights rye genomic characteristics and agronomically important genes.

Nature genetics | 2021

Rye is a valuable food and forage crop, an important genetic resource for wheat and triticale improvement and an indispensable material for efficient comparative genomic studies in grasses. Here, we sequenced the genome of Weining rye, an elite Chinese rye variety. The assembled contigs (7.74 Gb) accounted for 98.47% of the estimated genome size (7.86 Gb), with 93.67% of the contigs (7.25 Gb) assigned to seven chromosomes. Repetitive elements constituted 90.31% of the assembled genome. Compared to previously sequenced Triticeae genomes, Daniela, Sumaya and Sumana retrotransposons showed strong expansion in rye. Further analyses of the Weining assembly shed new light on genome-wide gene duplications and their impact on starch biosynthesis genes, physical organization of complex prolamin loci, gene expression features underlying early heading trait and putative domestication-associated chromosomal regions and loci in rye. This genome sequence promises to accelerate genomic and breeding studies in rye and related cereal crops.
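The percentages in the abstract follow from the reported sizes. As a quick check, the short sketch below recomputes them from the rounded Gb values quoted above; the variable names are illustrative only and do not come from the publication.

```python
# Sizes in Gb as quoted in the abstract (rounded values).
assembled_gb = 7.74   # total length of assembled contigs
estimated_gb = 7.86   # estimated genome size
anchored_gb = 7.25    # contigs assigned to the seven chromosomes

# Share of the estimated genome covered by the assembly.
coverage_pct = assembled_gb / estimated_gb * 100
# Share of the assembled contigs anchored to chromosomes.
anchored_pct = anchored_gb / assembled_gb * 100

print(f"assembly / estimated genome: {coverage_pct:.2f}%")
print(f"anchored / assembly:         {anchored_pct:.2f}%")
```

Both values agree with the reported 98.47% and 93.67% to two decimal places, even though the inputs are themselves rounded.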

PubMed ID: 33737755

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

VCFtools (tool)

RRID:SCR_001235

Software package for working with VCF files. It provides easily accessible methods for working with complex genetic variation data in the form of VCF files and implements various utilities for processing Variant Call Format files, including validation, merging and comparing. It also provides a general Perl API.

BLASTN (tool)

RRID:SCR_001598

Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.

SAMTOOLS (tool)

RRID:SCR_002105

The original SAMTOOLS package has been split into three separate repositories: Samtools, BCFtools and HTSlib. Samtools manipulates next-generation sequencing data and is used for reading, writing, editing, indexing and viewing nucleotide alignments in SAM, BAM and CRAM formats. BCFtools is used for reading and writing BCF2, VCF and gVCF files and for calling, filtering and summarising SNP and short indel sequence variants. HTSlib is used for reading and writing high-throughput sequencing data.

SnpEff (tool)

RRID:SCR_005191

Genetic variant annotation and effect prediction software toolbox that annotates and predicts effects of variants on genes (such as amino acid changes). By using standards, such as VCF, SnpEff makes it easy to integrate with other programs.

OrthoMCL DB: Ortholog Groups of Protein Sequences (tool)

RRID:SCR_007839

OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. OrthoMCL starts with reciprocal best hits within each genome as putative in-paralog/recent paralog pairs and reciprocal best hits across any two genomes as putative ortholog pairs. Related proteins are interlinked in a similarity graph. Then MCL (Markov Clustering algorithm, Van Dongen 2000; www.micans.org/mcl) is invoked to split mega-clusters. This process is analogous to the manual review in COG construction. MCL clustering is based on weights between each pair of proteins, so to correct for differences in evolutionary distance the weights are normalized before running MCL.
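The reciprocal-best-hit step described above can be illustrated with a minimal sketch. This is not OrthoMCL's own code or API: the function name, the input layout (nested dictionaries of similarity scores) and the toy protein identifiers are all assumptions made for illustration, and the later weight normalization and MCL clustering steps are not shown.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return pairs (a, b) where b is a's best hit and a is b's best hit.

    scores_ab maps each protein in genome A to {protein in B: score};
    scores_ba is the same in the opposite direction (hypothetical layout).
    """
    # Best-scoring partner for every protein, in each direction.
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    # Keep only pairs that choose each other.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy similarity scores between two genomes (made-up values).
scores_ab = {"a1": {"b1": 90.0, "b2": 40.0}, "a2": {"b1": 70.0, "b2": 65.0}}
scores_ba = {"b1": {"a1": 88.0, "a2": 72.0}, "b2": {"a2": 60.0, "a1": 35.0}}

pairs = reciprocal_best_hits(scores_ab, scores_ba)
print(pairs)
```

In the toy data, a2's best hit is b1, but b1's best hit is a1, so only the pair (a1, b1) is treated as a putative ortholog pair.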

BEAST (tool)

RRID:SCR_010228

A cross-platform software program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. We include a simple-to-use user-interface program for setting up standard analyses and a suite of programs for analysing the results.

MUSCLE (tool)

RRID:SCR_011812

Multiple sequence alignment method with reduced time and space complexity, offering high accuracy and high throughput. Data analysis service for multiple sequence comparison by log-expectation.

cutadapt (tool)

RRID:SCR_011841

Software tool that removes adapter sequences from DNA sequencing reads.

BLAT (tool)

RRID:SCR_011919

Software designed to quickly find sequences of 95% and greater similarity of length 25 bases or more.

KEGG (tool)

RRID:SCR_012773

Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

RepeatMasker (tool)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).

TopHat (tool)

RRID:SCR_013035

Software tool for fast and high-throughput alignment of shotgun cDNA sequencing reads generated by transcriptomics technologies. Fast splice junction mapper for RNA-Seq reads. Aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. TopHat2 provides accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions.

New England Biolabs (tool)

RRID:SCR_013517

An antibody supplier.

Cufflinks (tool)

RRID:SCR_014597

Software tool for transcriptome assembly and differential expression analysis for RNA-Seq. Includes script called cuffmerge that can be used to merge together several Cufflinks assemblies. It also handles running Cuffcompare as well as automatically filtering a number of transfrags that are likely to be artifacts. If the researcher has a reference GTF file, the researcher can provide it to the script to more effectively merge novel isoforms and maximize overall assembly quality.

RepeatScout (tool)

RRID:SCR_014653

Algorithm used to identify de novo repeat families in newly sequenced genomes. Repeat libraries for C. briggsae, M. musculus (X chromosome), R. norvegicus (X chromosome), armadillo, H. sapiens (X chromosome), and various other mammals created using RepeatScout are available on the main site.

PASA (tool)

RRID:SCR_014656

Gene structure annotation and analysis tool that uses spliced alignments of expressed transcript sequences to automatically model gene structures. It also incorporates gene structures based on transcript alignments into existing gene structure annotations. It is one component of a larger eukaryotic annotation pipeline implemented at the Broad Institute.

EVidenceModeler (tool)

RRID:SCR_014659

Software tool for automated eukaryotic gene structure annotation that reports eukaryotic gene structures as a weighted consensus of all available evidence. Used to combine ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. Inputs include genome sequence, gene predictions, and alignment data (in GFF3 format).

Pilon (tool)

RRID:SCR_014731

Software tool to automatically improve draft assemblies and find variation among strains, including large event detection. It takes as input a FASTA file of the genome along with one or more BAM files of aligned reads. Read alignment analysis is used to identify inconsistencies between the input genome and the evidence in the reads, and the tool then attempts to make improvements to the genome.

HISAT2 (tool)

RRID:SCR_015530

Graph-based alignment of next generation sequencing reads to a population of genomes.

funRiceGenes (tool)

RRID:SCR_015778

Dataset of functionally characterized rice genes and members of different gene families. The dataset was created by integrating data from available databases and reviewing publications of rice functional genomic studies.

Canu (tool)

RRID:SCR_015880

Software for scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Canu is a fork of the Celera Assembler and is designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

StringTie (tool)

RRID:SCR_016323

Software application for assembling RNA-Seq alignments into potential transcripts. It enables improved reconstruction of a transcriptome from RNA-Seq reads. This transcript assembly and quantification program is implemented in C++.

iTOL (tool)

RRID:SCR_018174

Web tool for display, annotation and management of phylogenetic trees. Accessible with any modern web browser.

FALCON (tool)

RRID:SCR_018804

Web tool providing a high-throughput server for protein structure prediction.

MCScanX (tool)

RRID:SCR_022067

Software toolkit for detection and evolutionary analysis of gene synteny and collinearity.

About

The SciCrunch Infrastructure was developed as a cooperative data platform to be used by diverse communities in making data more FAIR.

Contact Us

FAIR Data Informatics Lab

University of California, San Diego

9500 Gilman Drive, Mail Code 0608

La Jolla, CA 92093-0608

United States

info@scicrunch.org

