Diploid genomic architecture of Nitzschia inconspicua, an elite biomass production diatom.

Scientific reports | 2021

A near-complete diploid nuclear genome and accompanying circular mitochondrial and chloroplast genomes have been assembled from the elite commercial diatom species Nitzschia inconspicua. The 50 Mbp haploid size of the nuclear genome is nearly double that of model diatom Phaeodactylum tricornutum, but 30% smaller than closer relative Fragilariopsis cylindrus. Diploid assembly, which was facilitated by low levels of allelic heterozygosity (2.7%), included 14 candidate chromosome pairs composed of long, syntenic contigs, covering 93% of the total assembly. Telomeric ends were capped with an unusual 12-mer, G-rich, degenerate repeat sequence. Predicted proteins were highly enriched in strain-specific marker domains associated with cell-surface adhesion, biofilm formation, and raphe system gliding motility. Expanded species-specific families of carbonic anhydrases suggest potential enhancement of carbon concentration efficiency, and duplicated glycolysis and fatty acid synthesis pathways across cytosolic and organellar compartments may enhance peak metabolic output, contributing to competitive success over other organisms in mixed cultures. The N. inconspicua genome delivers a robust new reference for future functional and transcriptomic studies to illuminate the physiology of benthic pennate diatoms and harness their unique adaptations to support commercial algae biomass and bioproduct production.

PubMed ID: 34341414

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


BLASTN (tool)

RRID:SCR_001598

Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.

BLASTX (tool)

RRID:SCR_001653

Web application to search protein databases using a translated nucleotide query. Translated BLAST services are useful when trying to find homologous proteins to a nucleotide coding region. Blastx compares translational products of the nucleotide query sequence to a protein database. Because blastx translates the query sequence in all six reading frames and provides combined significance statistics for hits to different frames, it is particularly useful when the reading frame of the query sequence is unknown or it contains errors that may lead to frame shifts or other coding errors. Thus blastx is often the first analysis performed with a newly determined nucleotide sequence and is used extensively in analyzing EST sequences. This search is more sensitive than nucleotide blast since the comparison is performed at the protein level.
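The six-frame translation described above can be illustrated with a minimal Python sketch. It is not BLASTX's implementation: it assumes the standard genetic code, and the names `six_frame_translations`, `translate`, and `reverse_complement` are hypothetical helpers introduced here for illustration. It only shows how one nucleotide query yields six candidate protein sequences, three per strand.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering (assumption:
# the query uses the standard code; other translation tables are not covered).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to the translated protein strings."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translations("ATGGCCTAA").items():
        print(label, protein)
```

A frame shift caused by a sequencing error moves the true coding sequence from one of these frames to another, which is why comparing all six against a protein database remains informative when the reading frame is unknown.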

GENEWIZ (tool)

RRID:SCR_003177

Commercial organization for research and development genomics services and technical support to researchers.

Jellyfish (tool)

RRID:SCR_005491

A software tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish can count k-mers quickly by using an efficient encoding of a hash table and by exploiting the compare-and-swap CPU instruction to increase parallelism. Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the jellyfish dump command.
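To make the definition of k-mer counting concrete, here is a minimal single-threaded Python sketch. It is not how Jellyfish works internally (it uses an ordinary `collections.Counter` rather than a compact, lock-free hash table), and the function name `count_kmers` is a hypothetical helper chosen for illustration. Windows containing characters other than A, C, G, or T are skipped, which is an assumption of this sketch.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(seq, k):
    """Count every length-k substring of seq made up only of A, C, G, T."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    for kmer, n in sorted(count_kmers("ACGTACGT", 3).items()):
        print(kmer, n)
```

A sequence of length n contains at most n - k + 1 k-mers, so the eight-base example above yields six 3-mers in total.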

MUSCLE (tool)

RRID:SCR_011812

Multiple sequence alignment method with high accuracy, high throughput, and reduced time and space complexity. Data analysis service for multiple sequence comparison by log-expectation.

Trimmomatic (tool)

RRID:SCR_011848

Java software pipeline for trimming Illumina paired-end and single-end data. A flexible, pair-aware preprocessing tool optimized for Illumina next-generation sequencing data. Includes several processing steps for read trimming and filtering. Operating systems: Unix/Linux, Mac OS, Windows.

KEGG (tool)

RRID:SCR_012773

Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

Trinity (tool)

RRID:SCR_013048

Software for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.

BUSCO (tool)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.

SignalP (tool)

RRID:SCR_015644

Web application for prediction of the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks.

Canu (tool)

RRID:SCR_015880

Software for scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Canu is a fork of the Celera Assembler and is designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

OrthoVenn2 (tool)

RRID:SCR_022504

Web server for whole-genome comparison and annotation of orthologous clusters across multiple species. Works on any operating system with a modern browser and JavaScript enabled. Used to identify orthologous gene clusters; supports user-defined species, for which customized protein sequences can be uploaded. Interactive graphical tool that provides a Venn diagram view for comparing the protein sequences of multiple species.
