Publication

Finding needles in haystacks: linking scientific names, reference specimens and molecular data for Fungi.

Database: the journal of biological databases and curation | 2014

DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing requires fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_-prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353.
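
As a rough illustration of the NR_ accession prefix mentioned in the abstract, the Python sketch below separates NR_-prefixed identifiers from other identifiers in a list of search hits. The helper name `split_by_refseq_prefix` is hypothetical and the accession strings are made-up placeholders, not real records. The NR_ prefix is also used for other RefSeq non-coding RNA records, so this is only a first-pass filter and not proof that a record belongs to the RTL set.

```python
def split_by_refseq_prefix(accessions, prefix="NR_"):
    """Partition accession strings into (prefixed, other) lists, keeping order."""
    prefixed, other = [], []
    for acc in accessions:
        cleaned = acc.strip()  # tolerate stray whitespace from copied reports
        if cleaned.startswith(prefix):
            prefixed.append(cleaned)
        else:
            other.append(cleaned)
    return prefixed, other


# Placeholder identifiers for illustration only (assumed, not real records).
hits = ["NR_000001.1", "AB000002.1", " NR_000003.1", "XY000004.2"]
curated, uncurated = split_by_refseq_prefix(hits)
print(curated)
print(uncurated)
```

Keeping the two groups separate makes it possible to review the curated reference hits first and fall back to the remaining hits only when no reference sequence matches.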

PubMed ID: 24980130

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

Agency: Intramural NIH HHS, United States

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

NCBI Taxonomy (tool)

RRID:SCR_003256

Database for a curated classification and nomenclature that contains the names of all organisms that are represented in the public sequence databases with at least one nucleotide or protein sequence. Data provided encompasses archaea, bacteria, eukaryota, viroids and viruses. The NCBI taxonomy database is not a primary source for taxonomic or phylogenetic information. Furthermore, the database does not follow a single taxonomic treatise but rather attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources, including the published literature, web-based databases, and the advice of sequence submitters and outside taxonomy experts. Consequently, the NCBI taxonomy database is not a phylogenetic or taxonomic authority and should not be cited as such.

European Bioinformatics Institute (tool)

RRID:SCR_004727

Non-profit academic organization for research and services in bioinformatics. Provides freely available data from life science experiments, performs basic research in computational biology, offers a user training programme, and manages databases of biological data, including nucleic acid sequences, protein sequences, and macromolecular structures. Part of EMBL.

NCBI Nucleotide (tool)

RRID:SCR_004860

Database of nucleotide sequences from several sources, including GenBank, RefSeq, TPA and PDB. Genome, gene and transcript sequence data provide the foundation for biomedical research and discovery.

NCBI (tool)

RRID:SCR_006472

A portal to biomedical and genomic information. NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information for the better understanding of molecular processes affecting human health and disease.

RefSeq (tool)

RRID:SCR_003496

Collection of curated, non-redundant genomic DNA, transcript RNA, and protein sequences produced by NCBI. Provides a reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. Accessed through the Nucleotide and Protein databases.

UNITE (tool)

RRID:SCR_006518

A fungal rDNA internal transcribed spacer (ITS) sequence database (although additional genes and genetic markers are also welcome) to facilitate identification of environmental samples of fungal DNA. Additional important features include user annotation of INSD sequences to add metadata on, e.g., locality, habitat, soil, climate, and interacting taxa. Users can also annotate INSD sequences with additional species identifications, which then appear in the results of any analyses. UNITE focuses on high-quality ITS sequences generated from fruiting bodies collected and identified by experts and deposited in public herbaria. In addition, it holds all fungal ITS sequences in the International Nucleotide Sequence Databases (INSD: NCBI, EMBL, DDBJ). Both sets of sequences may be used in any analyses carried out. UNITE is accompanied by a project management system called PlutoF, where users can store field data, document sequencing lab procedures, manage sequences, and run analyses. PlutoF aims to give taxonomists, ecologists, and biogeographers a common platform for data storage, handling, and analysis, with the intent of facilitating integration of these disciplines. A user can have an unlimited number of projects and still run analyses across any project data available to them.

BioEdit (tool)

RRID:SCR_007361

Biological sequence alignment editor and sequence analysis program written for Windows 95/98/NT/2000/XP/7. Provides sequence manipulation and analysis options, and links to external analysis programs, so that sequences can be viewed and manipulated with simple point-and-click operations.

MAFFT (tool)

RRID:SCR_011811

Multiple sequence alignment program for amino acid or nucleotide sequences. Can align up to 500 sequences or a maximum file size of 1 MB. The first version of MAFFT used an algorithm based on progressive alignment, in which sequences were clustered with the help of the Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher-accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.
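
The RRIDs listed for each tool above follow a regular shape, so they can be pulled out of free text with a short script. The Python sketch below is a minimal example under an assumed pattern ("RRID:" followed by a registry prefix, an underscore, and digits, as in RRID:SCR_003256). The function name `extract_rrids` and the sample sentence are hypothetical, and the pattern is not an official RRID grammar.

```python
import re

# Assumed pattern: "RRID:" + letters + "_" + digits, e.g. RRID:SCR_003256.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+)_(\d+)")


def extract_rrids(text):
    """Return normalized RRID strings in the order they appear in the text."""
    return [f"RRID:{prefix}_{number}" for prefix, number in RRID_PATTERN.findall(text)]


# Sample sentence written for illustration, using two RRIDs from the list above.
sample = "NCBI Taxonomy (RRID:SCR_003256) and MAFFT (RRID: SCR_011811) were used."
rrids = extract_rrids(sample)
print(rrids)
```

Normalizing the output to a single form without whitespace makes the identifiers easier to compare or deduplicate across documents.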

