Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula.

BMC genomics | 2017

Third-generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner.

PubMed ID: 28778149

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Scalable Nucleotide Alignment Program (tool)

RRID:SCR_005501

A sequence aligner software program that is 10-100x faster and simultaneously more accurate than existing tools like BWA, Bowtie2 and SOAP2. It runs on commodity x86 processors, and supports a rich error model that lets it cheaply match reads with more differences from the reference than other tools. This gives SNAP up to 2x lower error rates than existing tools and lets it match larger mutations that they may miss. SNAP also natively reads BAM, FASTQ, or gzipped FASTQ, and natively writes SAM or BAM, with built-in sorting, duplicate marking, and BAM indexing.

Amplicon (tool)

RRID:SCR_003294

Software tool for designing PCR primers on aligned groups of DNA sequences. The most important application is the design of "group-specific" PCR primer sets that amplify a DNA region from a given taxonomic group but do not amplify orthologous regions from other taxonomic groups. It is written in Python 2.3 and Tkinter 8.4. The current script was created for Windows and an executable is available. Future versions of the script should be able to run on Linux and Mac.

Pfam (tool)

RRID:SCR_004726

A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. They are supplemented by entries generated automatically using the ADDA database, which are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries that are related by similarity of sequence, structure, or profile HMM).

Hmmer (tool)

RRID:SCR_005305

Tool for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST.

ESTScan (tool)

RRID:SCR_005742

ESTScan is a program that can detect coding regions in DNA sequences, even if they are of low quality. ESTScan will also detect and correct sequencing errors that lead to frameshifts. ESTScan is not a gene prediction program, nor is it an open reading frame detector. In fact, its strength lies in the fact that it does not require an open reading frame to detect a coding region. As a result, the program may miss a few translated amino acids at either the N or the C terminus, but will detect coding regions with high selectivity and sensitivity. ESTScan takes advantage of the bias in hexanucleotide usage found in coding regions relative to non-coding regions. This bias is formalized as an inhomogeneous 3-periodic fifth-order Hidden Markov Model (HMM). Additionally, the HMM of ESTScan has been extended to allow insertions and deletions when these improve the coding region statistics.
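
The hexanucleotide bias mentioned above can be illustrated with a much simpler model than the one ESTScan uses. The following is a minimal sketch, not ESTScan's algorithm: it ignores the 3-periodic HMM structure and the insertion/deletion states, and simply sums log-odds of hexamer frequencies between a coding and a non-coding training set. The function names, the pseudocount, and the toy training sequences are all hypothetical.

```python
import math
from collections import Counter

def hexamer_counts(seqs):
    """Count overlapping 6-mers across a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 5):
            counts[s[i:i + 6]] += 1
    return counts

def log_odds_scorer(coding, noncoding, pseudo=1.0):
    """Return a function mapping a hexamer to log(P_coding / P_noncoding).

    A pseudocount is added for each of the 4**6 = 4096 possible hexamers
    so that unseen hexamers do not produce a division by zero.
    """
    c, n = hexamer_counts(coding), hexamer_counts(noncoding)
    c_total = sum(c.values()) + pseudo * 4096
    n_total = sum(n.values()) + pseudo * 4096

    def score(hexamer):
        p_coding = (c[hexamer] + pseudo) / c_total
        p_noncoding = (n[hexamer] + pseudo) / n_total
        return math.log(p_coding / p_noncoding)

    return score

def coding_score(seq, score):
    """Sum the hexamer log-odds over every window of the sequence."""
    seq = seq.upper()
    return sum(score(seq[i:i + 6]) for i in range(len(seq) - 5))

# Toy training data (hypothetical, for illustration only).
score = log_odds_scorer(["ATGGCCGCCGCCGCCGCC"], ["ATATATATATATATATAT"])
print(coding_score("GCCGCCGCCGCC", score) > 0)  # resembles the coding set
print(coding_score("ATATATATATAT", score) < 0)  # resembles the non-coding set
```

A positive total suggests the sequence looks more like the coding training set. Because no open reading frame is required for such a score, this also hints at why a hexamer-based detector can tolerate frameshifts that would break an ORF finder.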

CAP3 Sequence Assembly Program (tool)

RRID:SCR_007250

This form allows you to assemble a set of contiguous sequences (contigs) with the CAP3 program. The CAP3 program can clip 5' and 3' low-quality regions of reads. It uses base quality values in the computation of overlaps between reads, the construction of multiple sequence alignments of reads, and the generation of consensus sequences. The program also uses forward-reverse constraints to correct assembly errors and link contigs. Results of CAP3 on four BAC data sets are presented. The performance of CAP3 was compared with that of PHRAP on a number of BAC data sets. PHRAP often produces longer contigs than CAP3, whereas CAP3 often produces fewer errors in consensus sequences than PHRAP. It is easier to construct scaffolds with CAP3 than with PHRAP on low-pass data with forward-reverse constraints. Sponsors: This project was supported by NIH Grant R01HG01502-02 from NHGRI. Keywords: CAP3, Program, Form, Computation, DNA, Dataset, Database.

GMAP (tool)

RRID:SCR_008992

THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. A software program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.

Suite of Nucleotide Analysis Programs (tool)

RRID:SCR_009399

THIS RESOURCE IS NO LONGER IN SERVICE, documented May 10, 2017. A pilot effort that has developed a centralized, web-based biospecimen locator that presents biospecimens collected and stored at participating Arizona hospitals and biospecimen banks, which are available for acquisition and use by researchers. Researchers may use this site to browse, search, and request biospecimens to use in qualified studies. The development of the ABL was guided by the Arizona Biospecimen Consortium (ABC), a consortium of hospitals and medical centers in the Phoenix area, and is now being piloted by this Consortium under the direction of ABRC. You may browse by type (cells, fluid, molecular, tissue) or disease. Common data elements decided by the ABC Standards Committee, based on data elements in the National Cancer Institute's (NCI's) Common Biorepository Model (CBM), are displayed. These describe the minimum set of data elements that the NCI determined were most important for a researcher to see about a biospecimen. The ABL currently does not display information on whether clinical data is available to accompany the biospecimens. However, a requester can solicit clinical data in the request. Once a request is approved, the biospecimen provider will contact the requester to discuss the request (and the requester's questions) before finalizing the invoice and shipment. The ABL is available to the public to browse. To request biospecimens from the ABL, the researcher must submit the required information. After submission, shipment of the requested biospecimen(s) depends on scientific and institutional review approval. Account required. Registration is open to everyone (documented September 29, 2016).
A workbench tool to make existing population genetic software more accessible and to facilitate the integration of new tools for analyzing patterns of DNA sequence variation, within a phylogenetic context. Collectively, SNAP tools can serve as a bridge between theoretical and applied population genetic analysis. The exploration of DNA sequence variation for making inferences on evolutionary processes in populations requires the coordinated implementation of a Suite of Nucleotide Analysis Programs (SNAP), each bound by specific assumptions and limitations.

ABySS (tool)

RRID:SCR_010709

Software providing a de novo, parallel, paired-end sequence assembler designed for short reads. ABySS 1.0 originally showed that assembling the human genome from short 50 bp sequencing reads was possible by aggregating the half terabyte of compute memory needed across several computers using a standardized message passing system (MPI). ABySS 2.0 (Resource-Efficient Assembly of Large Genomes Using a Bloom Filter) departs from MPI and instead implements algorithms that employ a Bloom filter, a probabilistic data structure, to represent the de Bruijn graph and reduce memory requirements.
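
To make the Bloom filter idea concrete, here is a minimal sketch of how a de Bruijn graph can be represented implicitly by a set-membership structure. It is not ABySS code: the class, the hash scheme (SHA-256 with a seed prefix), the filter size, and the function names are assumptions chosen for readability, and details a real assembler needs (canonical k-mers, reverse complements, false-positive handling) are left out.

```python
import hashlib

class BloomFilter:
    """A small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, n_bits=1 << 16, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits // 8 + 1)

    def _positions(self, item):
        # Derive n_hashes bit positions by hashing the item with a seed prefix.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(item))

def build_kmer_filter(reads, k, bloom):
    """Insert every k-mer of every read into the filter."""
    for read in reads:
        for i in range(len(read) - k + 1):
            bloom.add(read[i:i + k])
    return bloom

def successors(kmer, bloom):
    """Implicit de Bruijn edges: query the four possible one-base extensions."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]

bloom = build_kmer_filter(["ACGTACGGA"], 4, BloomFilter())
print(successors("ACGT", bloom))
print(successors("TACG", bloom))
```

The graph is never stored explicitly: edges are recovered by asking the filter whether each possible neighbor exists. The memory cost is fixed by the bit array rather than by the number of k-mers, at the price of occasional false-positive edges.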

TBLASTN (tool)

RRID:SCR_011822

Tool to search translated nucleotide databases using a protein query.
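
The translation step that such a search relies on can be sketched as follows. This is an illustrative toy, not TBLASTN: it translates a nucleotide sequence in all six reading frames using the standard genetic code and then looks for an exact substring match, whereas the real tool performs scored local alignment. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Return the reverse complement of an uppercase ACGT string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

def find_protein(query, dna):
    """Exact-match stand-in for the alignment step of a translated search."""
    return [frame for frame, protein in six_frame_translations(dna).items()
            if query in protein]

print(six_frame_translations("ATGGCCAAATAA")["+1"])  # MAK*
print(find_protein("MAK", "CCATGGCCAAATAA"))         # ['+3']
print(find_protein("MAK", "TTATTTGGCCAT"))           # ['-1']
```

Searching all six frames is what lets a protein query find a gene regardless of strand or codon phase in an unannotated nucleotide database.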

BLAT (tool)

RRID:SCR_011919

Software designed to quickly find sequences of 95% and greater similarity of length 25 bases or more.

OrthoDB (tool)

RRID:SCR_011980

Database of orthologous protein coding genes across vertebrates, arthropods, fungi, basal metazoans, and bacteria.

BUSCO (tool)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.

Falcon (tool)

RRID:SCR_016089

Software package that serves as a diploid-aware genome assembler for long sequencing reads. Used for assembling non-inbred or rearranged heterozygous genomes.

BioNano Irys system (tool)

RRID:SCR_016754

System by BioNano Genomics (formerly BioNanomatrix) that provides optical next-generation mapping (NGM). Used for sequence assembly and structural variation analysis. Scaffolds Bionano genome mapping data with sequencing data to improve assembly contiguity, reduce the sequencing coverage needed, and automatically correct errors in sequencing-based assemblies.
