Next-generation sequencing (NGS) technology has paved the way for rapid and cost-efficient de novo sequencing of bacterial genomes. In particular, the introduction of PCR-free library preparation procedures (LPPs) led to major improvements, as PCR bias is largely reduced. However, to facilitate the assembly of Illumina paired-end sequence data and to enhance assembly performance, larger insert sizes are needed to improve the repeat bridging and resolution capabilities of current state-of-the-art assembly tools. In addition, information on the relationships between genomic GC content, library insert size and sequencing quality, as well as on the influence of library insert size, read length and sequencing depth on assembly performance, would help in planning sequencing projects in a targeted way.
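The quantities named in the abstract (GC content, read length, sequencing depth) can be made concrete with some simple arithmetic. The following is a minimal sketch, not taken from the study: it uses the standard expected-coverage estimate C = N × L / G, and the function names and the example numbers (a 5 Mb genome, 250 bp reads) are illustrative assumptions.

```python
import math


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def expected_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total bases reach the target depth."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: 1,000,000 read pairs (2,000,000 reads) of 250 bp
# on a 5 Mb bacterial genome.
depth = expected_coverage(2_000_000, 250, 5_000_000)
```

With these assumed numbers the estimated mean depth is 100-fold; real depth varies along the genome, for example with local GC content.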
PubMed ID: 27176120
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Database of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project. It is a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. Submissions are supported by a web-based Submission Portal. The database facilitates the organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives, and serves as a central portal that informs users of data availability. BioProject records link to corresponding data stored in archival repositories. The BioProject resource is a redesigned, expanded replacement of the NCBI Genome Project resource. The redesign adds tracking of several data elements, including more precise information about a project's scope, material, and objectives. Genome Project identifiers are retained in BioProject as the ID value for a record, and an Accession number has been added. Database content is exchanged with other members of the International Nucleotide Sequence Database Collaboration (INSDC). BioProject is accessible via FTP.
Repository of raw sequencing data from next-generation sequencing platforms, including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, Complete Genomics, and Pacific Biosciences SMRT. In addition to raw sequence data, SRA now stores alignment information in the form of read placements on a reference sequence. Data submissions are welcome. The archive of high-throughput sequencing data is part of an international partnership of archives (INSDC) at NCBI, the European Bioinformatics Institute and the DNA Data Bank of Japan. Data submitted to any of these three organizations are shared among them.
Quality control software that performs checks on raw sequence data coming from high-throughput sequencing pipelines. It also provides a modular set of analyses that give a quick impression of data quality prior to further analysis.
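One kind of check such software might run on raw reads is a per-read mean base quality. The sketch below is illustrative only and is not the tool's implementation. It assumes four-line FASTQ records and Phred+33 quality encoding, and the function names are hypothetical.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def parse_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) ends parsing
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode an ASCII quality string into Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


def mean_quality(quality: str) -> float:
    """Mean Phred score of one read (0.0 for an empty quality string)."""
    scores = phred_scores(quality)
    return sum(scores) / len(scores) if scores else 0.0


# Usage with an in-memory record; a real run would open a FASTQ file instead.
records = list(parse_fastq(io.StringIO("@r1\nACGT\n+\nII!!\n")))
per_read = {read_id: mean_quality(qual) for read_id, _, qual in records}
```

A real pipeline would also look at per-position quality, adapter content and duplication, which this sketch leaves out.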
Quality assessment software tool for evaluating and comparing genome assemblies. It works both with and without a given reference genome. It produces many reports, summary tables and plots.
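A widely used reference-free contiguity metric in assembly evaluation is N50: the length of the contig at which the running sum of contig lengths, sorted from longest to shortest, first reaches half of the total assembly length. The following is a minimal sketch of that definition, not code from the tool described above. The contig lengths are made-up example values.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths (0 if there are no contigs)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical assembly with six contigs, 290 bases in total.
example_n50 = n50([80, 70, 50, 40, 30, 20])
```

Here the two longest contigs (80 + 70 = 150) already cover at least half of the 290 bases, so the N50 is 70.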
De novo, parallel, paired-end sequence assembler designed for short reads. ABySS 1.0 originally showed that assembling the human genome from short 50 bp sequencing reads was possible by aggregating the half terabyte of compute memory needed across several computers using a standardized message passing system (MPI). ABySS 2.0, described as "Resource-Efficient Assembly of Large Genomes using a Bloom Filter", departs from MPI and instead implements algorithms that employ a Bloom filter, a probabilistic data structure, to represent the de Bruijn graph and reduce memory requirements.
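The idea of representing a de Bruijn graph with a Bloom filter can be sketched as follows: every k-mer from the reads is inserted into the filter, and graph edges are recovered implicitly by asking whether each of the four possible successor k-mers is present. This is a toy illustration under stated assumptions, not ABySS code. The class and function names, the SHA-256-based hashing and the filter size are all hypothetical choices.

```python
import hashlib
from typing import Iterator, List


class BloomFilter:
    """Fixed-size bit array with several hash functions; no false negatives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive one bit position per seed from a seeded SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def kmers(sequence: str, k: int) -> Iterator[str]:
    """Yield all overlapping substrings of length k."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def successors(bloom: BloomFilter, kmer: str) -> List[str]:
    """Successor k-mers reported present (may include false positives)."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


bloom = BloomFilter(num_bits=1 << 16, num_hashes=3)
for kmer in kmers("ACGTAC", 3):
    bloom.add(kmer)
```

Because the filter stores only bits and not the k-mers themselves, memory use is fixed in advance. The cost is occasional false-positive edges, which a real assembler has to detect and remove.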
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software package for de novo genomic assembly from short-read sequencing technologies using de Bruijn graphs. It takes in short read sequences, removes errors, then produces high-quality unique contigs and retrieves the repeated areas between contigs. It can leverage very short reads in combination with read pairs to produce useful assemblies. Operating system: Unix/Linux.
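The de Bruijn graph approach mentioned above can be illustrated with a toy example: each k-mer in the reads becomes an edge between its (k-1)-mer prefix and its (k-1)-mer suffix, and contigs are read off along paths without branches. This is a simplified sketch, not the package's algorithm. It does no error removal or repeat resolution, it checks only out-degree when walking, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_de_bruijn(reads: Iterable[str], k: int) -> Dict[str, Set[str]]:
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes it leads to."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unbranched(graph: Dict[str, Set[str]], start: str) -> str:
    """Extend a contig from start while the current node has one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (next_node,) = graph[node]
        if next_node in seen:  # stop on cycles instead of looping forever
            break
        contig += next_node[-1]
        seen.add(next_node)
        node = next_node
    return contig


# Two overlapping toy reads that together spell ACGTT.
graph = build_de_bruijn(["ACGT", "CGTT"], 3)
contig = walk_unbranched(graph, "AC")
```

In this toy case the walk visits AC, CG, GT and TT and stops at TT, whose only successor would be itself.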