FDI Lab - SciCrunch.org | Searching in Literature

The European Nucleotide Archive in 2018.

Peter W Harrison et al.
Nucleic acids research
2019

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena), provided from EMBL-EBI, has for more than three decades been responsible for archiving the world's public sequencing data and presenting this important resource to the scientific community to support and accelerate the global research effort. Here, we outline ENA services and content in 2018 and provide an overview of a selection of focus areas of development work: extending data coordination services around ENA, sequence submissions through template expansion, early pre-submission validation tools and our move towards a new browser and retrieval infrastructure.

Manual annotation and analysis of the defensin gene cluster in the C57BL/6J mouse reference genome.

Clara Amid et al.
BMC genomics
2009

Host defense peptides are a critical component of the innate immune system. Human alpha- and beta-defensin genes are subject to copy number variation (CNV) and historically the organization of mouse alpha-defensin genes has been poorly defined. Here we present the first full manual genomic annotation of the mouse defensin region on Chromosome 8 of the reference strain C57BL/6J, and the analysis of the orthologous regions of the human and rat genomes. Problems were identified with the reference assemblies of all three genomes. Defensins have been studied for over two decades and their naming has become a critical issue due to incorrect identification of defensin genes derived from different mouse strains and the duplicated nature of this region.

The COMPARE Data Hubs.

Clara Amid et al.
Database : the journal of biological databases and curation
2019

Data sharing enables research communities to exchange findings and build upon the knowledge that arises from their discoveries. Areas of public and animal health as well as food safety would benefit from rapid data sharing when it comes to emergencies. However, ethical, regulatory and institutional challenges, as well as lack of suitable platforms which provide an infrastructure for data sharing in structured formats, often lead to data not being shared or at most shared in form of supplementary materials in journal publications. Here, we describe an informatics platform that includes workflows for structured data storage, managing and pre-publication sharing of pathogen sequencing data and its analysis interpretations with relevant stakeholders.

The European Nucleotide Archive in 2019.

Clara Amid et al.
Nucleic acids research
2020

The European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) at the European Molecular Biology Laboratory's European Bioinformatics Institute provides open and freely available data deposition and access services across the spectrum of nucleotide sequence data types. Making the world's public sequencing datasets available to the scientific community, the ENA represents a globally comprehensive nucleotide sequence resource. Here, we outline ENA services and content in 2019 and provide an insight into selected key areas of development in this period.

Structural and functional annotation of the porcine immunome.

Harry D Dawson et al.
BMC genomics
2013

The domestic pig is known as an excellent model for human immunology and the two species share many pathogens. Susceptibility to infectious disease is one of the major constraints on swine performance, yet the structure and function of genes comprising the pig immunome are not well-characterized. The completion of the pig genome provides the opportunity to annotate the pig immunome, and compare and contrast pig and human immune systems.

Global monitoring of antimicrobial resistance based on metagenomics analyses of urban sewage.

Rene S Hendriksen et al.
Nature communications
2019

Antimicrobial resistance (AMR) is a serious threat to global public health, but obtaining representative data on AMR for healthy human populations is difficult. Here, we use metagenomic analysis of untreated sewage to characterize the bacterial resistome from 79 sites in 60 countries. We find systematic differences in abundance and diversity of AMR genes between Europe/North-America/Oceania and Africa/Asia/South-America. Antimicrobial use data and bacterial taxonomy only explains a minor part of the AMR variation that we observe. We find no evidence for cross-selection between antimicrobial classes, or for effect of air travel between sites. However, AMR gene abundance strongly correlates with socio-economic, health and environmental factors, which we use to predict AMR gene abundances in all countries in the world. Our findings suggest that global AMR gene diversity and abundance vary by region, and that improving sanitation and health could potentially limit the global burden of AMR. We propose metagenomic analysis of sewage as an ethically acceptable and economically feasible approach for continuous global surveillance and prediction of AMR.

Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition.

Adriana Alberti et al.
Scientific data
2017

A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acids extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata to the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

Tadashi Imanishi et al.
PLoS biology
2004

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.

Accelerating surveillance and research of antimicrobial resistance - an online repository for sharing of antimicrobial susceptibility data associated with whole-genome sequences.

Sébastien Matamoros et al.
Microbial genomics
2020

Antimicrobial resistance (AMR) is an emerging threat to modern medicine. Improved diagnostics and surveillance of resistant bacteria require the development of next-generation analysis tools and collaboration between international partners. Here, we present the 'AMR Data Hub', an online infrastructure for storage and sharing of structured phenotypic AMR data linked to bacterial whole-genome sequences. Leveraging infrastructure built by the European COMPARE Consortium and structured around the European Nucleotide Archive (ENA), the AMR Data Hub already provides an extensive data collection of more than 2500 isolates with linked genome and AMR data. Representing these data in standardized formats, we provide tools for the validation and submission of new data and services supporting search, browse and retrieval. The current collection was created through a collaboration by several partners from the European COMPARE Consortium, demonstrating the capacities and utility of the AMR Data Hub and its associated tools. We anticipate growth of content and offer the hub as a basis for future research into methods to explore and predict AMR.

Metagenomics-Based Proficiency Test of Smoked Salmon Spiked with a Mock Community.

Claudia Sala et al.
Microorganisms
2020

An inter-laboratory proficiency test was organized to assess the ability of participants to perform shotgun metagenomic sequencing of cold smoked salmon, experimentally spiked with a mock community composed of six bacteria, one parasite, one yeast, one DNA, and two RNA viruses. Each participant applied its in-house wet-lab workflow(s) to obtain the metagenomic dataset(s), which were then collected and analyzed using MG-RAST. A total of 27 datasets were analyzed. Sample pre-processing, DNA extraction protocol, library preparation kit, and sequencing platform, influenced the abundance of specific microorganisms of the mock community. Our results highlight that despite differences in wet-lab protocols, the reads corresponding to the mock community members spiked in the cold smoked salmon, were both detected and quantified in terms of relative abundance, in the metagenomic datasets, proving the suitability of shotgun metagenomic sequencing as a genomic tool to detect microorganisms belonging to different domains in the same food matrix. The implementation of standardized wet-lab protocols would highly facilitate the comparability of shotgun metagenomic sequencing dataset across laboratories and sectors. Moreover, there is a need for clearly defining a sequencing reads threshold, to consider pathogens as detected or undetected in a food sample.

Beyond the Genome: genomics research ten years after the human genome sequence.

Amanda M Casto et al.
Genome biology
2010

A report on the meeting 'Beyond the Genome', Boston, USA, 11-13 October 2010.

Major submissions tool developments at the European Nucleotide Archive.

Clara Amid et al.
Nucleic acids research
2012

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena), Europe's primary nucleotide sequence resource, captures and presents globally comprehensive nucleic acid sequence and associated information. Covering the spectrum from raw data to assembled and functionally annotated genomes, the ENA has witnessed a dramatic growth resulting from advances in sequencing technology and ever broadening application of the methodology. During 2011, we have continued to operate and extend the broad range of ENA services. In particular, we have released major new functionality in our interactive web submission system, Webin, through developments in template-based submissions for annotated sequences and support for raw next-generation sequence read submissions.

Comparison of sequencing methods and data processing pipelines for whole genome sequencing and minority single nucleotide variant (mSNV) analysis during an influenza A/H5N8 outbreak.

Marjolein J Poen et al.
PloS one
2020

As high-throughput sequencing technologies are becoming more widely adopted for analysing pathogens in disease outbreaks there needs to be assurance that the different sequencing technologies and approaches to data analysis will yield reliable and comparable results. Conversely, understanding where agreement cannot be achieved provides insight into the limitations of these approaches and also allows efforts to be focused on areas of the process that need improvement. This manuscript describes the next-generation sequencing of three closely related viruses, each analysed using different sequencing strategies, sequencing instruments and data processing pipelines. In order to determine the comparability of consensus sequences and minority (sub-consensus) single nucleotide variant (mSNV) identification, the biological samples, the sequence data from 3 sequencing platforms and the *.bam quality-trimmed alignment files of raw data of 3 influenza A/H5N8 viruses were shared. This analysis demonstrated that variation in the final result could be attributed to all stages in the process, but the most critical were the well-known homopolymer errors introduced by 454 sequencing, and the alignment processes in the different data processing pipelines which affected the consistency of mSNV detection. However, homopolymer errors aside, there was generally a good agreement between consensus sequences that were obtained for all combinations of sequencing platforms and data processing pipelines. Nevertheless, minority variant analysis will need a different level of careful standardization and awareness about the possible limitations, as shown in this study.

Value, but high costs in post-deposition data curation.

Petra ten Hoopen et al.
Database : the journal of biological databases and curation
2016

Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena.

Facing growth in the European Nucleotide Archive.

Guy Cochrane et al.
Nucleic acids research
2013

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to integrated content across the breadth of ENA. Second, we detail our plans for the deployment of CRAM data compression technology in ENA.

Assembly information services in the European Nucleotide Archive.

Nima Pakseresht et al.
Nucleic acids research
2014

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work and many enhancements have already been made available over the year to components of the submission service. In this article, we briefly review ENA content and growth over 2013, describe our rapidly developing services for genome assembly information and outline further major developments over the last year.

The European Nucleotide Archive in 2017.

Nicole Silvester et al.
Nucleic acids research
2018

For 35 years the European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) has been responsible for making the world's public sequencing data available to the scientific community. Advances in sequencing technology have driven exponential growth in the volume of data to be processed and stored and a substantial broadening of the user community. Here, we outline ENA services and content in 2017 and provide insight into a selection of current key areas of development in ENA driven by challenges arising from the above growth.

Minimum Information about an Uncultivated Virus Genome (MIUViG).

Simon Roux et al.
Nature biotechnology
2019

We present an extension of the Minimum Information about any (x) Sequence (MIxS) standard for reporting sequences of uncultivated virus genomes. Minimum Information about an Uncultivated Virus Genome (MIUViG) standards were developed within the Genomic Standards Consortium framework and include virus origin, genome quality, genome annotation, taxonomic classification, biogeographic distribution and in silico host prediction. Community-wide adoption of MIUViG standards, which complement the Minimum Information about a Single Amplified Genome (MISAG) and Metagenome-Assembled Genome (MIMAG) standards for uncultivated bacteria and archaea, will improve the reporting of uncultivated virus genomes in public databases. In turn, this should enable more robust comparative studies and a systematic exploration of the global virosphere.

European Nucleotide Archive in 2016.

Ana Luisa Toribio et al.
Nucleic acids research
2017

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) offers a rich platform for data sharing, publishing and archiving and a globally comprehensive data set for onward use by the scientific community. With a broad scope spanning raw sequencing reads, genome assemblies and functional annotation, the resource provides extensive data submission, search and download facilities across web and programmatic interfaces. Here, we outline ENA content and major access modalities, highlight major developments in 2016 and outline a number of examples of data reuse from ENA.

Content discovery and retrieval services at the European Nucleotide Archive.

Nicole Silvester et al.
Nucleic acids research
2015

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary resource for nucleotide sequence information. With the growing volume and diversity of public sequencing data comes the need for increased sophistication in data organisation, presentation and search services so as to maximise its discoverability and usability. In response to this, ENA has been introducing and improving checklists for use during submission and expanding its search facilities to provide targeted search results. Here, we give a brief update on ENA content and some major developments undertaken in data submission services during 2014. We then describe in more detail the services we offer for data discovery and retrieval.
