Mouse E14 embryonic stem cells (ESCs) are the most used ESC line, often employed for genome-wide studies involving next generation sequencing analysis [1-5]. More than 2 × 10 E9 sequences made on Illumina platform derived from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7 × 10 E6 single nucleotide variant [6]. The database was validated using other two sequencing datasets from other laboratory and high overlap was observed. The identified variants are enriched on intergenic regions, but several thousands reside on gene exons and regulatory regions, such as promoters, enhancers, splicing site and untranslated regions of RNA, thus indicating high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly including the new identified variants and used it to map reads from next generation sequencing data generated in our laboratory or in others on E14 cell line. We observed an increase in the number of mapped reads of about 5%. CpG dinucleotide showed the higher variation frequency, probably because it could be a target of DNA methylation. Data were deposited in GEO datasets under reference GSM1283021 and here: http://epigenetics.hugef-research.org/data.php.
Pubmed ID: 26484140 RIS Download
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Software package for working with VCF files. Used to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.Implements various utilities for processing Variant Call Format files, including validation, merging, comparing. Provides general Perl API.
View all literature mentionsSoftware tool as collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
View all literature mentionsA software package to analyze next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. This software library makes writing efficient analysis tools using next-generation sequencing data very easy, and second it's a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include things like a depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner. (entry from Genetic Analysis Software)
View all literature mentionsQuality control software that perform checks on raw sequence data coming from high throughput sequencing pipelines. This software also provides a modular set of analyses which can give a quick impression of the quality of the data prior to further analysis.
View all literature mentionsSoftware package that creates genomic builds, calls SNPs, detects indels, and counts reads from data generated from one or more sequencing runs. In addition, CASAVA automatically generates a range of statistics, such as mean depth and percentage chromosome coverage, to enable comparison with previous builds or other samples. CASAVA analyzes sequencing reads in three stages: * FASTQ file generation and demultiplexing * Alignment to a reference genome * Variant detection and counting
View all literature mentionsOriginal SAMTOOLS package has been split into three separate repositories including Samtools, BCFtools and HTSlib. Samtools for manipulating next generation sequencing data used for reading, writing, editing, indexing,viewing nucleotide alignments in SAM,BAM,CRAM format. BCFtools used for reading, writing BCF2,VCF, gVCF files and calling, filtering, summarising SNP and short indel sequence variants. HTSlib used for reading, writing high throughput sequencing data.
View all literature mentionsSoftware ultrafast memory efficient tool for aligning sequencing reads. Bowtie is short read aligner.
View all literature mentions