Chronic lymphocytic leukaemia (CLL) is characterized by substantial clinical heterogeneity, despite relatively few genetic alterations. To provide a basis for studying epigenome deregulation in CLL, here we present genome-wide chromatin accessibility maps for 88 CLL samples from 55 patients, measured by the ATAC-seq assay. We also performed ChIPmentation and RNA-seq profiling for ten representative samples. Based on the resulting data set, we devised and applied a bioinformatic method that links chromatin profiles to clinical annotations. Our analysis identified sample-specific variation on top of a shared core of CLL regulatory regions. IGHV mutation status, which distinguishes the two major subtypes of CLL, was accurately predicted by the chromatin profiles, and gene regulatory networks inferred for IGHV-mutated versus IGHV-unmutated samples identified characteristic differences between these two disease subtypes. In summary, we discovered widespread heterogeneity in the chromatin landscape of CLL, established a community resource for studying epigenome deregulation in leukaemia, and demonstrated the feasibility of large-scale chromatin accessibility mapping in cancer cohorts and clinical research.
PubMed ID: 27346425
Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
scikit-learn: machine learning in Python
Open-source database of a curated, non-redundant set of profiles derived from published collections of experimentally defined transcription factor binding sites for multicellular eukaryotes. Its guiding principles are open data access, non-redundancy and quality. JASPAR CORE is a smaller set that is non-redundant and curated. The database is a collection of transcription factor DNA-binding preferences, modeled as matrices. These can be converted into Position Weight Matrices (PWMs or PSSMs), which are used for scanning genomic sequences. It provides a web interface for browsing, searching and subset selection, an online sequence analysis utility, and a suite of programming tools for genome-wide and comparative genomic analysis of regulatory regions. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models, and a language-independent Web Service application programming interface for matrix retrieval.
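The conversion from a count matrix to a Position Weight Matrix, and its use for scanning a sequence, can be sketched roughly as follows. This is a minimal illustration in plain Python, not JASPAR's own tooling: the count matrix is invented, and the pseudocount of 0.5, the uniform background of 0.25 and the function names `pfm_to_pwm` and `scan` are all assumptions made for the example.

```python
import math

# Hypothetical position frequency matrix (counts per base at each motif
# position). The values are invented for illustration, not taken from JASPAR.
PFM = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [1, 0, 1, 8],
}


def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2-odds scores against a uniform background.

    The pseudocount and background values are assumptions for this sketch.
    """
    length = len(next(iter(pfm.values())))
    pwm = {base: [] for base in pfm}
    for i in range(length):
        # Column total including the pseudocount added to every base.
        total = sum(pfm[b][i] for b in pfm) + pseudocount * len(pfm)
        for b in pfm:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (offset, score) pairs."""
    length = len(next(iter(pwm.values())))
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        score = sum(pwm[base][i] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


pwm = pfm_to_pwm(PFM)
hits = scan("TTAGCTAA", pwm)
best_offset, best_score = max(hits, key=lambda h: h[1])
print(best_offset, round(best_score, 2))
```

In practice a score threshold would be chosen to decide which windows count as putative binding sites, and both strands would usually be scanned; both are omitted here for brevity.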
Open-source software for network visualization and analysis that helps data analysts intuitively reveal patterns and trends, highlight outliers and tell stories with their data. It uses a 3D render engine to display large graphs in real time and to speed up exploration. Gephi combines built-in functionalities and a flexible architecture to explore, analyze, spatialize, filter, clusterize, manipulate and export all types of networks. Gephi runs on Windows, Linux and Mac OS X. Gephi is based on a visualize-and-manipulate paradigm which allows any user to discover networks and data properties. Moreover, it is designed to follow the chain of a case study, from data file to printable maps. It is open-source and free (GNU General Public License). Applications:
* Exploratory Data Analysis: intuition-oriented analysis by network manipulation in real time.
* Link Analysis: revealing the underlying structures of associations between objects, in particular in scale-free networks.
* Social Network Analysis: easy creation of social data connectors to map community organizations and small-world networks.
* Biological Network Analysis: representing patterns of biological data.
* Poster creation: scientific work promotion with high-quality printable maps.
The Gephi 0.7 architecture is modular and therefore allows developers to add and extend functionalities with ease. New features like Metrics, Layout, Filters, Data sources and more can be easily packaged in plugins and shared. The built-in Plugins Center automatically gets the list of plugins available from the Gephi Plugin portal and takes care of all software updates. Download, comment, and rate plugins provided by community members and third-party companies, or post your own contributions!
A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
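To make the idea of intersecting two interval files concrete, here is a small plain-Python sketch of the underlying operation. It is only an illustration of the concept and not how bedtools is implemented: the function name `intersect`, the sample intervals and the quadratic per-chromosome comparison are assumptions chosen for clarity over speed. Coordinates are treated as 0-based and half-open, as in the BED format.

```python
from collections import defaultdict


def intersect(a, b):
    """Return the overlapping portions of intervals in `a` with those in `b`.

    Each interval is a (chrom, start, end) tuple with 0-based, half-open
    coordinates. This is a simple illustrative version, not bedtools itself.
    """
    # Group the second set by chromosome so only same-chromosome
    # intervals are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))

    overlaps = []
    for chrom, start, end in a:
        for b_start, b_end in by_chrom.get(chrom, []):
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # non-empty overlap
                overlaps.append((chrom, lo, hi))
    return overlaps


# Hypothetical example intervals (for instance, accessibility peaks and
# annotated regions); the coordinates are invented.
peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
regions = [("chr1", 150, 350), ("chr2", 200, 300)]

shared = intersect(peaks, regions)
print(shared)
```

For real data sets, a sorted sweep or an interval tree would scale far better than this nested loop.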
A Python-based environment of open-source software for mathematics, science, and engineering. The core packages of SciPy include: NumPy, a base N-dimensional array package; SciPy Library, a fundamental library for scientific computing; and IPython, an enhanced interactive console.
NumPy is the fundamental package needed for scientific computing with Python. It contains, among other things:
* a powerful N-dimensional array object
* sophisticated (broadcasting) functions
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities.
Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases. Sponsored by ENTHOUGHT
A software application for inferring expression levels of individual transcripts from sequencing (RNA-Seq) data and estimating differential expression (DE) between conditions.
Software package for differential gene expression analysis based on the negative binomial distribution. It is used for differential analysis of RNA-seq count data, applying shrinkage estimation for dispersions and fold changes to improve the stability and interpretability of estimates.