More than 25 inherited human disorders are caused by the unstable expansion of repetitive DNA sequences termed short tandem repeats (STRs). A fundamental unresolved question is why some STRs are susceptible to pathologic expansion, whereas thousands of repeat tracts across the human genome are relatively stable. Here, we discover that nearly all disease-associated STRs (daSTRs) are located at boundaries demarcating 3D chromatin domains. We identify a subset of boundaries with markedly higher CpG island density compared to the rest of the genome. daSTRs specifically localize to ultra-high-density CpG island boundaries, suggesting they might be hotspots for epigenetic misregulation or topological disruption linked to STR expansion. Fragile X syndrome patients exhibit severe boundary disruption in a manner that correlates with local loss of CTCF occupancy and the degree of FMR1 silencing. Our data uncover higher-order chromatin architecture as a new dimension in understanding repeat expansion disorders.
PubMed ID: 30173918
Collection of curated, non-redundant genomic DNA, transcript RNA, and protein sequences produced by NCBI. Provides a reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. Accessed through the Nucleotide and Protein databases.
This unknown targets Rabbit IgG Control
Python-based tools to process, visualize, and analyse high-throughput sequencing data, such as ChIP-seq, RNA-seq, or MNase-seq. Implemented within the Galaxy framework. Used to perform complete bioinformatic workflows ranging from quality control and normalization of aligned reads to integrative analyses, including clustering and visualization approaches.
Suite of motif-based sequence analysis tools to discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences; search sequence databases with motifs using MAST, FIMO, MCAST or GLAM2SCAN; compare a motif to all motifs in a database of motifs; associate motifs with Gene Ontology terms via their putative target genes; and analyze motif enrichment using SpaMo or CentriMo. Source code, binaries and a web server are freely available for noncommercial use.
A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
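To make the "intersect" and "merge" operations concrete, here is a minimal plain-Python sketch of the two ideas on half-open, BED-style intervals. It is an illustration of the concepts only, not how bedtools is implemented; the function names `merge_intervals` and `intersect_intervals` and the coordinates are hypothetical and chosen for this example.

```python
from typing import List, Tuple

# (chrom, start, end) with a half-open range [start, end), as in BED files.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Collapse overlapping or directly adjacent intervals on each chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Same chromosome and touching/overlapping: extend the last interval.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b.

    A simple quadratic scan, adequate for small inputs in an illustration.
    """
    overlaps: List[Interval] = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # half-open ranges: equal endpoints do not overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


# Made-up example coordinates.
features = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
regions = [("chr1", 250, 400), ("chr2", 80, 120)]

merged_features = merge_intervals(features)
shared = intersect_intervals(merged_features, regions)
```

Chaining the two functions mirrors, in spirit, the way simple interval operations can be combined into a larger analysis.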
Python software package for identifying transcription factor binding sites. Used to evaluate the significance of enriched ChIP regions. Improves the spatial resolution of binding sites by combining information on both sequencing tag position and orientation. Can be used for ChIP-Seq data alone, or with a control sample to increase specificity.
Programming language for all operating systems that lets users work more quickly and integrate their systems more effectively. Often compared to Tcl, Perl, Ruby, Scheme or Java. Some of its key distinguishing features include very clear and readable syntax, strong introspection capabilities, intuitive object orientation, natural expression of procedural code, full modularity, exception-based error handling, high-level dynamic data types, extensive standard libraries and third-party modules for virtually every task, extensions and modules easily written in C, C++ (or Java for Jython, or .NET languages for IronPython), and embeddable within applications as a scripting interface.