Chromatin loops are a major component of 3D nuclear organization, visually apparent as intense point-to-point interactions in Hi-C maps. Identification of these loops is a critical part of most Hi-C analyses. However, current methods often miss visually evident CTCF loops in Hi-C data sets from mammals, and they completely fail to identify high-intensity loops in other organisms. We present SIP (Significant Interaction Peak caller) and SIPMeta, which are platform-independent programs to identify and characterize these loops in a time- and memory-efficient manner. We show that SIP is robust to noise and to differences in sequencing depth, and can be used to detect loops that were previously missed in human cells as well as loops in other organisms. SIPMeta corrects for a common visualization artifact by accounting for Manhattan distance to create average plots of Hi-C and HiChIP data. We then demonstrate that the use of SIP and SIPMeta can lead to biological insights by characterizing the contribution of several transcription factors to CTCF loop stability in human cells. We also annotate loops associated with the SMC component of the dosage compensation complex (DCC) in Caenorhabditis elegans and demonstrate that loop anchors represent bidirectional blocks for symmetrical loop extrusion. This contrasts with mammals, where extrusion is presumed to proceed asymmetrically until it is blocked unidirectionally by CTCF. Using HiChIP and multiway ligation events, we then show that DCC loops form a network of strong interactions that may contribute to X Chromosome-wide condensation in C. elegans hermaphrodites.
PubMed ID: 32127418
Functional genomics data repository supporting MIAME-compliant data submissions. Includes microarray-based experiments measuring the abundance of mRNA, genomic DNA, and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. Array- and sequence-based data are accepted. Collection of curated gene expression DataSets, as well as original Series and Platform records. The database can be searched using keywords, organism, DataSet type and authors. DataSet records contain additional resources including cluster tools and differential expression queries.
Cell line HCT 116 is a cancer cell line with a species of origin Homo sapiens (human).
Cell line GM12878 is a transformed cell line with a species of origin Homo sapiens (human).