This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.
An avalanche of next generation sequencing (NGS) studies has generated an unprecedented amount of genomic structural variation data. These studies have also identified many novel gene fusion candidates with more detailed resolution than previously achieved. However, in the excitement and necessity of publishing the observations from this recently developed cutting-edge technology, no community standardization approach has arisen to organize and represent the data with the essential attributes in an interchangeable manner. As transcriptome studies have been widely used for gene fusion discoveries, the current non-standard mode of data representation could potentially impede data accessibility, critical analyses, and further discoveries in the near future.
Gene fusion technology is a key tool in facilitating gene function studies. Hybrid molecules in which all the components are joined precisely, without the presence of intervening and unwanted extraneous sequences, enable accurate studies of molecules and the characterization of individual components. This article reviews situations in which seamlessly fused genes and proteins are required or desired and describes molecular approaches that are available for generating these hybrid molecules.
Fusion genes are both useful cancer biomarkers and important drug targets. Finding relevant fusion genes is challenging due to genomic instability resulting in a high number of passenger events. To reveal and prioritize relevant gene fusion events we have developed FUsionN Gene Identification toolset (FUNGI) that uses an ensemble of fusion detection algorithms with prioritization and visualization modules.
Double-stranded DNA breaks occur on a regular basis in the human genome as a consequence of genotoxic stress and errors during replication. Usually these breaks are rapidly and faithfully repaired, but occasionally different chromosomes, or different regions of the same chromosome, are fused to each other. Some of these aberrant chromosomal translocations yield functional recombinant genes, which have been implicated as the cause of a number of lymphomas, leukemias, sarcomas, and solid tumors. Reliable methods are needed for the in situ detection of the transcripts encoded by these recombinant genes. We have developed just such a method, utilizing single-molecule fluorescence in situ hybridization (sm-FISH), in which approximately 50 short fluorescent probes bind to adjacent sites on the same mRNA molecule, rendering each target mRNA molecule visible as a diffraction-limited spot in a fluorescence microscope. Utilizing this method, gene fusion transcripts are detected with two differently colored probe sets, each specific for one of the two recombinant segments of a target mRNA; enabling the fusion transcripts to be seen in the microscope as distinct spots that fluoresce in both colors. We demonstrate this method by detecting the BCR-ABL fusion transcripts that occur in chronic myeloid leukemia cells, and by detecting the EWSR1-FLI1 fusion transcripts that occur in Ewing's sarcoma cells. This technology should pave the way for accurate in situ typing of many cancers that are associated with, or caused by, fusion transcripts.
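The dual-color readout described in this abstract can be illustrated with a minimal sketch, assuming spot centroids have already been extracted from each fluorescence channel. The function name `colocalized_spots`, the coordinate format, and the 2-pixel distance cutoff are hypothetical choices for illustration and are not taken from the paper. Greedy nearest-neighbour matching stands in here for whatever colocalization procedure the authors actually used.

```python
import math

def colocalized_spots(spots_a, spots_b, max_dist=2.0):
    """Pair spots from two channels that lie within max_dist of each other.

    spots_a, spots_b: lists of (x, y) centroids, one list per probe color.
    max_dist: assumed colocalization cutoff in pixels (illustrative value).
    Returns (index_in_a, index_in_b) pairs; each spot is used at most once.
    """
    pairs = []
    used_b = set()
    for i, (xa, ya) in enumerate(spots_a):
        best_j, best_d = None, max_dist
        for j, (xb, yb) in enumerate(spots_b):
            if j in used_b:
                continue
            d = math.hypot(xa - xb, ya - yb)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used_b.add(best_j)
            pairs.append((i, best_j))
    return pairs

# Made-up coordinates: two spots overlap across channels, one in each does not.
channel_5prime = [(10.0, 10.0), (50.0, 50.0), (80.0, 20.0)]
channel_3prime = [(10.5, 10.5), (30.0, 30.0), (80.0, 21.0)]
fusion_candidates = colocalized_spots(channel_5prime, channel_3prime)
```

Spots that appear in both channels at the same position would be scored as candidate fusion transcripts, while single-color spots would correspond to transcripts of the unrearranged genes.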
Rhabdomyosarcoma is subclassified by the presence or absence of a recurrent chromosome translocation that fuses the FOXO1 and PAX3 or PAX7 genes. The fusion protein (FOXO1-PAX3/7) retains both binding domains and becomes a novel and potent transcriptional regulator in rhabdomyosarcoma subtypes. Many studies have characterized and integrated genomic, transcriptomic, and epigenomic differences among rhabdomyosarcoma subtypes that contain the FOXO1-PAX3/7 gene fusion and those that do not; however, few studies have examined how gene co-expression networks are altered by FOXO1-PAX3/7. Although transcriptional data offer insight into one level of functional regulation, gene co-expression networks have the potential to identify biological interactions and pathways that underpin oncogenesis and tumorigenicity. Thus, we examined gene co-expression networks for rhabdomyosarcoma that were FOXO1-PAX3 positive, FOXO1-PAX7 positive, or fusion negative. Gene co-expression networks were mined using local maximum Quasi-Clique Merger (lmQCM) and analyzed for co-expression differences among rhabdomyosarcoma subtypes. This analysis identified 41 co-expression modules that were shared between fusion negative and positive samples, of which 17/41 showed significant up- or down-regulation with respect to fusion status. Fusion positive and negative rhabdomyosarcoma showed differing modularity of co-expression networks, with fusion negative (n = 109) having significantly more individual modules than fusion positive (n = 53). Subsequent analysis of gene co-expression networks for PAX3 and PAX7 type fusions observed that 17/53 were differentially expressed between the two subtypes. Gene list enrichment analysis found that gene ontology terms were poorly matched with biological processes and molecular function for most co-expression modules identified in this study; however, co-expressed modules were frequently localized to cytobands on chromosomes 8 and 11.
Overall, we observed substantial restructuring of co-expression networks relative to fusion status and fusion type in rhabdomyosarcoma and identified previously overlooked genes and pathways that may be targeted in this pernicious disease.
Myofibroma is a benign pericytic tumour affecting young children. The presence of multicentric myofibromas defines infantile myofibromatosis (IMF), which is a life-threatening condition when associated with visceral involvement. The disease pathophysiology remains poorly characterized. In this study, we performed deep RNA sequencing on eight myofibroma samples, including two from patients with IMF. We identified five different in-frame gene fusions in six patients, including three previously described fusion transcripts, SRF-CITED1, SRF-ICA1L and MTCH2-FNBP4, and a fusion of unknown significance, FN1-TIMP1. We found a novel COL4A1-VEGFD gene fusion in two cases, one of which also carried a PDGFRB mutation. We observed a robust expression of VEGFD by immunofluorescence on the corresponding tumour sections. Finally, we showed that the COL4A1-VEGFD chimeric protein was processed to mature VEGFD growth factor by proteases, such as the FURIN proprotein convertase. In conclusion, our results unravel a new recurrent gene fusion that leads to VEGFD production under the control of the COL4A1 gene promoter in myofibroma. This fusion is highly reminiscent of the COL1A1-PDGFB oncogene associated with dermatofibrosarcoma protuberans. This work has implications for the diagnosis and, possibly, the treatment of a subset of myofibromas.
Gene fusions can play important roles in tumor initiation and progression. While fusion detection so far has been from bulk samples, full-length single-cell RNA sequencing (scRNA-seq) offers the possibility of detecting gene fusions at the single-cell level. However, scRNA-seq data have a high noise level and contain various technical artifacts that can lead to spurious fusion discoveries. Here, we present a computational tool, scFusion, for gene fusion detection based on scRNA-seq. We evaluate the performance of scFusion using simulated and five real scRNA-seq datasets and find that scFusion can efficiently and sensitively detect fusions with a low false discovery rate. In a T cell dataset, scFusion detects the invariant TCR gene recombinations in mucosal-associated invariant T cells that many methods developed for bulk data fail to detect; in a multiple myeloma dataset, scFusion detects the known recurrent fusion IgH-WHSC1, which is associated with overexpression of the WHSC1 oncogene. Our results demonstrate that scFusion can be used to investigate cellular heterogeneity of gene fusions and their transcriptional impact at the single-cell level.
Among the diverse sources of neoantigens (i.e. single-nucleotide variants (SNVs), insertions or deletions (Indels) and fusion genes), fusion gene-derived neoantigens are generally more immunogenic, have multiple targets per mutation and are more widely distributed across various cancer types. Therefore, fusion gene-derived neoantigens are a potential source of highly immunogenic neoantigens and hold great promise for cancer immunotherapy. However, the lack of fusion protein sequence resources and knowledge prevents this application. We introduce 'FusionNeoAntigen', a dedicated resource for fusion-specific neoantigens, accessible at https://compbio.uth.edu/FusionNeoAntigen. In this resource, we provide fusion gene breakpoint crossing neoantigens focused on ∼43K fusion proteins of ∼16K in-frame fusion genes from FusionGDB2.0. FusionNeoAntigen provides fusion gene information, corresponding fusion protein sequences, fusion breakpoint peptide sequences, fusion gene-derived neoantigen prediction, virtual screening between fusion breakpoint peptides having potential fusion neoantigens and human leucocyte antigens (HLAs), fusion breakpoint RNA/protein sequences for developing vaccines, information on samples with fusion-specific neoantigen, potential CAR-T targetable cell-surface fusion proteins and literature curation. FusionNeoAntigen will help to develop fusion gene-based immunotherapies. We will report all potential fusion-specific neoantigens from all possible open reading frames of ∼120K human fusion genes in future versions.
Genomic instability is a hallmark of cancer and, as such, structural alterations and fusion genes are common events in the cancer landscape. RNA sequencing (RNA-Seq) is a powerful method for profiling cancers, but current methods for identifying fusion genes are optimised for short reads. JAFFA (https://github.com/Oshlack/JAFFA/wiki) is a sensitive fusion detection method that outperforms other methods with reads of 100 bp or greater. JAFFA compares a cancer transcriptome to the reference transcriptome, rather than the genome, where the cancer transcriptome is inferred using long reads directly or by de novo assembling short reads.
The Drosophila trachea is a premier genetic system to investigate the fundamental mechanisms of tubular organ formation. Tracheal fusion cells lead the branch fusion process to form an interconnected tubular network. Therefore, fusion cells in the Drosophila trachea will be an excellent model to study branch fusion in mammalian tubular organs, such as kidneys and blood vessels. The fusion process is a dynamic cellular process involving cell migration, adhesion, vesicle trafficking, cytoskeleton rearrangement, and membrane fusion. To understand how these cellular events are coordinated, we took the critical step of assembling a gene expression profile of fusion cells. For this study, we analyzed the expression of 234 potential tracheal-expressed genes in fusion cells during fusion cell development. A total of 143 tracheal genes were found to encode transcription factors, signal proteins, cytoskeleton and matrix proteins, transporters, and proteins with unknown function. These genes were divided into four subgroups based on their levels of expression in fusion cells compared to neighboring non-fusion cells, as revealed by in situ hybridization: (1) genes that have relatively high abundance in fusion cells, (2) genes that are dynamically expressed in fusion cells, (3) genes that have relatively low abundance in fusion cells, and (4) genes that are expressed at similar levels in fusion cells and non-fusion tracheal cells. This study identifies the expression profile of fusion cells and suggests genes that may be necessary for the fusion process and that may play roles in distinct stages of fusion, as indicated by the location and timing of expression. These data will provide the basis for a comprehensive understanding of the molecular and cellular mechanisms of branch fusion.
We present the functional characterization of a pseudogene associated recurrent gene fusion in prostate cancer. The fusion gene KLK4-KLKP1 is formed by the fusion of the protein coding gene KLK4 with the noncoding pseudogene KLKP1. Screening of a cohort of 659 patients (380 Caucasian American; 250 African American, and 29 patients from other races) revealed that the KLK4-KLKP1 is expressed in about 32% of prostate cancer patients. Correlative analysis with other ETS gene fusions and SPINK1 revealed a concomitant expression pattern of KLK4-KLKP1 with ERG and a mutually exclusive expression pattern with SPINK1, ETV1, ETV4, and ETV5. Development of an antibody specific to KLK4-KLKP1 fusion protein confirmed the expression of the full-length KLK4-KLKP1 protein in prostate tissues. The in vitro and in vivo functional assays to study the oncogenic properties of KLK4-KLKP1 confirmed its role in cell proliferation, cell invasion, intravasation, and tumor formation. Presence of strong ERG and AR binding sites located at the fusion junction in KLK4-KLKP1 suggests that the fusion gene is regulated by ERG and AR. Correlative analysis of clinical data showed an association of KLK4-KLKP1 with lower preoperative PSA values and in young men (<50 years) with prostate cancer. Screening of patient urine samples showed that KLK4-KLKP1 can be detected noninvasively in urine. Taken together, we present KLK4-KLKP1 as a class of pseudogene associated fusion transcript in cancer with potential applications as a biomarker for routine screening of prostate cancer.
We report the first case of a primary renal undifferentiated sarcoma harboring an SS18::POU5F1 gene fusion. The patient was a 38 year-old male diagnosed with a 5 cm renal tumor which invaded the adrenal gland and extended into the renal vein. Microscopically, the neoplasm had a predominantly undifferentiated round cell morphology, with areas of rhabdoid and spindle cell growth. Similar to the previously reported cases with this fusion, by immunohistochemistry the neoplasm expressed S100 protein and epithelial markers (diffuse EMA, focal cytokeratin), suggesting the possibility of a myoepithelial phenotype. This report documents another example of a fusion-positive undifferentiated soft tissue sarcoma occurring as a primary renal neoplasm, adding to the already broad list of such entities. It highlights the crucial role of molecular analysis in establishing a specific diagnosis given the overlapping morphology and immunophenotypes such entities may exhibit.
Fusion proteins have unique oncogenic properties, and identifying them can provide diagnostic markers or therapeutic targets. Next generation sequencing data have previously shown a fusion gene formed between the Rad51C and ATXN7 genes in the MCF7 breast cancer cell line. However, whether this fusion gene exists in colorectal patient tumor tissues remains largely unknown.
Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to several seed genes. We tested Collage by prioritizing bacterial response genes in Dictyostelium as a novel model system for prokaryote-eukaryote interactions. Using 4 seed genes and 14 data sets, only one of which was directly related to the bacterial response, Collage proposed 8 candidate genes that were readily validated as necessary for the response of Dictyostelium to Gram-negative bacteria. These findings establish Collage as a method for inferring biological knowledge from the integration of heterogeneous and coarsely related data sets.
Cytology samples are suitable for the study of genotypic and phenotypic changes observed in different tumors. Because it is minimally invasive, cytology sampling has been used as an acceptable alternative for tracking the alterations associated with tumor progression. Although the detection of gene mutations is well established on cytology, in the last few years gene fusion detection has become mandatory, especially in some tumor types such as lung cancer. Different technologies are available, such as immunocytochemistry, fluorescence in situ hybridization, reverse transcription-polymerase chain reaction, and massively parallel sequencing approaches. Considering that many new drugs target fusion proteins, cytological samples can be of use to detect gene fusions in patients with solid and lymphoproliferative tumors. In this article, we review several techniques used to detect gene fusions in cytological material.
Gene fusions are known to play critical roles in tumor pathogenesis. Yet, sensitive and specific algorithms to detect gene fusions in cancer do not currently exist. In this paper, we present a new statistical algorithm, MACHETE (Mismatched Alignment CHimEra Tracking Engine), which achieves highly sensitive and specific detection of gene fusions from RNA-Seq data, including the highest Positive Predictive Value (PPV) compared to the current state-of-the-art, as assessed in simulated data. We show that the best performing published algorithms either find large numbers of fusions in negative control data or suffer from low sensitivity detecting known driving fusions in gold standard settings, such as EWSR1-FLI1. As proof of principle that MACHETE discovers novel gene fusions with high accuracy in vivo, we mined public data to discover and subsequently PCR validate novel gene fusions missed by other algorithms in the ovarian cancer cell line OVCAR3. These results highlight the gains in accuracy achieved by introducing statistical models into fusion detection, and pave the way for unbiased discovery of potentially driving and druggable gene fusions in primary tumors.
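For readers unfamiliar with the metrics named in this abstract, the sketch below shows how positive predictive value (PPV) and sensitivity are conventionally computed from counts of true positives, false positives, and false negatives. The function names and the example counts are illustrative assumptions and are not results from the MACHETE paper.

```python
def positive_predictive_value(true_pos, false_pos):
    """PPV: fraction of reported fusions that are real."""
    called = true_pos + false_pos
    if called == 0:
        raise ValueError("no fusions were called")
    return true_pos / called

def sensitivity(true_pos, false_neg):
    """Sensitivity: fraction of real fusions that were reported."""
    real = true_pos + false_neg
    if real == 0:
        raise ValueError("no real fusions in the truth set")
    return true_pos / real

# Hypothetical simulated benchmark: 9 true fusions found, 1 spurious call,
# and 3 true fusions missed.
ppv = positive_predictive_value(true_pos=9, false_pos=1)
sens = sensitivity(true_pos=9, false_neg=3)
```

A caller that reports many fusions in negative control data has a low PPV, while one that misses known driver fusions has low sensitivity; the abstract argues that existing tools tend to fail on one of the two.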
Background: Fusion genes play an important role in the tumorigenesis of many cancers. Next-generation sequencing (NGS) technologies have been successfully applied in fusion gene detection for the last several years, and a number of NGS-based tools have been developed for identifying fusion genes during this period. Most fusion gene detection tools based on RNA-seq data report a large number of candidates (mostly false positives), making it hard to prioritize candidates for experimental validation and further analysis. Selection of reliable fusion genes for downstream analysis becomes very important in cancer research. We therefore developed confFuse, a scoring algorithm to reliably select high-confidence fusion genes which are likely to be biologically relevant. Results: confFuse takes multiple parameters into account in order to assign each fusion candidate a confidence score, of which score ≥8 indicates high-confidence fusion gene predictions. These parameters were manually curated based on our experience and on certain structural motifs of fusion genes. Compared with alternative tools, based on 96 published RNA-seq samples from different tumor entities, our method can significantly reduce the number of fusion candidates (301 high-confidence from 8,083 total predicted fusion genes) and keep high detection accuracy (recovery rate 85.7%). Validation of 18 novel, high-confidence fusions detected in three breast tumor samples resulted in a 100% validation rate. Conclusions: confFuse is a novel downstream filtering method that allows selection of highly reliable fusion gene candidates for further downstream analysis and experimental validations. confFuse is available at https://github.com/Zhiqin-HUANG/confFuse.
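The filtering step described in this abstract can be sketched as a simple threshold on a per-candidate score. Only the cutoff of 8 and the counts 301 and 8,083 come from the abstract; the record layout, the function names, and the placeholder gene names are hypothetical, and the sketch does not reproduce how confFuse actually computes its scores.

```python
HIGH_CONFIDENCE_THRESHOLD = 8  # cutoff stated in the abstract (score >= 8)

def select_high_confidence(candidates, threshold=HIGH_CONFIDENCE_THRESHOLD):
    """Keep candidates whose confidence score meets the threshold.

    candidates: list of dicts with a 'score' key (assumed layout).
    """
    return [c for c in candidates if c["score"] >= threshold]

def retained_fraction(n_selected, n_total):
    """Fraction of predicted fusions that survive filtering."""
    return n_selected / n_total

# Placeholder candidates; gene names and scores are invented for illustration.
candidates = [
    {"fusion": "GENE_A--GENE_B", "score": 10},
    {"fusion": "GENE_C--GENE_D", "score": 8},
    {"fusion": "GENE_E--GENE_F", "score": 5},
]
high_confidence = select_high_confidence(candidates)

# Counts reported in the abstract: 301 high-confidence of 8,083 predicted.
fraction_kept = retained_fraction(301, 8083)
```

With the reported counts, roughly 3.7% of the predicted fusion genes are retained as high-confidence candidates.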
Zinc finger nucleases (ZFNs) consist of zinc fingers as DNA-binding module and the non-specific DNA-cleavage domain of the restriction endonuclease FokI as DNA-cleavage module. This architecture is also used by TALE nucleases (TALENs), in which the DNA-binding modules of the ZFNs have been replaced by DNA-binding domains based on transcription activator like effector (TALE) proteins. Both TALENs and ZFNs are programmable nucleases which rely on the dimerization of FokI to induce double-strand DNA cleavage at the target site after recognition of the target DNA by the respective DNA-binding module. TALENs seem to have an advantage over ZFNs, as the assembly of TALE proteins is easier than that of ZFNs. Here, we present evidence that variant TALENs can be produced by replacing the catalytic domain of FokI with the restriction endonuclease PvuII. These fusion proteins recognize only the composite recognition site consisting of the target site of the TALE protein and the PvuII recognition sequence (addressed site), but not isolated TALE or PvuII recognition sites (unaddressed sites), even at high excess of protein over DNA and long incubation times. In vitro, their preference for an addressed over an unaddressed site is > 34,000-fold. Moreover, TALE-PvuII fusion proteins are active in cellula with minimal cytotoxicity.
Despite the availability of numerous gene fusion systems, recombinant protein expression in Escherichia coli remains difficult. Establishing the best fusion partner for difficult-to-express proteins remains empirical. To determine which fusion tags are best suited for difficult-to-express proteins, a comparative analysis of the newly described SUMO fusion system with a variety of commonly used fusion systems was completed. For this study, three model proteins, enhanced green fluorescent protein (eGFP), matrix metalloprotease-13 (MMP13), and myostatin (growth differentiating factor-8, GDF8), were fused to the C termini of maltose-binding protein (MBP), glutathione S-transferase (GST), thioredoxin (TRX), NUS A, ubiquitin (Ub), and SUMO tags. These constructs were expressed in E. coli and evaluated for expression and solubility. As expected, the fusion tags varied in their ability to produce tractable quantities of soluble eGFP, MMP13, and GDF8. SUMO and NUS A fusions enhanced expression and solubility of recombinant proteins most dramatically. The ease with which SUMO and NUS A fusion tags were removed from their partner proteins was then determined. SUMO fusions are cleaved by the natural SUMO protease, while an AcTEV protease site had to be engineered between NUS A and its partner protein. A kinetic analysis showed that the SUMO and AcTEV proteases had similar KM values, but SUMO protease had a 25-fold higher kcat than AcTEV protease, indicating a more catalytically efficient enzyme. Taken together, these results demonstrate that SUMO is superior to commonly used fusion tags in enhancing expression and solubility with the distinction of generating recombinant protein with native sequences.
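The kinetic comparison in this abstract rests on catalytic efficiency, conventionally defined as kcat/KM. The sketch below works through that arithmetic, assuming equal KM values as a stand-in for the "similar" values reported. The numbers are arbitrary placeholders in arbitrary units, since the abstract gives only the 25-fold kcat ratio, and the function names are hypothetical.

```python
def catalytic_efficiency(kcat, km):
    """Catalytic efficiency kcat/KM (units depend on the inputs)."""
    if km <= 0:
        raise ValueError("KM must be positive")
    return kcat / km

def efficiency_ratio(kcat_a, km_a, kcat_b, km_b):
    """How many times more efficient enzyme A is than enzyme B."""
    return catalytic_efficiency(kcat_a, km_a) / catalytic_efficiency(kcat_b, km_b)

# Placeholder values: equal KM (assumed) and a 25-fold kcat difference
# (the ratio reported in the abstract). Absolute values are not from the paper.
ratio = efficiency_ratio(kcat_a=25.0, km_a=1.0, kcat_b=1.0, km_b=1.0)
```

When the KM values cancel, the efficiency ratio reduces to the kcat ratio, which is why a 25-fold higher kcat at similar KM implies a roughly 25-fold more efficient protease.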
Despite the increasing quantity of tools for accurately predicting gene fusion candidates from sequencing data, we are still faced with the critical challenge of visualizing the corresponding gene fusion products to infer their biological consequence (i.e. novel protein and increased gene expression). This is currently accomplished by manually inspecting and inferring the biological consequence of top scoring gene fusion candidates. This labor-intensive process could be made easier by automating the annotation of gene fusion products and generating easily interpretable visualizations. We developed a gene fusion visualization tool, called INTEGRATE-Vis, that generates comprehensive, highly customizable, publication-quality graphics focused on annotating each gene fusion at the transcript- and protein-level and assessing expression within an individual sample or across a patient cohort. INTEGRATE-Vis is the first comprehensive gene fusion visualization tool to help a user infer the potential consequence of a gene fusion event. It has potential utility in both research and clinical settings. INTEGRATE-Vis is available at https://github.com/ChrisMaherLab/INTEGRATE-Vis .