Tandem repeats (TRs) represent one of the most prevalent features of genomic sequences. Due to their abundance and functional significance, a plethora of detection tools has been devised over the last two decades. Despite the longstanding interest, TR detection is still not resolved. Our large-scale tests reveal that current detectors produce different, often nonoverlapping inferences, reflecting characteristics of the underlying algorithms rather than the true distribution of TRs in genomic data. Our simulations show that the power of detecting TRs depends on the degree of their divergence and on repeat characteristics such as the length of the minimal repeat unit and their number in tandem. To reconcile the diverse predictions of current algorithms, we propose and evaluate several statistical criteria for measuring the quality of predicted repeat units. In particular, we propose a model-based phylogenetic classifier, entailing a maximum-likelihood estimation of the repeat divergence. Applied in conjunction with the state-of-the-art detectors, our statistical classification scheme for inferred repeats allows false-positive predictions to be filtered out. Since different algorithms appear to specialize in predicting TRs with certain properties, we advise applying multiple detectors with subsequent filtering to obtain the most complete set of genuine repeats.
Hop (Humulus lupulus L.) is known for its use as a bittering agent in beer and has a rich history of cultivation, beginning in Europe and now spanning the globe. There are five wild varieties worldwide, which may have been introgressed with cultivated varieties. As a dioecious species, its obligate outcrossing, non-Mendelian inheritance, and genomic structural variability have confounded directed breeding efforts. Consequently, understanding the hop genome represents a considerable challenge, requiring additional resources. In order to facilitate investigations into the transmission genetics of hop, we report here a tandem repeat discovery pipeline developed using k-mer filtering and dot plot analysis of PacBio long-read sequences from the hop cultivar Apollo. From this we identified 17 new and distinct tandem repeat sequence families, which represent candidates for FISH probe development. For two of these candidates, HuluTR120 and HuluTR225, we produced oligonucleotide FISH probes from conserved regions and demonstrated their utility by staining meiotic chromosomes from wild hop, var. neomexicanus, to address, for example, questions about hop transmission genetics. Collectively, these tandem repeat sequence families represent new resources suitable for development of additional cytogenomic tools for hop research.
Tandem repeat (TR) expansion is the underlying cause of over 40 neurological disorders. Long-read sequencing offers an exciting avenue over conventional technologies for detecting TR expansions. Here, we present Straglr, a robust software tool for both targeted genotyping and novel expansion detection from long-read alignments. We benchmark Straglr using various simulations, targeted genotyping data of cell lines carrying expansions of known diseases, and whole genome sequencing data with chromosome-scale assembly. Our results suggest that Straglr may be useful for investigating disease-associated TR expansions using long-read sequencing.
The availability of the genome sequence of the unisexual (male-female) Caenorhabditis nigoni offers an opportunity to compare its non-coding features with those of the related hermaphroditic species Caenorhabditis briggsae and to understand the evolutionary dynamics of their tandem repeat sequences (satellites) as a result of evolution from the unisexual ancestor. We take advantage of the previously developed SATFIND program to build satellite families defined by a consensus sequence. The relative number of satellites (satellites/Mb) in C. nigoni is 24.6% larger than in C. briggsae. Some satellites in C. nigoni have developed from a proto-repeat present in the ancestor species and are conserved as an isolated sequence in C. briggsae. We also identify unique satellites which occur only once and joint satellite families with a related sequence in both species. Some of these families are only found in C. nigoni, which indicates a recent appearance; they contain conserved adjacent 5' and 3' regions, which may favor transposition. Our results show that the number, length and turnover of satellites are restricted in the hermaphrodite C. briggsae when compared with the unisexual C. nigoni. We hypothesize that this results from differences in unequal recombination during meiotic chromosome pairing, which limits satellite turnover in hermaphrodites.
Animal models of bone marrow transplantation (BMT) allow evaluation of new experimental treatment strategies. One potential strategy involves the treatment of donor marrow with ultra-violet B light to allow transplantation across histocompatibility boundaries without an increase in graft rejection or graft-versus-host disease. A major requirement for a new experimental protocol, particularly if it involves manipulation of the donor marrow, is that the manipulated marrow gives rise to long-term multilineage engraftment. DNA based methodologies are now routinely used by many centres to evaluate engraftment and degree of chimaerism post-BMT in humans. We report the adaptation of this methodology to the serial study of engraftment in rodents. Conditions have been defined which allow analysis of serial tail vein samples using PCR of short tandem repeat sequences (STR-PCR). These markers have been used to evaluate the contribution of ultraviolet B treated marrow to engraftment following BMT in rodents without compromising the health of the animals under study. Chimaerism data from sequential tail vein samples and bone marrow from selected sacrificed animals showed excellent correlation, thus confirming the validity of this approach in analysing haemopoietic tissue. Thus the use of this assay may facilitate experimental studies in animal BMT.
We assumed that targeted next-generation sequencing (NGS) of mismatch repair-associated genes could improve the detection of driving mutations in colorectal cancers (CRC) with microsatellite instability (MSI) and elevated microsatellite alterations at selected tetranucleotide repeats (EMAST) and clarify the somatic mutation patterns of CRC subtypes.
The advent of long-read DNA sequencing is allowing complete assembly of highly repetitive genomic regions for the first time, including the megabase-scale satellite repeat arrays found in many eukaryotic centromeres. The assembly of such repetitive regions creates a need for their de novo annotation, including patterns of higher order repetition. To annotate tandem repeats, methods are required that can be widely applied to diverse genome sequences, without prior knowledge of monomer sequences.
Tandemly repeated DNA is highly mutable and causes at least 31 diseases, but it is hard to detect pathogenic repeat expansions genome-wide. Here, we report robust detection of human repeat expansions from careful alignments of long but error-prone (PacBio and nanopore) reads to a reference genome. Our method is robust to systematic sequencing errors, inexact repeats with fuzzy boundaries, and low sequencing coverage. By comparing to healthy controls, we prioritize pathogenic expansions within the top 10 out of 700,000 tandem repeats in whole genome sequencing data. This may help to elucidate the many genetic diseases whose causes remain unknown.
Strain discrimination within genetically highly similar bacteria is critical for epidemiological studies and forensic applications. An electrochemically driven melting curve analysis monitored by SERS has been utilised to reliably discriminate strains of the bacterial pathogen Yersinia pestis, the causative agent of plague. DNA amplicons containing Variable Number Tandem Repeats (VNTRs) were generated from three strains of Y. pestis: CO92, Harbin 35 and Kim. These amplicons contained a 10 base pair VNTR repeated 6, 5, and 4 times in CO92, Harbin 35 and Kim, respectively. The assay also included a blocker oligonucleotide comprising 3 repeats of the 10-mer VNTR sequence. The use of the blocker reduced the effective length of the target sequence available to bind to the surface-bound probe and significantly improved the sensitivity of the discrimination. The results were consistent across three replicates that were carried out on different days, using different batches of PCR product and different SERS sphere segment void (SSV) substrates. This methodology, which combines low cost, speed and sensitivity, is a promising alternative to the time-consuming current electrophoretic methods.
Variable number tandem repeats (VNTRs) are composed of consecutive repetitive DNA with hypervariable repeat count and composition. They include protein-coding sequences and associations with clinical disorders. It has been difficult to incorporate VNTR analysis in disease studies that use short-read sequencing because the traditional approach of mapping to the human reference is less effective for repetitive and divergent sequences. In this work, we solve VNTR mapping for short reads with a repeat-pangenome graph (RPGG), a data structure that encodes both the population diversity and repeat structure of VNTR loci from multiple haplotype-resolved assemblies. We develop software to build an RPGG, and use the RPGG to estimate VNTR composition with short reads. We use this to discover VNTRs with length stratified by continental population, and expression quantitative trait loci, indicating that RPGG analysis of VNTRs will be critical for future studies of diversity and disease.
Microsatellite mining is a common outcome of the in silico approach to genomic studies. The resulting short tandemly repeated DNA could be used as molecular markers for studying polymorphism, genotyping and forensics. The omni short tandem repeat finder and primer designer (OSTRFPD) is among the few versatile, platform-independent open-source tools written in Python that enables researchers to identify and analyse genome-wide short tandem repeats in both nucleic acids and protein sequences. OSTRFPD is designed to run either in a user-friendly fully featured graphical interface or in a command line interface mode for advanced users. OSTRFPD can detect both perfect and imperfect repeats of low complexity with customisable scores. Moreover, the software has built-in architecture to simultaneously filter selection of flanking regions in DNA and generate microsatellite-targeted primers implementing the Primer3 platform. The software has built-in motif-sequence generator engines and an additional option to use the dictionary mode for custom motif searches. The software generates search results including general statistics containing motif categorisation, repeat frequencies, densities, coverage, guanine-cytosine (GC) content, and simple text-based imperfect alignment visualisation. Thus, OSTRFPD presents users with a quick single-step solution package to assist development of microsatellite markers and categorise tandemly repeated amino acids in proteome databases. Practical implementation of OSTRFPD was demonstrated using publicly available whole-genome sequences of selected Plasmodium species. OSTRFPD is freely available and open-sourced for improvement and user-specific adaptation.
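As a rough illustration of what detecting perfect short tandem repeats involves, the following minimal Python sketch scans a DNA string for runs of an exactly repeated motif. It is not OSTRFPD's algorithm or interface: the function names, the default thresholds (`max_unit`, `min_copies`), and the example sequence are assumptions chosen for illustration, and imperfect repeats, scoring, and primer design are not covered.

```python
def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif (e.g. 'AT', not 'ATAT')."""
    return unit not in (unit + unit)[1:-1]


def find_perfect_strs(seq, max_unit=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper: motif lengths 1..max_unit are scanned independently,
    and a run is reported when the motif occurs at least min_copies times in a row.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k * min_copies <= len(seq):
            unit = seq[i:i + k]
            if not is_primitive(unit):
                i += 1
                continue
            # Count how many consecutive copies of the motif follow position i.
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                hits.append((i, unit, copies))
                i += copies * k  # jump past the reported run
            else:
                i += 1
    return sorted(hits)


if __name__ == "__main__":
    for start, motif, copies in find_perfect_strs("GGCACACACATTTTTG"):
        print(start, motif, copies)
```

Skipping non-primitive motifs keeps a run such as `TTTTT` from being reported again under longer motifs like `TT`; a real tool would also need to handle mismatches and ambiguous bases.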
Knowing the three-dimensional (3D) structure of the chromatin is important for obtaining a complete picture of the regulatory landscape. Changes in the 3D structure have been implicated in diseases. While there exist approaches that attempt to predict the long-range chromatin interactions, they focus only on interactions between specific genomic regions - the promoters and enhancers, neglecting other possibilities, for instance, the so-called structural interactions involving intervening chromatin.
Progressive multifocal leukoencephalopathy (PML) is an often fatal demyelinating disease caused by lytic infection of oligodendrocytes with JC virus (JCV). The development of PML in non-immunosuppressed individuals is a growing concern with reports of mortality in patients treated with mAb therapies. JCV can persist in the kidneys, lymphoid tissue and bone marrow. JCV gene expression is restricted by non-coding viral regulatory region sequence variation and cellular transcription factors. Because JCV latency has been associated with cells undergoing haematopoietic development, transcription factors previously reported as lymphoid specific may regulate JCV gene expression. This study demonstrates that one such transcription factor, Spi-B, binds to sequences present in the JCV promoter/enhancer and may affect early virus gene expression in cells obtained from human brain tissue. We identified four potential Spi-B-binding sites present in the promoter/enhancer elements of JCV sequences from PML variants and the non-pathogenic archetype. Spi-B sites present in the promoter/enhancers of PML variants alone bound protein expressed in JCV susceptible brain and lymphoid-derived cell lines by electromobility shift assays. Expression of exogenous Spi-B in semi- and non-permissive cells increased early viral gene expression. Strikingly, mutation of the Spi-B core in a binding site unique to the Mad-4 variant was sufficient to abrogate viral activity in progenitor-derived astrocytes. These results suggest that Spi-B could regulate JCV gene expression in susceptible cells, and may play an important role in JCV activity in the immune and nervous systems.
De novo protein design methods can create proteins with folds not yet seen in nature. These methods largely focus on optimizing the compatibility between the designed sequence and the intended conformation, without explicit consideration of protein folding pathways. Deeply knotted proteins, whose topologies may introduce substantial barriers to folding, thus represent an interesting test case for protein design. Here we report our attempts to design proteins with trefoil (3₁) and pentafoil (5₁) knotted topologies. We extended previously described algorithms for tandem repeat protein design in order to construct deeply knotted backbones and matching designed repeat sequences (N = 3 repeats for the trefoil and N = 5 for the pentafoil). We confirmed the intended conformation for the trefoil design by X-ray crystallography, and we report here on this protein's structure, stability, and folding behaviour. The pentafoil design misfolded into an asymmetric structure (despite a 5-fold symmetric sequence); two of the four repeat-repeat units matched the designed backbone while the other two diverged to form local contacts, leading to a trefoil rather than pentafoil knotted topology. Our results also provide insights into the folding of knotted proteins.
Two distinct classes of repetitive sequences, interspersed mobile elements and satellite DNAs, shape eukaryotic genomes and drive their evolution. Short arrays of tandem repeats can also be present within nonautonomous miniature inverted repeat transposable elements (MITEs). In the clam Donax trunculus, we characterized a composite, high copy number MITE, named DTC84. It is composed of a central region built of up to five core repeats linked to a microsatellite segment at one array end and flanked by sequences holding short inverted repeats. The modular composition and the conserved putative target site duplication sequence AA at the element termini are equivalent to the composition of several elements found in the cupped oyster Crassostrea virginica and in some insects. A unique feature of the D. trunculus element is an ordered array of core repeat variants, distinguished by diagnostic changes. The position of variants in the array is fixed, regardless of alterations in the core repeat copy number. Each repeat harbors a palindrome near the junction with the following unit, a potential hotspot responsible for array length variations. As a consequence, variations in the number of tandem repeats and variations in flanking sequences make every sequenced element unique. Core repeats may thus be considered as individual units within the MITE, with flanking sequences representing a "cassette" for internal repeats. Our results demonstrate that the onset and spread of tandem repeats can be more intimately linked to processes of transposition than previously thought and suggest that genomes are shaped by interplays within a complex network of repetitive sequences.
Tandem-repeat protein domains, composed of repeated units of conserved stretches of 20-40 amino acids, are required for a wide array of biological functions. Despite their diverse and fundamental functions, there has been no comprehensive assessment of their taxonomic distribution, incidence, and associations with organismal lifestyle and phylogeny. In this study, we assess for the first time the abundance of armadillo (ARM) and tetratricopeptide (TPR) repeat domains across all three domains in the tree of life and compare the results to our previous analysis on ankyrin (ANK) repeat domains in this journal. All eukaryotes and a majority of the bacterial and archaeal genomes analyzed have a minimum of one TPR and ARM repeat. In eukaryotes, the fraction of ARM-containing proteins is approximately double that of TPR- and ANK-containing proteins, whereas bacteria and archaea are enriched in TPR-containing proteins relative to ARM- and ANK-containing proteins. We show in bacteria that phylogenetic history, rather than lifestyle or pathogenicity, is a predictor of TPR repeat domain abundance, while neither phylogenetic history nor lifestyle predicts ARM repeat domain abundance. Surprisingly, pathogenic bacteria were not enriched in TPR-containing proteins, which have been associated with virulence factors in certain species. Taken together, this comparative analysis provides a newly appreciated view of the prevalence and diversity of multiple types of tandem-repeat protein domains across the tree of life. A central finding of this analysis is that tandem repeat domain-containing proteins are prevalent not just in eukaryotes, but also in bacterial and archaeal species.
Tandem repeat expansions (TREs) can cause neurological diseases but their impact in schizophrenia is unclear. Here we analyzed genome sequences of adults with schizophrenia and found that they have a higher burden of TREs that are near exons and rare in the general population, compared with non-psychiatric controls. These TREs are disproportionately found at loci known to be associated with schizophrenia from genome-wide association studies, in individuals with clinically-relevant genetic variants at other schizophrenia loci, and in families where multiple individuals have schizophrenia. We showed that rare TREs in schizophrenia may impact synaptic functions by disrupting the splicing process of their associated genes in a loss-of-function manner. Our findings support the involvement of genome-wide rare TREs in the polygenic nature of schizophrenia.
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3,550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
Short tandem repeats (STRs) have a wide range of applications, including medical genetics, forensics, and genetic genealogy. High-throughput sequencing (HTS) has the potential to profile hundreds of thousands of STR loci. However, mainstream bioinformatics pipelines are inadequate for the task. These pipelines treat STR mapping as gapped alignment, which results in cumbersome processing times and a biased sampling of STR alleles. Here, we present lobSTR, a novel method for profiling STRs in personal genomes. lobSTR harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address the specific noise patterns in STR calling. The speed and reliability of lobSTR exceed the performance of current mainstream algorithms for STR profiling. We validated lobSTR's accuracy by measuring its consistency in calling STRs from whole-genome sequencing of two biological replicates from the same individual, by tracing Mendelian inheritance patterns in STR alleles in whole-genome sequencing of a HapMap trio, and by comparing lobSTR results to traditional molecular techniques. Encouraged by the speed and accuracy of lobSTR, we used the algorithm to conduct a comprehensive survey of STR variations in a deeply sequenced personal genome. We traced the mutation dynamics of close to 100,000 STR loci and observed more than 50,000 STR variations in a single genome. lobSTR's implementation is an end-to-end solution. The package accepts raw sequencing reads and provides the user with the genotyping results. It is written in C/C++, includes multi-threading capabilities, and is compatible with the BAM format.
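The trio-based validation mentioned above rests on a simple rule: at an autosomal locus, and absent a new mutation or genotyping error, a child carries one allele from each parent. The sketch below shows that check in Python under stated assumptions. It is not lobSTR's code; the function name and the representation of a genotype as a pair of repeat counts are hypothetical.

```python
def is_mendelian_consistent(child, mother, father):
    """Check one autosomal STR locus in a trio.

    Each genotype is a pair of allele repeat counts, e.g. (12, 14).
    Assumes diploid calls and no de novo mutation (illustrative simplification).
    """
    a, b = child
    # One child allele must come from the mother and the other from the father.
    return (a in mother and b in father) or (b in mother and a in father)


if __name__ == "__main__":
    print(is_mendelian_consistent((12, 14), (12, 13), (14, 15)))
    print(is_mendelian_consistent((11, 14), (12, 13), (14, 15)))
```

The fraction of loci passing such a check across a trio gives a rough consistency measure, although apparent violations can reflect genuine STR mutations as well as calling errors.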