This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.
With the exponential growth of biological sequence data (DNA or protein sequences), DNA sequence analysis has become an essential task for biologists to understand the features, functions, structures, and evolution of species. Encoding DNA sequences is an effective method to extract the features from DNA sequences. It is commonly used for visualizing DNA sequences and analyzing similarities/dissimilarities between different species or cells. Although there have been many encoding approaches proposed for DNA sequence analysis, we require more elegant approaches for higher accuracy. In this paper, we propose a novel encoding approach for measuring the degree of similarity/dissimilarity between different species. Our approach can preserve the physicochemical properties, positional information, and the codon usage bias of nucleotides. An extensive performance study shows that our approach provides higher accuracy than existing approaches in terms of the degree of similarity.
Identification and clustering of orthologous genes play an important role in developing evolutionary models such as validating convergent and divergent phylogeny and predicting functional proteins in newly sequenced species of unverified nucleotide protein mappings. Here, we introduce an application of subspace clustering as applied to orthologous gene sequences and discuss the initial results. The working hypothesis is based upon the concept that genetic changes between nucleotide sequences coding for proteins among selected species and groups may lie within a union of subspaces for clusters of the orthologous groups. Estimates for the subspace dimensions were computed for a small population sample. A series of experiments was performed to cluster randomly selected sequences. The experimental design allows for both false positives and false negatives, and estimates for the statistical significance are provided. The clustering results are consistent with the main hypothesis. A simple random mutation binary tree model is used to simulate speciation events that show the interdependence of the subspace rank versus time and mutation rates. The simple mutation model is found to be largely consistent with the observed subspace clustering singular value results. Our study indicates that the subspace clustering method may be applied in orthology analysis.
Following the miniaturization of integrated circuitry and other computer hardware over the past several decades, DNA sequencing is on a similar path. Leading this trend is the Oxford Nanopore sequencing platform, which currently offers the hand-held MinION instrument and even smaller instruments on the horizon. This technology has been used in several important applications, including the analysis of genomes of major pathogens in remote stations around the world. However, despite the simplicity of the sequencer, an equally simple and portable analysis platform is not yet available.
Delineation of underlying genomic and genetic factors in a specific disease may be valuable in establishing a definitive diagnosis and may guide patient management and counseling. In addition, genetic information may be useful in identification of at-risk family members. Gene mapping and initial genome sequencing data enabled the development of microarrays to analyze genomic variants. The goal of this review is to consider different generations of sequencing techniques and their application to exome sequencing and whole genome sequencing and their clinical applications. In recent decades, exome sequencing has primarily been used in patient studies. Important measures that have been developed to standardize variant calling and to assess pathogenicity of variants are discussed in some detail. Examples of cases where exome sequencing has facilitated diagnosis and led to improved medical management are presented. Whole genome sequencing and its clinical relevance are presented, particularly in the context of analysis of nucleotide and structural genomic variants in large population studies and in certain patient cohorts. Applications involving analysis of cell-free DNA in maternal blood for prenatal diagnosis of specific autosomal trisomies are reviewed. Applications of DNA sequencing to diagnosis and therapeutics of cancer are presented. Also discussed are important recent diagnostic applications of DNA sequencing in cancer, including analysis of tumor-derived cell-free DNA and exosomes that are present in body fluids. Insights gained into underlying pathogenetic mechanisms of certain complex common diseases, including schizophrenia, macular degeneration, and neurodegenerative disease, are presented. The relevance of different types of variants (rare, uncommon, and common) to disease pathogenesis, and the continuum of causality, are addressed.
Pharmacogenetic variants detected by DNA sequence analysis are gaining in importance and are particularly relevant to personalized and precision medicine.
Prestin, encoded by the gene SLC26A5, is a transmembrane protein of the cochlear outer hair cell (OHC). Prestin is required for the somatic electromotile activity of OHCs; in mice lacking prestin, this activity is absent and severe hearing impairment results. In humans, the role of sequence variations in SLC26A5 in hearing loss is less clear. Although prestin is expected to be required for functional human OHCs, the clinical significance of reported putative mutant alleles in humans is uncertain.
In a general computational context for biomedical data analysis, DNA sequence classification is a crucial challenge. Several machine learning techniques have been used successfully to complete this task in recent years. Identification and classification of viruses are essential to avoid an outbreak like COVID-19; they also help in detecting the effects of viruses and in drug design. Regardless, the feature selection process remains the most challenging aspect of the issue: sequences lack explicit features, and the most commonly used representations worsen the problem of high dimensionality. Recently, deep learning (DL) models have become able to extract features from the input automatically. In this work, we employed CNN, CNN-LSTM, and CNN-Bidirectional LSTM architectures using Label and K-mer encoding for DNA sequence classification. The models are evaluated on different classification metrics. From the experimental results, the CNN and CNN-Bidirectional LSTM with K-mer encoding offer high accuracy, with 93.16% and 93.13%, respectively, on testing data.
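The two input encodings named in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact scheme, so the integer assignment, the reserved index 0 for unknown symbols, the stride of 1, and the function names `label_encode`, `kmer_tokens`, and `kmer_encode` are all assumptions made for illustration.

```python
from itertools import product


def label_encode(seq, alphabet="ACGT"):
    """Map each nucleotide to an integer label (assumed: A=1, C=2, G=3, T=4).

    Symbols outside the alphabet (for example N) are mapped to 0.
    """
    table = {base: i + 1 for i, base in enumerate(alphabet)}
    return [table.get(base, 0) for base in seq.upper()]


def kmer_tokens(seq, k=3):
    """Split a sequence into overlapping k-mers with stride 1 (an assumption)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_encode(seq, k=3, alphabet="ACGT"):
    """Replace each k-mer with its index in the full 4**k vocabulary.

    The vocabulary is ordered lexicographically over the alphabet, starting
    at 1; k-mers containing unknown symbols are mapped to 0.
    """
    vocab = {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}
    return [vocab.get(token, 0) for token in kmer_tokens(seq, k)]
```

Either integer sequence could then be padded to a fixed length and passed to an embedding layer in front of a convolutional or recurrent model; that step depends on the deep learning framework and is not shown here.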
Bite mark injuries often feature in violent crimes. Conventional morphometric methods for the forensic analysis of bite marks involve elements of subjective interpretation that threaten the credibility of this field. Human DNA recovered from bite marks has the highest evidentiary value; however, recovery can be compromised by salivary components. This study assessed the feasibility of matching bacterial DNA sequences amplified from experimental bite marks to those obtained from the teeth responsible, with the aim of evaluating the capability of three genomic regions of streptococcal DNA to discriminate between participant samples. Bite mark and teeth swabs were collected from 16 participants. Bacterial DNA was extracted to provide the template for PCR primers specific for the streptococcal 16S ribosomal RNA (16S rRNA) gene, 16S-23S intergenic spacer (ITS) and RNA polymerase beta subunit (rpoB). High throughput sequencing (GS FLX 454), followed by stringent quality filtering, generated reads from bite marks for comparison to those generated from teeth samples. For all three regions, the greatest overlaps of identical reads were between bite mark samples and the corresponding teeth samples. The average proportions of reads identical between bite mark and corresponding teeth samples were 0.31, 0.41 and 0.31, and for non-corresponding samples were 0.11, 0.20 and 0.016, for 16S rRNA, ITS and rpoB, respectively. The probabilities of correctly distinguishing matching and non-matching teeth samples were 0.92 for ITS, 0.99 for 16S rRNA and 1.0 for rpoB. These findings strongly support the tenet that bacterial DNA amplified from bite marks and teeth can provide corroborating information in the identification of assailants.
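The read-overlap comparison described in this abstract can be sketched as follows. This is only one plausible reading of "proportion of identical reads": here it is taken as the fraction of bite mark reads that occur, character for character, in a teeth sample. The study's actual definition, filtering, and statistics are not given in the abstract, and the names `shared_read_fraction` and `best_match` are hypothetical.

```python
def shared_read_fraction(bite_reads, teeth_reads):
    """Fraction of bite mark reads found verbatim in a teeth sample.

    Assumption: every bite mark read counts once, including duplicates.
    """
    if not bite_reads:
        return 0.0
    teeth_set = set(teeth_reads)
    hits = sum(1 for read in bite_reads if read in teeth_set)
    return hits / len(bite_reads)


def best_match(bite_reads, teeth_samples):
    """Return the participant whose teeth sample shares the most reads.

    teeth_samples maps a participant label to that participant's reads.
    """
    return max(
        teeth_samples,
        key=lambda name: shared_read_fraction(bite_reads, teeth_samples[name]),
    )
```

For example, with bite mark reads `["ACGT", "ACGT", "TTGA", "GGCC"]`, a teeth sample `["ACGT", "TTGA"]` scores 0.75 and a teeth sample `["GGCC"]` scores 0.25, so the first sample is returned as the best match.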
Eukaryotic gene expression is often under the control of cooperatively acting transcription factors whose binding is limited by structural constraints. By determining these structural constraints, we can understand the "rules" that define functional cooperativity. Conversely, by understanding the rules of binding, we can infer structural characteristics. We have developed an information theory based method for approximating the physical limitations of cooperative interactions by comparing sequence analysis to microarray expression data. When applied to the coordinated binding of the sulfur amino acid regulatory protein Met4 by Cbf1 and Met31, we were able to create a combinatorial model that can correctly identify Met4 regulated genes. Interestingly, we found that the major determinant of Met4 regulation was the sum of the strength of the Cbf1 and Met31 binding sites and that the energetic costs associated with spacing appeared to be minimal.
Hepatitis B virus (HBV) integration into the host cell genome occurs early on in infection and reportedly induces pro-oncogenic changes in hepatocytes that drive HCC initiation. However, it remains unclear when these changes occur during hepatocarcinogenesis. Extensive expansion of hepatocyte clones with a selective advantage was shown to occur prior to cancer formation during the HBeAg-seroconversion phase of chronic HBV infection. We hypothesized that since integrations occur during the early stages of infection, cell phenotype could be altered and induce a selection advantage (e.g., through insertional mutagenesis or cis-mediated activation of downstream genes). Here, we analyzed the enrichment of genomic and functional patterns in the cellular host sequence adjacent to HBV DNA integration events. We examined 717 unique integration events detected in patients who have and have not undergone HBeAg-seroconversion (n = 41) or in an in vitro model system. We also used an in silico model to control for detection biases. We showed that the sites of HBV DNA integration were distributed throughout the entire host genome without obvious enrichment of specific structural or functional genomic features in the adjacent cellular genome during HBeAg-seroconversion. Currently, this is the most comprehensive characterization of HBV DNA integration events prior to hepatocarcinogenesis. Our results suggest that no significant selection for (or against) specific cellular sites of HBV DNA integration occurs during the clonal expansion phase of chronic HBV infection. Thus, HBV DNA integration events likely represent passenger events rather than the active drivers of liver cancer that they were previously suggested to be.
Telomeric and subtelomeric regions are essential for genome stability and regular chromosome replication. In this work, we have characterized the wheat BAC (bacterial artificial chromosome) clones containing Spelt1 and Spelt52 sequences, which belong to the subtelomeric repeats of the B/G genomes of wheats and Aegilops species from the section Sitopsis.
The ubiquitous, eukaryotic, high-mobility group box (HMGB) chromosomal proteins promote many chromatin-mediated cellular activities through their non-sequence-specific binding and bending of DNA. Minor-groove DNA binding by the HMG box results in substantial DNA bending toward the major groove owing to electrostatic interactions, shape complementarity, and DNA intercalation that occurs at two sites. Here, the structures of the complexes formed with DNA by a partially DNA intercalation-deficient mutant of Drosophila melanogaster HMGD have been determined by X-ray crystallography at a resolution of 2.85 Å. The six proteins and 50 bp of DNA in the crystal structure revealed a variety of bound conformations. All of the proteins bound in the minor groove, bridging DNA molecules, presumably because these DNA regions are easily deformed. The loss of the primary site of DNA intercalation decreased overall DNA bending and shape complementarity. However, DNA bending at the secondary site of intercalation was retained and most protein-DNA contacts were preserved. The mode of binding resembles the HMGB1 box A-cisplatin-DNA complex, which also lacks a primary intercalating residue. This study provides new insights into the binding mechanisms used by HMG boxes to recognize varied DNA structures and sequences as well as modulate DNA structure and DNA bending.
Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.
Centromeric and pericentromeric chromosome regions are occupied by satellite DNA. Satellite DNAs play essential roles in chromosome segregation, and, thanks to their extensive sequence variability, to some extent, they can also be used as phylogenetic markers. In this paper, we isolated and sequenced satellite DNA I-IV in 11 species of Cervidae. The obtained satellite DNA sequences and their chromosomal distribution were compared among the analysed representatives of cervid subfamilies Cervinae and Capreolinae. Only satI and satII sequences are probably present in all analysed species with high abundance. On the other hand, fluorescence in situ hybridisation (FISH) with satIII and satIV probes showed signals only in a part of the analysed species, indicating interspecies copy number variations. Several indices, including FISH patterns, the high guanine and cytosine (GC) content, and the presence of centromere protein B (CENP-B) binding motif, suggest that the satII DNA may represent the most important satellite DNA family that might be involved in the centromeric function in Cervidae. The absence or low intensity of satellite DNA FISH signals on biarmed chromosomes probably reflects the evolutionary reduction of heterochromatin following the formation of chromosome fusions. The phylogenetic trees constructed on the basis of the satellite I-IV DNA relationships generally support the present cervid taxonomy.
Using non-conventional markers, DNA metabarcoding allows biodiversity assessment from complex substrates. In this article, we present ecoPrimers, a software tool for identifying new barcode markers and their associated PCR primers. ecoPrimers scans whole genomes to find such markers without a priori knowledge. ecoPrimers optimizes two quality indices measuring taxonomical range and discrimination to select the most efficient markers from a set of reference sequences, according to specific experimental constraints such as marker length or specifically targeted taxa. The key step of the algorithm is the identification of conserved regions among reference sequences for anchoring primers. We propose an efficient algorithm based on data mining that allows the analysis of huge sets of sequences. We evaluate the efficiency of ecoPrimers by running it on three different sequence sets: mitochondrial, chloroplast and bacterial genomes. Identified barcode markers correspond either to barcode regions already in use for plants or animals, or to new potential barcodes. Results from empirical experiments carried out on a promising new barcode for analyzing vertebrate diversity fully agree with expectations based on bioinformatics analysis. These tests demonstrate the efficiency of ecoPrimers for inferring new barcodes fitting with diverse experimental contexts. ecoPrimers is available as an open source project at: http://www.grenoble.prabi.fr/trac/ecoPrimers.
According to recent archeological evidence, turkey (Meleagris gallopavo gallopavo) domestication may have occurred in Mexico around 2000 years ago. However, little is known about the phylogenetic and genealogical background underlying domestic turkey populations. This study aimed to further understand the domestication process and identify inter- or intraspecific connections between turkey populations to determine their origins, trace their global expansion, and define the species' genetic value. Ninety-three domestic turkeys (local breeds) were sampled from populations in Brazil, Mexico, USA, Spain, Italy, Iran, and Egypt. Publicly available sequences from previous studies were also included. Standard mitochondrial DNA, genetic diversity, and haplotype network analyses were performed. Seventy-six polymorphic sites were identified. Turkeys from Mexico showed the greatest number of polymorphic sites (40), while turkeys from Italy and Brazil showed only one site each. Nucleotide diversity was also highest in Mexico and the USA (π = 0.0175 and 0.0102, respectively) and lowest in Brazil and Italy. Of the six major haplogroups defined, the Mexican and USA populations appeared to have remained more stable and diverse than the other populations. This may be due to conservative husbandry policies in the rural areas of other populations, which have prevented the introduction of commercial turkey lines.
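The two summary statistics quoted in this abstract, the number of polymorphic sites and the nucleotide diversity π, can be computed from aligned sequences as in the sketch below. It uses the standard definition of π as the average number of pairwise differences per site; the abstract does not say which software or gap handling the study used, so treating every differing character as a difference and the function names are assumptions.

```python
from itertools import combinations


def polymorphic_sites(seqs):
    """Count alignment columns that contain more than one distinct character."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    Assumes equal-length, already aligned sequences; gaps and ambiguous
    bases are compared as ordinary characters (a simplification).
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)
```

As a small worked case, the alignment `["ACGT", "ACGA", "ACGT"]` has one polymorphic site and two pairwise differences over three pairs and four sites, giving π = 2 / 12 ≈ 0.167.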
The three-dimensional organization of DNA is increasingly understood to play a decisive role in vital cellular processes. Many studies focus on the role of DNA-packaging proteins, crowding, and confinement in arranging chromatin, but structural information might also be directly encoded in bare DNA itself. Here, we visualize plectonemes (extended intertwined DNA structures formed upon supercoiling) on individual DNA molecules. Remarkably, our experiments show that the DNA sequence directly encodes the structure of supercoiled DNA by pinning plectonemes at specific sequences. We develop a physical model that predicts that sequence-dependent intrinsic curvature is the key determinant of pinning strength and demonstrate that this simple model provides very good agreement with the data. Analysis of several prokaryotic genomes indicates that plectonemes localize directly upstream of promoters, which we experimentally confirm for selected promoter sequences. Our findings reveal a hidden code in the genome that helps to spatially organize the chromosomal DNA.
Salmonella enterica serovar Heidelberg is among the most detected serovars in swine and poultry, ranks among the top five serotypes associated with human salmonellosis and is disproportionately associated with invasive infections and mortality in humans. Salmonella are known to carry plasmids associated with antimicrobial resistance and virulence. To identify plasmid-associated genes in multidrug resistant S. enterica serovar Heidelberg, antimicrobial resistance plasmids from five isolates were sequenced using the 454 LifeSciences pyrosequencing technology. Four of the isolates contained incompatibility group (Inc) A/C multidrug resistance plasmids harboring at least eight antimicrobial resistance genes. Each of these strains also carried a second resistance plasmid including two IncFIB, an IncHI2 and a plasmid lacking an identified Inc group. The fifth isolate contained an IncI1 plasmid, encoding resistance to gentamicin, streptomycin and sulfonamides. Some of the IncA/C plasmids lacked the full concert of transfer genes and yet were able to be conjugally transferred, likely due to the transfer genes carried on the companion plasmids in the strains. Several non-IncA/C resistance plasmids also carried putative virulence genes. When the sequences were compared to previously sequenced plasmids, it was found that while all plasmids demonstrated some similarity to other plasmids, they were unique, often due to differences in mobile genetic elements in the plasmids. Our study suggests that Salmonella Heidelberg isolates harbor plasmids that co-select for antimicrobial resistance and virulence, along with genes that can mediate the transfer of plasmids within and among other bacterial isolates. Prevalence of such plasmids can complicate efforts to control the spread of S. enterica serovar Heidelberg in food animal and human populations.
Outbreaks of antibiotic-resistant bacterial infections emphasize the importance of surveillance of potentially pathogenic bacteria. Genomic sequencing of clinical microbiological specimens expands our capacity to study cultivable, fastidious and uncultivable members of the bacterial community. Herein, we compared the primary data collected by the NIH's Human Microbiome Project (HMP) with published epidemiological surveillance data of Staphylococcus aureus.
Von Willebrand disease (VWD) is a common inherited bleeding disorder caused by quantitative (types 1 and 3) and qualitative (type 2) defects in von Willebrand factor (VWF). The VWF gene is a large gene containing 52 exons; except for type 2 VWD, the majority of mutations causing VWD are not localized to specific exons. We have used denaturing high performance liquid chromatography (DHPLC) to scan the coding region of the VWF gene for sequence variations. Primers were designed to amplify all 52 exons while avoiding amplification of the VWF pseudogene. Exon-specific primers were designed with sequencing primers, allowing direct sequencing of each VWF exon. Sequence variations in 33 previously characterized von Willebrand disease (VWD) samples were all detected using DHPLC demonstrating the high sensitivity of this technique. In addition, we analyzed 42 patients or family members with VWD. Thirty-two novel sequence variations were identified (2 deletions, 2 nonsense, 15 missense, 6 silent, and 7 intronic), some with clear functional consequences. A previously described deletion in exon 18, 2435delC, was also found in two unrelated type 3 patients. This DHPLC and DNA sequencing technique will enable the full length assessment of the VWF gene necessary to detect mutations causing types 1 and 3 VWD.
Simple sequence repeats (SSRs), microsatellites or polymeric sequences are common in DNA and are important biologically. From mononucleotide to trinucleotide repeats and beyond, they can be found in long (> 6 repeating units) tracts and may be characterized by quantifying the frequencies in which they are found and their tract lengths. However, most of the existing computer programs that find SSR tracts do not include these methods.
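The kind of characterization this abstract describes, locating SSR tracts and tallying how often each tract length occurs, can be sketched with a regular expression. This is not the program the abstract refers to. The threshold of seven units follows the abstract's "> 6 repeating units"; the motif size limit of three and the names `find_ssr_tracts` and `tract_length_counts` are assumptions.

```python
import re
from collections import Counter


def find_ssr_tracts(seq, max_motif=3, min_repeats=7):
    """Return (start, motif, repeat_count) for each SSR tract found.

    The lazy quantifier tries the shortest motif first, so a run such as
    AAAAAAAA is reported as a mononucleotide tract rather than as AA repeats.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    tracts = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        tracts.append((match.start(), motif, repeats))
    return tracts


def tract_length_counts(tracts):
    """Tally tracts by (motif length, number of repeat units)."""
    return Counter((len(motif), repeats) for _, motif, repeats in tracts)
```

On the sequence `"GG" + "A" * 8 + "C" + "AT" * 7 + "G"`, the first function reports a mononucleotide tract of eight units at position 2 and a dinucleotide tract of seven units at position 11. Tracts are reported without overlap, in left-to-right order.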