This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

On page 1 showing 1 ~ 20 papers out of 13,745 papers

Proteogenomic Analysis of Protein Sequence Alterations in Breast Cancer Cells.

  • Iulia M Lazar‎ et al.
  • Scientific reports‎
  • 2019‎

Cancer evolves as a result of an accumulation of mutations and chromosomal aberrations. Developments in sequencing technologies have enabled the discovery and cataloguing of millions of such mutations. The identification of protein-level alterations, typically by using reversed-phase protein arrays or mass spectrometry, has lagged, however, behind gene and transcript-level observations. In this study, we report the use of mass spectrometry for detecting the presence of mutations (missense, indels and frame shifts) in MCF7 and SKBR3 breast cancer, and non-tumorigenic MCF10A cells. The mutations were identified by expanding the database search process of raw mass spectrometry files by including an in-house built database of mutated peptides (XMAn-v1) that complemented a minimally redundant, canonical database of Homo sapiens proteins. The work resulted in the identification of nearly 300 mutated peptide sequences, of which ~50 were characterized by quality tandem mass spectra. We describe the criteria that were used to select the mutated peptide sequences, evaluate the parameters that characterized these peptides, and assess the artifacts that could have led to false peptide identifications. Further, we discuss the functional domains and biological processes that may be impacted by the observed peptide alterations, and how protein-level detection can support the efforts of identifying cancer driving mutations and genes. Mass spectrometry data are available via ProteomeXchange with identifier PXD014458.


Protein evolution analysis of S-hydroxynitrile lyase by complete sequence design utilizing the INTMSAlign software.

  • Shogo Nakano‎ et al.
  • Scientific reports‎
  • 2015‎

Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.


Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis.

  • Daniele Raimondi‎ et al.
  • Scientific reports‎
  • 2019‎

Machine learning (ML) is ubiquitous in bioinformatics, due to its versatility. One of the most crucial aspects to consider while training an ML model is to carefully select the optimal feature encoding for the problem at hand. Biophysical propensity scales are widely adopted in structural bioinformatics because they describe amino acid properties that are intuitively relevant for many structural and functional aspects of proteins, and are thus commonly used as input features for ML methods. In this paper we reproduce three classical structural bioinformatics prediction tasks to investigate the main assumptions about the use of propensity scales as input features for ML methods. We investigate their usefulness with different randomization experiments and we show that their effectiveness varies among the ML methods used and the tasks. We show that while linear methods are more dependent on the feature encoding, the specific biophysical meaning of the features is less relevant for non-linear methods. Moreover, we show that even among linear ML methods, the simpler one-hot encoding can surprisingly outperform the "biologically meaningful" scales. We also show that feature selection performed with non-linear ML methods may not be able to distinguish between randomized and "real" propensity scales by properly prioritizing the latter. Finally, we show that learning problem-specific embeddings could be a simple, assumptions-free and optimal way to perform feature learning/engineering for structural bioinformatics tasks.


Amalgamation of 3D structure and sequence information for protein-protein interaction prediction.

  • Kanchan Jha‎ et al.
  • Scientific reports‎
  • 2020‎

Protein is the primary building block of living organisms. It interacts with other proteins and is then involved in various biological processes. Protein-protein interactions (PPIs) help in predicting and hence help in understanding the functionality of the proteins, causes and growth of diseases, and designing new drugs. However, there is a vast gap between the available protein sequences and the identification of protein-protein interactions. To bridge this gap, researchers proposed several computational methods to reveal the interactions between proteins. These methods merely depend on sequence-based information of proteins. With the advancement of technology, different types of information related to proteins are available such as 3D structure information. Nowadays, deep learning techniques are adopted successfully in various domains, including bioinformatics. So, current work focuses on the utilization of different modalities, such as 3D structures and sequence-based information of proteins, and deep learning algorithms to predict PPIs. The proposed approach is divided into several phases. We first get several illustrations of proteins using their 3D coordinates information, and three attributes, such as hydropathy index, isoelectric point, and charge of amino acids. Amino acids are the building blocks of proteins. A pre-trained ResNet50 model, a subclass of a convolutional neural network, is utilized to extract features from these representations of proteins. Autocovariance and conjoint triad are two widely used sequence-based methods to encode proteins, which are used here as another modality of protein sequences. A stacked autoencoder is utilized to get the compact form of sequence-based information. Finally, the features obtained from different modalities are concatenated in pairs and fed into the classifier to predict labels for protein pairs. We have experimented on the human PPIs dataset and Saccharomyces cerevisiae PPIs dataset and compared our results with the state-of-the-art deep-learning-based classifiers. The results achieved by the proposed method are superior to those obtained by the existing methods. Extensive experiments on different datasets indicate that our approach to learning and combining features from two different modalities is useful in PPI prediction.


PDNAsite: Identification of DNA-binding Site from Protein Sequence by Incorporating Spatial and Sequence Context.

  • Jiyun Zhou‎ et al.
  • Scientific reports‎
  • 2016‎

Protein-DNA interactions are involved in many fundamental biological processes essential for cellular function. Most of the existing computational approaches employed only the sequence context of the target residue for its prediction. In the present study, for each target residue, we applied both the spatial context and the sequence context to construct the feature space. Subsequently, Latent Semantic Analysis (LSA) was applied to remove the redundancies in the feature space. Finally, a predictor (PDNAsite) was developed through the integration of the support vector machines (SVM) classifier and ensemble learning. Results on the PDNA-62 and the PDNA-224 datasets demonstrate that features extracted from spatial context provide more information than those from sequence context and the combination of them gives more performance gain. An analysis of the number of binding sites in the spatial context of the target site indicates that the interactions between binding sites next to each other are important for protein-DNA recognition and their binding ability. The comparison between our proposed PDNAsite method and the existing methods indicates that PDNAsite outperforms most of the existing methods and is a useful tool for DNA-binding site identification. A web-server of our predictor (http://hlt.hitsz.edu.cn:8080/PDNAsite/) is freely accessible to the biological research community.


Analysis of sequence diversity in Plasmodium falciparum glutamic acid-rich protein (PfGARP), an asexual blood stage vaccine candidate.

  • Rattanaporn Rojrung‎ et al.
  • Scientific reports‎
  • 2023‎

Glutamic acid-rich protein of Plasmodium falciparum (PfGARP) binds to erythrocyte band 3 and may enhance cytoadherence of infected erythrocytes. Naturally acquired anti-PfGARP antibodies could confer protection against high parasitemia and severe symptoms. While whole genome sequencing analysis has suggested high conservation in this locus, little is known about repeat polymorphism in this vaccine candidate antigen. Direct sequencing was performed from the PCR-amplified complete PfGARP gene of 80 clinical isolates from four malaria endemic provinces in Thailand and an isolate from a Guinean patient. Publicly available complete coding sequences of this locus were included for comparative analysis. Six complex repeat (RI-RVI) and two homopolymeric glutamic acid repeat (E1 and E2) domains were identified in PfGARP. The erythrocyte band 3-binding ligand in domain RIV and the epitope for mAB7899 antibody eliciting in vitro parasite killing property were perfectly conserved across isolates. Repeat lengths in domains RIII and E1-RVI-E2 seemed to be correlated with parasite density of the patients. Sequence variation in PfGARP exhibited genetic differentiation across most endemic areas of Thailand. Phylogenetic tree inferred from this locus has shown that most Thai isolates formed closely related lineages, suggesting local expansion/contractions of repeat-encoding regions. Positive selection was observed in non-repeat region preceding domain RII which corresponded to a helper T cell epitope predicted to be recognized by a common HLA class II among Thai population. Predicted linear B cell epitopes were identified in both repeat and non-repeat domains. Besides length variation in some repeat domains, sequence conservation in non-repeat regions and almost all predicted immunogenic epitopes have suggested that PfGARP-derived vaccine may largely elicit strain-transcending immunity.


Sequence Analysis of the Fusion Protein Gene of Human Respiratory Syncytial Virus Circulating in China from 2003 to 2014.

  • Jinhua Song‎ et al.
  • Scientific reports‎
  • 2018‎

The human respiratory syncytial virus (HRSV) fusion (F) protein is important for HRSV infection, but few studies have examined the genetic diversity of the F gene from Chinese samples. In this study, a total of 330 HRSV F sequences collected from different regions of China between 2003 and 2014 were analyzed to understand their genetic characteristics. In addition, these sequences were compared with 1150 HRSV F sequences in Genbank from 18 other countries. In phylogenetic analysis, Chinese HRSV F sequences sorted into a number of clusters containing sequences from China as well as other countries. F sequences from different genotypes (as determined based on the G gene sequences) within a HRSV subgroup could be found in the same clusters in phylogenetic trees generated based on F gene sequences. Amino acid analysis showed that HRSV F sequences from China and other countries were highly conserved. Of interest, F protein sequences from all Chinese samples were completely conserved at the palivizumab binding site, thus predicting the susceptibility of these strains to this neutralizing antibody. In conclusion, HRSV F sequences from China between 2003 and 2014, similar to those from other countries, were highly conserved.


Assembly and Analysis of Unmapped Genome Sequence Reads Reveal Novel Sequence and Variation in Dogs.

  • Lindsay A Holden‎ et al.
  • Scientific reports‎
  • 2018‎

Dogs are excellent animal models for human disease. They have extensive veterinary histories, pedigrees, and a unique genetic system due to breeding practices. Despite these advantages, one factor limiting their usefulness is the canine genome reference (CGR) which was assembled using a single purebred Boxer. Although a common practice, this results in many high-quality reads remaining unmapped. To address this, whole-genome sequence data from three breeds, Border Collie (n = 26), Bearded Collie (n = 7), and Entlebucher Sennenhund (n = 8), were analyzed to identify novel, non-CGR genomic contigs using the previously validated pseudo-de novo assembly pipeline. We identified 256,957 novel contigs and paired-end relationships together with BLAT scores provided 126,555 (49%) high-quality contigs with genomic coordinates containing 4.6 Mb of novel sequence absent from the CGR. These contigs close 12,503 known gaps, including 2.4 Mb containing partially missing sequences for 11.5% of Ensembl, 16.4% of RefSeq and 12.2% of canFam3.1+ CGR annotated genes and 1,748 unmapped contigs containing 2,366 novel gene variants. Examples for six disease-associated genes (SCARF2, RD3, COL9A3, FAM161A, RASGRP1 and DLX6) containing gaps or alternate splice variants missing from the CGR are also presented. These findings from non-reference breeds support the need for improvement of the current Boxer-only CGR to avoid missing important biological information. The inclusion of the missing gene sequences into the CGR will facilitate identification of putative disease mutations across diverse breeds and phenotypes.


The whole-genome sequence analysis of Morchella sextelata.

  • Mei-Han‎ et al.
  • Scientific reports‎
  • 2019‎

Morchella are macrofungi and are also called morels, as they exhibit a morel-like upper cap structure. Morels contain abundant essential amino acids, vitamins and biologically active compounds, which provide substantial health benefits. Approximately 80 species of Morchella have been reported, and even more species have been isolated. However, the lack of wild Morchella resources and the difficulties associated with culturing Morchella have caused a shortage in the morels available for daily consumption. Additionally, in-depth genomic and morphological studies are still needed. In this study, to provide genomic data for further investigations of culturing techniques and the biological functions of Morchella sextelata (M. sextelata), de novo genome sequencing was carried out on the Illumina HiSeq 4000 platform using both the Illumina 150 and PacBio systems. The final estimated genome size of M. sextelata was 52.93 Mb, containing 59 contigs and a GC content of 47.37%. A total of 9,550 protein-coding genes were annotated. In addition, the repeat sequences, gene components and gene functions were analyzed using various databases. Furthermore, the secondary metabolite gene clusters and the predicted structures of their products were analyzed. Finally, a genomic comparison of different species of Morchella was performed.


Conserved differences in protein sequence determine the human pathogenicity of Ebolaviruses.

  • Morena Pappalardo‎ et al.
  • Scientific reports‎
  • 2016‎

Reston viruses are the only Ebolaviruses that are not pathogenic in humans. We analyzed 196 Ebolavirus genomes and identified specificity determining positions (SDPs) in all nine Ebolavirus proteins that distinguish Reston viruses from the four human pathogenic Ebolaviruses. A subset of these SDPs will explain the differences in human pathogenicity between Reston and the other four ebolavirus species. Structural analysis was performed to identify those SDPs that are likely to have a functional effect. This analysis revealed novel functional insights in particular for Ebolavirus proteins VP40 and VP24. The VP40 SDP P85T interferes with VP40 function by altering octamer formation. The VP40 SDP Q245P affects the structure and hydrophobic core of the protein and consequently protein function. Three VP24 SDPs (T131S, M136L, Q139R) are likely to impair VP24 binding to human karyopherin alpha5 (KPNA5) and therefore inhibition of interferon signaling. Since VP24 is critical for Ebolavirus adaptation to novel hosts, and only a few SDPs distinguish Reston virus VP24 from VP24 of other Ebolaviruses, human pathogenic Reston viruses may emerge. This is of concern since Reston viruses circulate in domestic pigs and can infect humans, possibly via airborne transmission.


A Structurally-Validated Multiple Sequence Alignment of 497 Human Protein Kinase Domains.

  • Vivek Modi‎ et al.
  • Scientific reports‎
  • 2019‎

Studies on the structures and functions of individual kinases have been used to understand the biological properties of other kinases that do not yet have experimental structures. The key factor in accurate inference by homology is an accurate sequence alignment. We present a parsimonious, structure-based multiple sequence alignment (MSA) of 497 human protein kinase domains excluding atypical kinases. The alignment is arranged in 17 blocks of conserved regions and unaligned blocks in between that contain insertions of varying lengths present in only a subset of kinases. The aligned blocks contain well-conserved elements of secondary structure and well-known functional motifs, such as the DFG and HRD motifs. From pairwise, all-against-all alignment of 272 human kinase structures, we estimate the accuracy of our MSA to be 97%. The remaining inaccuracy comes from a few structures with shifted elements of secondary structure, and from the boundaries of aligned and unaligned regions, where compromises need to be made to encompass the majority of kinases. A new phylogeny of the protein kinase domains in the human genome based on our alignment indicates that ten kinases previously labeled as "OTHER" can be confidently placed into the CAMK group. These kinases comprise the Aurora kinases, Polo kinases, and calcium/calmodulin-dependent kinase kinases.


Non-neutral evolution of H3.3-encoding genes occurs without alterations in protein sequence.

  • Brejnev M Muhire‎ et al.
  • Scientific reports‎
  • 2019‎

Histone H3.3 is a developmentally essential variant encoded by two independent genes in human (H3F3A and H3F3B). While this two-gene arrangement is evolutionarily conserved, its origins and function remain unknown. Phylogenetics, synteny and gene structure analyses of H3.3 genes from 32 metazoan genomes indicate independent evolutionary paths for H3F3A and H3F3B. While H3F3B bears similarities with H3.3 genes in distant organisms and with canonical H3 genes, H3F3A is sarcopterygian-specific and evolves under strong purifying selection. Additionally, H3F3B codon-usage preferences resemble those of broadly expressed genes and 'cell differentiation-induced' genes, while codon-usage of H3F3A resembles that of 'cell proliferation-induced' genes. We infer that H3F3B is more similar to the ancestral H3.3 gene and likely evolutionarily adapted for a broad expression pattern in diverse cellular programs, while H3F3A adapted for a subset of gene expression programs. Thus, the arrangement of two independent H3.3 genes facilitates fine-tuning of H3.3 expression across cellular programs.


Sequence and expression analysis of HSP70 family genes in Artemia franciscana.

  • Wisarut Junprung‎ et al.
  • Scientific reports‎
  • 2019‎

Thus far, only one gene from the heat shock protein 70 (HSP70) family has been identified in Artemia franciscana. Here, we used the draft Artemia transcriptome database to search for other genes in the HSP70 family. Four novel HSP70 genes were identified and designated heat shock cognate 70 (HSC70), heat shock 70 kDa cognate 5 (HSC70-5), Immunoglobulin heavy-chain binding protein (BIP), and hypoxia up-regulated protein 1 (HYOU1). For each of these genes, we obtained nucleotide and deduced amino acid sequences, and reconstructed a phylogenetic tree. Expression analysis revealed that in the juvenile state, the transcription of HSP70 and HSC70 was significantly (P < 0.05) higher in a population of A. franciscana selectively bred for increased induced thermotolerance (TF12) relative to a control population (CF12). Following non-lethal heat shock treatment at the nauplius stage, transcription of HSP70, HSC70, and HSC70-5 were significantly (P < 0.05) up-regulated in TF12. In contrast, transcription of the other HSP70 family members in A. franciscana (BIP, HYOU1, and HSPA4) showed no significant (P > 0.05) induction. Gene expression analysis demonstrated that not all members of the HSP70 family are involved in the response to heat stress and selection and that especially altered expression of HSC70 plays a role in a population selected for increased thermotolerance.


Respiratory syncytial virus B sequence analysis reveals a novel early genotype.

  • Juan C Muñoz-Escalante‎ et al.
  • Scientific reports‎
  • 2021‎

Respiratory syncytial virus (RSV) is a major cause of respiratory infections and is classified in two main groups, RSV-A and RSV-B, with multiple genotypes within each of them. For RSV-B, more than 30 genotypes have been described, without consensus on their definition. The lack of genotype assignation criteria has a direct impact on the understanding of viral evolution, the development of viral detection methods, and vaccine design. Here we analyzed the totality of complete RSV-B G gene ectodomain sequences published in GenBank until September 2018 (n = 2190) including 478 complete genome sequences using maximum likelihood and Bayesian phylogenetic analyses, as well as intergenotypic and intragenotypic distance matrices, in order to generate a systematic genotype assignation. Individual RSV-B genes were also assessed using maximum likelihood phylogenetic analyses and multiple sequence alignments were used to identify molecular markers associated with specific genotypes. Analyses of the complete G gene ectodomain region, sequence clustering patterns, and the presence of molecular markers of each individual gene indicate that the 37 previously described genotypes can be classified into fifteen distinct genotypes: BA, BA-C, BA-CC, CB1-THB, GB1-GB4, GB6, JAB1-NZB2, SAB1, SAB2, SAB4, URU2 and a novel early circulating genotype characterized in the present study and designated GB0.


An optimistic protein assembly from sequence reads salvaged an uncharacterized segment of mouse picobirnavirus.

  • Gabriel Gonzalez‎ et al.
  • Scientific reports‎
  • 2017‎

Advances in Next Generation Sequencing technologies have enabled the generation of millions of sequences from microorganisms. However, distinguishing the sequence of a novel species from sequencing errors remains a technical challenge when the novel species is highly divergent from the closest known species. To solve such a problem, we developed a new method called Optimistic Protein Assembly from Reads (OPAR). This method is based on the assumption that protein sequences could be more conserved than the nucleotide sequences encoding them. By taking advantage of metagenomics, bioinformatics and conventional Sanger sequencing, our method successfully identified all coding regions of the mouse picobirnavirus for the first time. The salvaged sequences indicated that segment 1 of this virus was more divergent from its homologues in other Picobirnaviridae species than segment 2. For this reason, only segment 2 of mouse picobirnavirus has been detected in previous studies. OPAR web tool is available at http://bioinformatics.czc.hokudai.ac.jp/opar/.


Sequence-specific detection of single-stranded DNA with a gold nanoparticle-protein nanopore approach.

  • Loredana Mereuta‎ et al.
  • Scientific reports‎
  • 2020‎

Fast, cheap and easy-to-use nucleic acid detection methods are crucial to mitigate adverse impacts caused by various pathogens, and are essential in forensic investigations, food safety monitoring or evolution of infectious diseases. We report here a method based on the α-hemolysin (α-HL) nanopore, working in conjunction with unmodified citrate anion-coated gold nanoparticles (AuNPs), to detect nanomolar concentrations of short single-stranded DNA sequences (ssDNA). The core idea was to use charge neutral peptide nucleic acids (PNA) as hybridization probe for complementary target ssDNAs, and monitor at the single-particle level the PNA-induced aggregation propensity of AuNPs during PNA-DNA duplex formation, by recording ionic current blockades signature of AuNP-α-HL interactions. This approach offers advantages including: (1) a simple to operate platform, producing clear-cut readout signals based on distinct size differences of PNA-induced AuNP aggregates, in relation to the presence in solution of complementary ssDNAs to the PNA fragments; (2) sensitive and selective detection of target ssDNAs; (3) specific ssDNA detection in the presence of interference DNA, without sample labeling or signal amplification. The powerful synergy of protein nanopore-based nanoparticle detection and specific PNA-DNA hybridization introduces a new strategy for nucleic acid biosensing with short detection time and label-free operation.


Prediction of linear B-cell epitopes based on protein sequence features and BERT embeddings.

  • Fang Liu‎ et al.
  • Scientific reports‎
  • 2024‎

Linear B-cell epitopes (BCEs) play a key role in the development of peptide vaccines and immunodiagnostic reagents. Therefore, the accurate identification of linear BCEs is of great importance in the prevention of infectious diseases and the diagnosis of related diseases. The experimental methods used to identify BCEs are both expensive and time-consuming and they do not meet the demand for identification of large-scale protein sequence data. As a result, there is a need to develop an efficient and accurate computational method to rapidly identify linear BCE sequences. In this work, we developed the new linear BCE prediction method LBCE-BERT. This method is based on peptide chain sequence information and natural language model BERT embedding information, using an XGBoost classifier. The model was trained on three benchmark datasets for hyperparameter selection and was subsequently evaluated on several test datasets. The results indicate that our proposed method outperforms others in terms of AUROC and accuracy. The LBCE-BERT model is publicly available at: https://github.com/Lfang111/LBCE-BERT .


A sequence-based evolutionary distance method for Phylogenetic analysis of highly divergent proteins.

  • Wei Cao‎ et al.
  • Scientific reports‎
  • 2023‎

Because of the limited effectiveness of prevailing phylogenetic methods when applied to highly divergent protein sequences, the phylogenetic analysis problem remains challenging. Here, we propose a sequence-based evolutionary distance algorithm termed sequence distance (SD), which innovatively incorporates site-to-site correlation within protein sequences into the distance estimation. In protein superfamilies, SD can effectively distinguish evolutionary relationships both within and between protein families, producing phylogenetic trees that closely align with those based on structural information, even with sequence identity less than 20%. SD is highly correlated with the similarity of the protein structure, and can calculate evolutionary distances for thousands of protein pairs within seconds using a single CPU, which is significantly faster than most protein structure prediction methods that demand high computational resources and long run times. The development of SD will significantly advance phylogenetics, providing researchers with a more accurate and reliable tool for exploring evolutionary relationships.


Sequence analysis of SARS-CoV-2 genome reveals features important for vaccine design.

  • Jacob Kames‎ et al.
  • Scientific reports‎
  • 2020‎

As the SARS-CoV-2 pandemic is rapidly progressing, the need for the development of an effective vaccine is critical. A promising approach for vaccine development is to generate, through codon pair deoptimization, an attenuated virus. This approach carries the advantage that it only requires limited knowledge specific to the virus in question, other than its genome sequence. Therefore, it is well suited for emerging viruses, for which we may not have extensive data. We performed comprehensive in silico analyses of several features of SARS-CoV-2 genomic sequence (e.g., codon usage, codon pair usage, dinucleotide/junction dinucleotide usage, RNA structure around the frameshift region) in comparison with other members of the coronaviridae family of viruses, the overall human genome, and the transcriptome of specific human tissues such as lung, which are primarily targeted by the virus. Our analysis identified the spike (S) and nucleocapsid (N) proteins as promising targets for deoptimization and suggests a roadmap for SARS-CoV-2 vaccine development, which can be generalizable to other viruses.


Identification of protein structural elements responsible for the diversity of sequence preferences among Mini-III RNases.

  • Dawid Głów‎ et al.
  • Scientific reports‎
  • 2016‎

Many known endoribonucleases select their substrates based on the presence of one or a few specific nucleotides at or near the cleavage site. In some cases, selectivity is also determined by the structural features of the substrate. We recently described the sequence-specific cleavage of double-stranded RNA by Mini-III RNase from Bacillus subtilis in vitro. Here, we characterized the sequence specificity of eight other members of the Mini-III RNase family from different bacterial species. High-throughput analysis of the cleavage products of Φ6 bacteriophage dsRNA indicated subtle differences in sequence preference between these RNases, which were confirmed and characterized by systematic analysis of the cleavage kinetics of a set of short dsRNA substrates. We also showed that the sequence specificities of Mini-III RNases are not reflected by different binding affinities for cognate and non-cognate sequences, suggesting that target selection occurs predominantly at the cleavage step. We were able to identify two structural elements, the α4 helix and α5b-α6 loop that were involved in target selection. Characterization of the sequence specificity of the eight Mini-III RNases may provide a basis for better understanding RNA substrate recognition by Mini-III RNases and adopting these enzymes and their engineered derivatives as tools for RNA research.

