Eight amino acid permease genes from the protozoan parasite Leishmania donovani (AAPLDs) were cloned, sequenced, and shown to be expressed in promastigotes. Seven of these belong to the amino acid transporter-1 superfamily and one to the amino acid polyamine-choline superfamily. Using these sequences as well as known and characterized amino acid permease genes from all kingdoms, a training set was established and used to search for motifs, using the MEME motif discovery tool. This study revealed two motifs that are specific to the genus Leishmania, four to the family Trypanosomatidae, and a single motif that is common between Trypanosomatidae and mammalian systems A1 and N. Interestingly, most of these motifs are clustered in two regions of 50-60 amino acids. BLAST search analyses indicated a close relationship between the L. donovani and Trypanosoma brucei amino acid permeases. The results of this work describe the cloning of the first amino acid permease genes in parasitic protozoa and contribute to the understanding of amino acid permease evolution in these organisms. Furthermore, the identification of genus-specific motifs in these proteins might be useful to better understand parasite physiology within its hosts.
Despite the vast diversity of the antibody repertoire, infected individuals often mount antibody responses to precisely the same epitopes within antigens. The immunological mechanisms underpinning this phenomenon remain unknown. By mapping 376 immunodominant "public epitopes" at high resolution and characterizing several of their cognate antibodies, we concluded that germline-encoded sequences in antibodies drive recurrent recognition. Systematic analysis of antibody-antigen structures uncovered 18 human and 21 partially overlapping mouse germline-encoded amino acid-binding (GRAB) motifs within heavy and light V gene segments that in case studies proved critical for public epitope recognition. GRAB motifs represent a fundamental component of the immune system's architecture that promotes recognition of pathogens and leads to species-specific public antibody responses that can exert selective pressure on pathogens.
Plants are constantly exposed to environmental stresses and in part due to their sessile nature, they have evolved signal perception and adaptive strategies that are distinct from those of other eukaryotes. This is reflected at the cellular level where receptors and signalling molecules cannot be identified using standard homology-based searches querying with proteins from prokaryotes and other eukaryotes. One of the reasons for this is the complex domain architecture of receptor molecules. In order to discover hidden plant signalling molecules, we have developed a motif-based approach designed specifically for the identification of functional centers in plant molecules. This has made possible the discovery of novel components involved in signalling and stimulus-response pathways; the molecules include cyclic nucleotide cyclases, a nitric oxide sensor and a novel target for the hormone abscisic acid. Here, we describe the major steps of the method and illustrate it with recent and experimentally confirmed molecules as examples. We foresee that carefully curated search motifs supported by structural and bioinformatic assessments will uncover many more structural and functional aspects, particularly of signalling molecules.
Reverse vaccinology is an evolving approach for improving vaccine effectiveness and minimizing adverse responses by limiting immunizations to critical epitopes. Towards this goal, we sought to identify immunogenic amino acid motifs and linear epitopes of the SARS-CoV-2 spike protein that elicit IgG in COVID-19 mRNA vaccine recipients. Paired pre/post vaccination samples from N = 20 healthy adults, and post-vaccine samples from an additional N = 13 individuals were used to immunoprecipitate IgG targets expressed by a bacterial display random peptide library, and preferentially recognized peptides were mapped to the spike primary sequence. The data identify several distinct amino acid motifs recognized by vaccine-induced IgG, a subset of those targeted by IgG from natural infection, which may mimic 3-dimensional conformation (mimotopes). Dominant linear epitopes were identified in the C-terminal domains of the S1 and S2 subunits (aa 558-569, 627-638, and 1148-1159) which have been previously associated with SARS-CoV-2 neutralization in vitro and demonstrate identity to bat coronavirus and SARS-CoV, but limited homology to non-pathogenic human coronavirus. The identified COVID-19 mRNA vaccine epitopes should be considered in the context of variants, immune escape and vaccine and therapy design moving forward.
The triggers of autoimmune diseases such as multiple sclerosis (MS) remain elusive. Epidemiological studies suggest that common pathogens can exacerbate and also induce MS, but it has been difficult to pinpoint individual organisms. Here we demonstrate that in vivo clonally expanded CD4+ T cells isolated from the cerebrospinal fluid of a MS patient during disease exacerbation respond to a poly-arginine motif of the nonpathogenic and ubiquitous Torque Teno virus. These T cell clones also can be stimulated by arginine-enriched protein domains from other common viruses and recognize multiple autoantigens. Our data suggest that repeated infections with common pathogenic and even nonpathogenic viruses could expand T cells specific for conserved protein domains that are able to cross-react with tissue-derived and ubiquitous autoantigens.
In recent years, hundreds of novel RNA-binding proteins (RBPs) have been identified, leading to the discovery of novel RNA-binding domains. Furthermore, unstructured or disordered low-complexity regions of RBPs have been identified to play an important role in interactions with nucleic acids. However, these advances in understanding RBPs are limited mainly to eukaryotic species and we only have limited tools to faithfully predict RNA-binders in bacteria. Here, we describe a support vector machine-based method, called TriPepSVM, for the prediction of RNA-binding proteins. TriPepSVM applies string kernels to directly handle protein sequences using tri-peptide frequencies. Testing the method in human and bacteria, we find that several RBP-enriched tri-peptides occur more often in structurally disordered regions of RBPs. TriPepSVM outperforms existing applications, which consider classical structural features of RNA-binding or homology, in the task of RBP prediction in both human and bacteria. Finally, we predict 66 novel RBPs in Salmonella Typhimurium and validate the bacterial proteins ClpX, DnaJ and UbiG to associate with RNA in vivo.
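The tri-peptide featurization described in this abstract can be illustrated with a minimal sketch. This is not the TriPepSVM implementation: the function names `tripeptide_frequencies` and `spectrum_kernel` are hypothetical, and the kernel shown is a plain k = 3 spectrum-style dot product chosen only to show how a string kernel can compare sequences through shared tri-peptides.

```python
from collections import Counter


def tripeptide_frequencies(seq: str) -> dict[str, float]:
    """Return the relative frequency of every overlapping tri-peptide in seq."""
    seq = seq.upper()
    # Slide a window of width 3 over the sequence; short inputs yield no windows.
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tri: n / total for tri, n in counts.items()}


def spectrum_kernel(a: str, b: str) -> float:
    """Dot product of the tri-peptide frequency vectors of two sequences (assumed kernel)."""
    fa = tripeptide_frequencies(a)
    fb = tripeptide_frequencies(b)
    return sum(value * fb.get(tri, 0.0) for tri, value in fa.items())
```

A kernel of this shape could in principle be passed to a support vector machine as a precomputed Gram matrix; the feature weighting and training procedure actually used by the authors are not described in the abstract.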
The extraordinary mechanical properties of spider dragline silk are dependent on the highly repetitive sequences of the component proteins, major ampullate spidroin 1 and 2 (MaSp1 and MaSp2). MaSp sequences are dominated by repetitive modules composed of short amino acid motifs; however, the patterns of motif conservation through evolution and their relevance to silk characteristics are not well understood. We performed a systematic analysis of MaSp sequences encompassing infraorder Araneomorphae based on the conservation of explicitly defined motifs, with the aim of elucidating the essential elements of MaSp1 and MaSp2. The results show that the GGY motif is nearly ubiquitous in the two types of MaSp, while MaSp2 is invariably associated with GP and di-glutamine (QQ) motifs. Further analysis revealed an extended MaSp2 consensus sequence in family Araneidae, with implications for the classification of the archetypal spidroins ADF3 and ADF4. Additionally, the analysis of RNA-seq data showed the expression of a set of distinct MaSp-like variants in genus Tetragnatha. Finally, an apparent association was uncovered between web architecture and the abundance of GP, QQ, and GGY motifs in MaSp2, which suggests a co-expansion of these motifs in response to the evolution of spiders' prey capture strategy.
Dysferlin (Dysf) and mitsugumin53 (MG53) are two key proteins involved in membrane repair of muscle cells which are efficiently recruited to the sarcolemma upon lesioning. Plasma membrane localization and recruitment of a Dysf fragment to membrane lesions in zebrafish myofibers relies on the presence of a short, polybasic amino acid motif, WRRFK. Here we show that the positive charges carried by this motif are responsible for this function. In mouse MG53, we have identified a similar motif with multiple basic residues, WKKMFR. A single amino acid replacement, K279A, leads to severe aggregation of MG53 in inclusion bodies in HeLa cells. This result is due to the loss of positive charge, as shown by studying the effects of other neutral amino acids at position 279. Consequently, our data suggest that positively charged amino acid stretches play an essential role in the localization and function of Dysf and MG53.
Comparative studies using hundreds of sequences can give a detailed picture of the evolution of a given gene family. Nevertheless, retrieving only the sequences of interest from public databases can be difficult, in particular, when working with highly divergent sequences. The difficulty increases substantially when one wants to include in the study sequences from many (or less well studied) species whose genomes are non-annotated or incompletely annotated.
The widespread thioredoxin superfamily enzymes typically share the following features: a characteristic α-β fold, the presence of a Cys-X-X-Cys (or Cys-X-X-Ser) redox-active motif, and a proline in the cis configuration abutting the redox-active site in the tertiary structure. The Cys-X-X-Cys motif is at the solvent-exposed amino terminus of an α-helix, allowing the first cysteine to engage in nucleophilic attack on substrates, or substrates to attack the Cys-X-X-Cys disulfide, depending on whether the enzyme functions to reduce, isomerize, or oxidize its targets. We report here the X-ray crystal structure of an enzyme that breaks many of our assumptions regarding the sequence-structure relationship of thioredoxin superfamily proteins. The yeast Protein Disulfide Isomerase family member Eps1p has Cys-X-X-Cys motifs and proline residues at the appropriate primary structural positions in its first two predicted thioredoxin-fold domains. However, crystal structures show that the Cys-X-X-Cys of the second domain is buried and that the adjacent proline is in the trans, rather than the cis isomer. In these configurations, neither the "active-site" disulfide nor the backbone carbonyl preceding the proline is available to interact with substrate. The Eps1p structures thus expand the documented diversity of the PDI oxidoreductase family and demonstrate that conserved sequence motifs in common folds do not guarantee structural or functional conservation.
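The sequence-level part of this argument, locating Cys-X-X-Cys (or Cys-X-X-Ser) motifs in a primary sequence, can be sketched with a regular expression. The function name `find_cxxc` and the example sequence in the usage note are made up for illustration. As the abstract stresses, a sequence hit of this kind says nothing about whether the motif is solvent-exposed or functional in the folded protein.

```python
import re

# Lookahead so that overlapping motifs are all reported.
# C..[CS] = Cys, any two residues, then Cys or Ser.
CXXC_PATTERN = re.compile(r"(?=(C..[CS]))")


def find_cxxc(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for each Cys-X-X-Cys/Ser hit."""
    return [(m.start() + 1, m.group(1)) for m in CXXC_PATTERN.finditer(seq.upper())]
```

For the invented sequence `"MACGHCKLCPYS"`, the function reports hits starting at positions 3, 6, and 9.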
We investigated associations between HLA class I and class II alleles and haplotypes, and KIR loci and their HLA class I ligands, with multiple sclerosis (MS) in 412 European American MS patients and 419 ethnically matched controls, using next-generation sequencing. The DRB1*15:01~DQB1*06:02 haplotype was highly predisposing (odds ratio (OR) = 3.98; 95% confidence interval (CI) = 3-5.31; p-value (p) = 2.22E-16), as was DRB1*03:01~DQB1*02:01 (OR = 1.63; CI = 1.19-2.24; p = 1.41E-03). Hardy-Weinberg (HW) analysis in MS patients revealed a significant DRB1*03:01~DQB1*02:01 homozygote excess (15 observed; 8.6 expected; p = 0.016). The OR for this genotype (5.27; CI = 1.47-28.52; p = 0.0036) suggests a recessive MS risk model. Controls displayed no HW deviations. The C*03:04~B*40:01 haplotype (OR = 0.27; CI = 0.14-0.51; p = 6.76E-06) was highly protective for MS, especially in haplotypes with A*02:01 (OR = 0.15; CI = 0.04-0.45; p = 6.51E-05). By itself, A*02:01 is moderately protective (OR = 0.69; CI = 0.54-0.87; p = 1.46E-03), and haplotypes of A*02:01 with the HLA-B Thr80 Bw4 variant (Bw4T) more so (OR = 0.53; CI = 0.35-0.78; p = 7.55E-04). Protective associations with the Bw4 KIR ligand resulted from linkage disequilibrium (LD) with DRB1*15:01, but the Bw4T variant was protective (OR = 0.64; CI = 0.49-0.82; p = 3.37E-04) independent of LD with DRB1*15:01. The Bw4I variant was not associated with MS. Overall, we find specific class I HLA polymorphisms to be protective for MS, independent of the strong predisposition conferred by DRB1*15:01.
Immunoglobulins are highly diverse protein sequences that are processed and presented to T-cells by B-cells and other antigen presenting cells. We examined a large dataset of immunoglobulin heavy chain variable regions (IGHV) to assess the diversity of T-cell exposed motifs (TCEMs). TCEM comprise those amino acids in a MHC-bound peptide, which face outwards, surrounded by the MHC histotope, and which engage the T-cell receptor. Within IGHV there is a distinct pattern of predicted MHC class II binding and a very high frequency of re-use of the TCEMs. The re-use frequency indicates that only a limited number of different cognate T-cells are required to engage many different clonal B-cells. The amino acids in each outward-facing TCEM are intercalated with the amino acids of inward-facing MHC groove-exposed motifs (GEM). Different GEM may have differing, allele-specific, MHC binding affinities. The intercalation of TCEM and GEM in a peptide allows for a vast combinatorial repertoire of epitopes, each eliciting a different response. Outcome of T-cell receptor binding is determined by overall signal strength, which is a function of the number of responding T-cells and the duration of engagement. Hence, the frequency of TCEM re-use appears to be an important determinant of whether a T-cell response is stimulatory or suppressive. The frequency distribution of TCEMs implies that somatic hypermutation is followed by T-cell clonal expansion that develops along repeated pathways. The observations of TCEM and GEM derived from immunoglobulins suggest a relatively simple, yet powerful, mechanism to correlate T-cell polyspecificity, through re-use of TCEMs, with a very high degree of specificity achieved by combination with a diversity of GEMs. The frequency profile of TCEMs also points to an economical mechanism for maintaining T-cell memory, recall, and self-discrimination based on an endogenously generated profile of motifs.
Equine rotavirus (ERV) strain L338 (G13P[18]) has a unique G and P genotype. However, the evolutionary relationship of L338 with other ERVs is still unknown. Here whole genome analysis of the L338 ERV strain was independently performed. Its genotype constellations were determined as G13-P[18]-I6-R9-C9-M6-A6-N9-T12-E14-H11, confirming previous genotype assignments. The L338 strain only shared the P[18] and I6 genotypes with other ERVs. The nucleotide sequences of the other 9 RNA segments were different from those of cognate genes of all other group A rotavirus (RVA) strains including ERVs and formed unique phylogenetic lineages. The L338 evolutionary footprints were tentatively identified in both VP7 and VP4 amino acid sequences: two regions were found in VP7 and twelve in VP4. The conserved regions shared between L338 and other group A rotavirus strains (RVAs) indicated that L338 was more closely related genomically to animal and human RVAs other than ERVs, suggesting that L338 may not be an endogenous equine RV but may have emerged as an interspecies reassortant with other RVA strains. Furthermore, genotype-specific motifs of all 27 G and 37 P types were identified in regions 7-1a (aa 91-100) of VP7 and regions 8-1 (aa 146-151) and 8-3 (aa 113-118 and 125-135) of VP4 (VP8*).
Amyloids are linked to many debilitating diseases in mammals. Some organisms produce amyloids that have a functional role in the maintenance of their biological processes. Microbes utilize functional bacterial amyloids (FuBA) for pathogenicity and infections. Amyloid biogenesis is regulated differentially in various systems to avoid its toxic accumulation. A familiar feature in the process of amyloid biogenesis from humans to microbes is its regulation by protein-protein interactions (PPI). The spatial arrangement of amino acid residues in proteins generates topologies such as flat interfaces and linear motifs, which participate in protein interactions. Motifs and interface residue-mediated interactions have a direct or an indirect impact on amyloid secretion and assembly. Some motifs undergo post-translational modifications (PTM), which affect the interactions and dynamics of the amyloid biogenesis cascade. Interaction-induced local changes stimulate global conformational transitions in the PPI complex, which indirectly affect amyloid formation. Perturbation of such motifs and interface residues results in amyloid abolishment. Interface residues, motifs and their respective interactive protein partners could serve as potential targets for intervention to inhibit amyloid biogenesis.
Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in cellular processes. Given the growth of high-throughput mass spectrometry-based experiments, there is a strong motivation to annotate the catalytic kinases for in vivo phosphorylation sites. Thus, a variety of computational methods have been developed for performing large-scale prediction of kinase-specific phosphorylation sites. However, most of the proposed methods rely solely on the local amino acid sequences surrounding the phosphorylation sites. An increasing number of three-dimensional structures makes it possible to physically investigate the structural environment of phosphorylation sites.
G-protein coupled receptors (GPCRs) form the largest superfamily of membrane proteins and are biologically important and functionally diverse. GPCRs retain a characteristic membrane topology of seven alpha helices with three intracellular loops, three extracellular loops, and flanking N- and C-terminal residues. Subtle differences do exist among these sequences (clusters) in the helix boundaries (TM domains) and loop lengths. They also differ, at the intra-genomic and inter-genomic level, in sequence features such as conserved motifs, substituted amino acid patterns, and their physicochemical properties. In the current study, we employ prediction of helix boundaries and scores derived from amino acid substitution exchange matrices to identify the conserved amino acid residues (motifs) as consensus in an aligned set of homologous GPCR sequences. Co-clustered GPCRs from human and other genomes, organized as 32 clusters, were employed to study the amino acid conservation patterns and species-specific or cluster-specific motifs. Critical analysis of sequence composition and properties provides clues to connect functional relevance within and across genomes for vast practical applications such as the design of mutations and the understanding of disease-causing genetic abnormalities.
Antibodies against posttranslationally modified proteins are a hallmark of rheumatoid arthritis (RA), but the emergence and pathogenicity of these autoantibodies are still incompletely understood. The aim of this study was to analyze the antigen specificities and mutation patterns of monoclonal antibodies (mAb) derived from RA synovial plasma cells and address the question of antigen cross-reactivity.
Auxin plays a central role in plant growth and development. To maintain auxin homeostasis, biological processes such as biosynthesis, transport, degradation, and reversible conjugation are essential. The Gretchen Hagen 3 (GH3) family genes encode the enzymes that conjugate indole-3-acetic acid (IAA) to various amino acids, which is a key process in the induction of somatic embryogenesis (SE). The GH3 family is one of the principal families of early auxin-response genes, exhibiting IAA-amido synthetase activity to maintain optimal levels of free auxin in the cell. In this study, we carried out a systematic identification of the GH3 gene family in the genome of Coffea canephora, determining a total of 18 CcGH3 genes. Analysis of the genetic structures and phylogenetic relationships of CcGH3 genes with GH3 genes from other plant species revealed that they could be clustered in two major categories with groups 1 and 2 of the GH3 family of Arabidopsis. We analyzed the transcriptome expression profiles of the 18 CcGH3 genes using RNA-Seq analysis-based data and qRT-PCR at different points of somatic embryogenesis induction. Furthermore, the endogenous quantification of free and conjugated IAA suggests that the various members of the CcGH3 genes play a crucial role during the embryogenic process of C. canephora. Three-dimensional modeling of the selected CcGH3 proteins showed that they consist of two domains: an extensive N-terminal domain and a smaller C-terminal domain. All proteins analyzed in the present study shared a unique conserved structural topology. Additionally, we identified conserved regions that could function to bind nucleotides and specific amino acids for the conjugation of IAA during SE in C. canephora. These results provide a better understanding of the C. canephora GH3 gene family for further exploration and possible genetic manipulation.
West Nile virus (WNV) was introduced for the first time in the western hemisphere in 1999 in New York City. In 2002, a phenotype-modifying mutation (Env-V159A) defined the first North American genotype WN02. So far, three genotypes have been described in North America, but little is known about WNV evolution in Canada. We report the phylogenetic characterization of twenty-six WNV genomes isolated from mosquitoes in the province of Quebec. WNV strains found in Quebec are phylogenetically related to American strains collected in northern and southern regions. We also noted the presence of two robust monophyletic groups of isolates characterized by distinct conserved amino acid motifs. These emerging genotypes were detected for several years in different ecosystems. These results highlight the need to maintain nationwide surveillance to follow the dispersion of emergent WNV genotypes.
Protein succinylation is a biochemical reaction in which a succinyl group (-CO-CH2-CH2-CO-) is attached to the lysine residue of a protein molecule. Lysine succinylation plays important regulatory roles in living cells. However, studies in this field are limited by the difficulty in experimentally identifying the substrate site specificity of lysine succinylation. To facilitate this process, several tools have been proposed for the computational identification of succinylated lysine sites. In this study, we developed an approach to investigate the substrate specificity of lysine succinylated sites based on amino acid composition. Using experimentally verified lysine succinylated sites collected from public resources, the significant differences in position-specific amino acid composition between succinylated and non-succinylated sites were represented using the Two Sample Logo program. These findings enabled the adoption of an effective machine learning method, support vector machine, to train a predictive model with not only the amino acid composition, but also the composition of k-spaced amino acid pairs. After the selection of the best model using a ten-fold cross-validation approach, the selected model significantly outperformed existing tools based on an independent dataset manually extracted from published research articles. Finally, the selected model was used to develop a web-based tool, SuccSite, to aid the study of protein succinylation. Two proteins were used as case studies on the website to demonstrate the effective prediction of succinylation sites. We will regularly update SuccSite by integrating more experimental datasets. SuccSite is freely accessible at http://csb.cse.yzu.edu.tw/SuccSite/.
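The "composition of k-spaced amino acid pairs" feature mentioned in this abstract can be sketched as follows. This is an illustrative reading of the general technique, not the SuccSite code: the function name `k_spaced_pair_composition`, the normalization by the number of pairs, and the example window are assumptions.

```python
from collections import Counter


def k_spaced_pair_composition(window: str, k: int) -> dict[str, float]:
    """Fraction of residue pairs (x, y) separated by exactly k residues in the window."""
    # A pair starting at index i ends at index i + k + 1, so this many pairs fit.
    n_pairs = len(window) - k - 1
    if n_pairs <= 0:
        return {}
    counts = Counter(window[i] + window[i + k + 1] for i in range(n_pairs))
    return {pair: count / n_pairs for pair, count in counts.items()}
```

With the invented window `"AKAKA"` and `k = 1`, the pairs are AA, KK, and AA, giving compositions of 2/3 for `"AA"` and 1/3 for `"KK"`. In a classifier, vectors of this kind computed for several values of `k` would typically be concatenated with the plain amino acid composition of the window around each candidate lysine.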