This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

Page 1, showing papers 1 to 20 of 235,285.

Integrative visual analysis of protein sequence mutations.

  • Nadezhda T Doncheva et al.
  • BMC proceedings
  • 2014

An important aspect of studying the relationship between protein sequence, structure and function is the molecular characterization of the effect of protein mutations. To understand the functional impact of amino acid changes, the multiple biological properties of protein residues have to be considered together.


MESSA: MEta-Server for protein Sequence Analysis.

  • Qian Cong et al.
  • BMC biology
  • 2012

Computational sequence analysis, that is, prediction of local sequence properties, homologs, spatial structure and function from the sequence of a protein, offers an efficient way to obtain needed information about proteins under study. Since reliable prediction is usually based on the consensus of many computer programs, meta-servers have been developed to fit such needs. Most meta-servers focus on one aspect of sequence analysis, while others incorporate more information, such as PredictProtein for local sequence feature predictions, SMART for domain architecture and sequence motif annotation, and GeneSilico for secondary and spatial structure prediction. However, as predictions of local sequence properties, three-dimensional structure and function are usually intertwined, it is beneficial to address them together.


Analysis of protein sequence/structure similarity relationships.

  • Hin Hark Gan et al.
  • Biophysical journal
  • 2002

Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S(?)) and expected/unexpected dissimilarity (D and D(?)) relationships. In the expected similarity region S, we show that trends in the sequence/structure relation can be derived based on the requirement of protein stability and the energetics of sequence and structural changes. Specifically, we derive a formula relating sequence and structural deviations to a parameter characterizing protein stiffness; the formula fits the data reasonably well. We suggest that the absence of data in region S(?) (high structural but low sequence similarity) is due to unfavorable energetics. In contrast to region S, region D(?) (high sequence but low structural similarity) is well-represented by proteins that can accommodate large structural changes. Our analyses indicate that there are several categories of similarity relationships and that protein energetics provide a basis for understanding these relationships.


AlignmentViewer: Sequence Analysis of Large Protein Families.

  • Roc Reguant et al.
  • F1000Research
  • 2020

AlignmentViewer is a web-based tool to view and analyze multiple sequence alignments of protein families. The particular strengths of AlignmentViewer include flexible visualization at different scales as well as analysis of conservation patterns and of the distribution of proteins in sequence space. The tool is directly accessible in web browsers without the need for software installation. It can handle protein families with tens of thousands of sequences and is particularly suitable for evolutionary coupling analysis, e.g. via EVcouplings.org.


Coevolutionary Analysis of Protein Subfamilies by Sequence Reweighting.

  • Duccio Malinverni et al.
  • Entropy (Basel, Switzerland)
  • 2020

Extracting structural information from sequence co-variation has become a common computational biology practice in recent years, mainly due to the availability of large sequence alignments of protein families. However, identifying features that are specific to sub-classes and not shared by all members of the family using sequence-based approaches has remained an elusive problem. We here present a coevolutionary-based method to differentially analyze subfamily-specific structural features by a continuous sequence reweighting (SR) approach. We introduce the underlying principles and test its predictive capabilities on the Response Regulator family, whose subfamilies have been previously shown to display distinct, specific homo-dimerization patterns. Our results show that this reweighting scheme is effective in assigning structural features known a priori to subfamilies, even when sequence data is relatively scarce. Furthermore, sequence reweighting allows assessing whether individual structural contacts pertain to specific subfamilies, and it thus paves the way for the identification of specificity-determining contacts from sequence variation data.


Analysis of the Sequence Characteristics of Antifreeze Protein.

  • Yu-Hang Zhang et al.
  • Life (Basel, Switzerland)
  • 2021

Antifreeze protein (AFP) is a proteinaceous compound with improved antifreeze ability and binding ability to ice to prevent its growth. As a surface-active material, a small number of AFPs have a tremendous influence on the growth of ice. Therefore, identifying novel AFPs is important to understand protein-ice interactions and create novel ice-binding domains. To date, predicting AFPs is difficult due to their low sequence similarity for the ice-binding domain and the lack of common features among different AFPs. Here, a computational engine was developed to predict the features of AFPs and reveal the most important 39 features for AFP identification, such as antifreeze-like/N-acetylneuraminic acid synthase C-terminal, insect AFP motif, C-type lectin-like, and EGF-like domain. With this newly presented computational method, a group of previously confirmed functional AFP motifs was screened out. This study has identified some potential new AFP motifs and contributes to understanding biological antifreeze mechanisms.


Sequence and structural analysis of binding site residues in protein-protein complexes.

  • M Michael Gromiha et al.
  • International journal of biological macromolecules
  • 2010

The binding sites in protein-protein complexes have been identified with different methods, including atomic contacts, reduction in solvent accessibility and interaction energy between the interacting partners. In our earlier work, we developed an energy-based criterion for identifying the binding sites in protein-protein complexes, which showed that the interacting residues are different from those obtained with distance-based methods. In this work, we analyzed the binding site residues based on sequence and structural properties, such as neighboring residues, secondary structure, solvent accessibility, conservation of residues, medium and long-range contacts and surrounding hydrophobicity. Our results showed that the neighboring residues of binding sites in proteins and ligands are different from each other although the interacting pairs of residues have a common behavior. The analysis of surrounding hydrophobicity reveals that the binding residues are less hydrophobic than non-binding sites, which suggests that the hydrophobic core is important for folding and stability whereas the surface-seeking residues play a critical role in binding. This tendency has been verified with the number of contacts in binding sites. In addition, the binding site residues are highly conserved compared with non-binding residues. We suggest that the incorporation of sequence and structure-based features may improve the prediction accuracy of binding sites in protein-protein complexes.


The Phylogeny of Osteopontin-Analysis of the Protein Sequence.

  • Georg F Weber
  • International journal of molecular sciences
  • 2018

Osteopontin (OPN) is important for tissue remodeling, cellular immune responses, and calcium homeostasis in milk and urine. In pathophysiology, the biomolecule contributes to the progression of multiple cancers. Phylogenetic analysis of 202 osteopontin protein sequences identifies a core block of integrin-binding sites in the center of the protein, which is well conserved. Remarkably, the length of this block varies among species, resulting in differing distances between motifs within. The amino acid sequence SSEE is a candidate phosphorylation site. Two copies of it reside in the far N-terminus and are variably affected by alternative splicing in humans. Between those motifs, birds and reptiles have a histidine-rich domain, which is absent from other species. Just downstream from the thrombin cleavage site, the common motif (Q/I)(Y/S/V)(P/H/Y)D(A/V)(T/S)EED(L/E)(-/S)T has been hitherto unrecognized. While well preserved, it is yet without assigned function. The far C-terminus, although very different between Reptilia/Aves on the one hand and Mammals on the other, is highly conserved within each group of species, suggesting important functional roles that remain to be mapped. Taxonomic variations in the osteopontin sequence include a lack of about 20 amino acids in the downstream portion, a small unique sequence stretch C-terminally, a lack of six amino acids just upstream of the RGD motifs, and variable length insertions far C-terminally.


Proteogenomic Analysis of Protein Sequence Alterations in Breast Cancer Cells.

  • Iulia M Lazar et al.
  • Scientific reports
  • 2019

Cancer evolves as a result of an accumulation of mutations and chromosomal aberrations. Developments in sequencing technologies have enabled the discovery and cataloguing of millions of such mutations. The identification of protein-level alterations, typically by using reversed-phase protein arrays or mass spectrometry, has lagged, however, behind gene and transcript-level observations. In this study, we report the use of mass spectrometry for detecting the presence of mutations (missense, indels and frame shifts) in MCF7 and SKBR3 breast cancer, and non-tumorigenic MCF10A cells. The mutations were identified by expanding the database search process of raw mass spectrometry files by including an in-house built database of mutated peptides (XMAn-v1) that complemented a minimally redundant, canonical database of Homo sapiens proteins. The work resulted in the identification of nearly 300 mutated peptide sequences, of which ~50 were characterized by quality tandem mass spectra. We describe the criteria that were used to select the mutated peptide sequences, evaluate the parameters that characterized these peptides, and assess the artifacts that could have led to false peptide identifications. Further, we discuss the functional domains and biological processes that may be impacted by the observed peptide alterations, and how protein-level detection can support the efforts of identifying cancer driving mutations and genes. Mass spectrometry data are available via ProteomeXchange with identifier PXD014458.


Sequence analysis of the gliding protein Gli349 in Mycoplasma mobile.

  • Shoichi Metsugi et al.
  • Biophysics (Nagoya-shi, Japan)
  • 2005

The motile mechanism of Mycoplasma mobile remains unknown but is believed to differ from any previously identified mechanism in bacteria. Gli349 of M. mobile is known to be responsible for both adhesion to glass surfaces and mobility. We therefore carried out sequence analyses of Gli349 and its homolog MYPU2110 from M. pulmonis to decipher their structures. We found that the motif "YxxxxxGF" appears 11 times in Gli349 and 16 times in MYPU2110. Further analysis of the sequences revealed that Gli349 contains 18 repeats of about 100 amino acid residues each, and MYPU2110 contains 22. No sequence homologous to any of the repeats was found in the NCBI RefSeq non-redundant sequence database, and no compatible fold structure was found among known protein structures, suggesting that the repeat found in Gli349 and MYPU2110 is novel and takes a new fold structure. Proteolysis of Gli349 using chymotrypsin revealed that cleavage positions were often located between the repeats, implying that regions connecting repeats are unstructured, flexible and exposed to the solvent. Assuming that each repeat folds into a structural domain, we constructed a model of Gli349 that fits well the shape and size of images obtained with electron microscopy.


Sequence analysis and protein interactions of Arabidopsis CIA2 and CIL proteins.

  • Chun-Yen Yang et al.
  • Botanical studies
  • 2020

A previous screening of Arabidopsis thaliana for mutants exhibiting dysfunctional chloroplast protein transport identified the chloroplast import apparatus (cia) gene. The cia2 mutant has a pale green phenotype and reduced rate of protein import into chloroplasts, but leaf shape and size are similar to wild-type plants of the same developmental stage. Microarray analysis showed that nuclear CIA2 protein enhances expression of the Toc75, Toc33, CPN10 and cpRPs genes, thereby up-regulating protein import and synthesis efficiency in chloroplasts. CIA2-like (CIL) shares 65% sequence identity to CIA2, suggesting that CIL and CIA2 are homologous proteins in Arabidopsis. Here, we further assess the protein interactions and sequence features of CIA2 and CIL.


Comparative genomics of elastin: Sequence analysis of a highly repetitive protein.

  • David He et al.
  • Matrix biology : journal of the International Society for Matrix Biology
  • 2007

Due to the low complexity associated with their sequences, uncovering the evolutionary and functional relationships in highly repetitive proteins such as elastin, spider silks, resilin and abductin represents a significant challenge. Using the polymeric extracellular protein elastin as a model system, we present a novel computational approach to the study of sequence, function and evolutionary relationships in repetitive proteins. To address the absence of accurate sequence annotation for repetitive proteins such as elastin, we have constructed a new database repository, ElastoDB (http://theileria.ccb.sickkids.ca/elastin), dedicated to the storage and retrieval of elastin sequence- and meta-data. To analyse their sequence relationships we have devised an innovative new method, based on the identification of overrepresented 'fuzzy' motifs. Applying this method to elastin sequences derived from mammals, chicken, Xenopus and zebrafish resulted in the identification of both highly conserved and taxon- and species-specific motifs that likely represent important functional and/or structural elements. The relative spacing and organization of these elements suggest that exon duplication events have played an important role in the evolution of elastin. Clustering of similarity profiles generated for sets of exons and introns revealed a pattern of putative duplication events involving exons 15-30 in mammalian and chicken elastins, exons 20-31 in both zebrafish elastins, exons 15-20 in fugu elastin and exons 35-50 in Xenopus elastin 1. The success of this approach for elastin offers a promising route to the elucidation of sequence, structure, function and evolutionary relationships for many other proteins with sequences of low complexity.


A Primary Sequence Analysis of the ARGONAUTE Protein Family in Plants.

  • Daniel Rodríguez-Leal et al.
  • Frontiers in plant science
  • 2016

Small RNA (sRNA)-mediated gene silencing represents a conserved regulatory mechanism controlling a wide diversity of developmental processes through interactions of sRNAs with proteins of the ARGONAUTE (AGO) family. On the basis of a large phylogenetic analysis that includes 206 AGO genes belonging to 23 plant species, AGO genes group into four clades corresponding to the phylogenetic distribution proposed for the ten family members of Arabidopsis thaliana. A primary analysis of the corresponding protein sequences resulted in 50 sequences of amino acids (blocks) conserved across their linear length. Protein members of the AGO4/6/8/9 and AGO1/10 clades are more conserved than members of the AGO5 and AGO2/3/7 clades. In addition to blocks containing components of the PIWI, PAZ, and DUF1785 domains, members of the AGO2/3/7 and AGO4/6/8/9 clades possess other consensus block sequences that are exclusive of members within these clades, suggesting unforeseen functional specialization revealed by their primary sequence. We also show that AGO proteins of animal and plant kingdoms share linear sequences of blocks that include motifs involved in posttranslational modifications such as those regulating AGO2 in humans and the PIWI protein AUBERGINE in Drosophila. Our results open possibilities for exploring new structural and functional aspects related to the evolution of AGO proteins within the plant kingdom, and their convergence with analogous proteins in mammals and invertebrates.


Sequence-function analysis of the Sendai virus L protein domain VI.

  • Andrea M Murphy et al.
  • Virology
  • 2010

The large (about 2200 amino acids) L polymerase protein of nonsegmented negative-strand RNA viruses (order Mononegavirales) has six conserved sequence regions ("domains") postulated to constitute the specific enzymatic activities involved in viral mRNA synthesis, 5'-end capping, cap methylation, 3' polyadenylation, and genomic RNA replication. Previous studies with vesicular stomatitis virus identified amino acid residues within the L protein domain VI required for mRNA cap methylation. In our recent study we analyzed four amino acid residues within domain VI of the Sendai virus L protein and our data indicated that there could be differences in L protein sequence requirements for cap methylation in two different families of Mononegavirales - rhabdoviruses and paramyxoviruses. In this study, we conducted a more comprehensive mutational analysis by targeting the entire SeV L protein domain VI, creating twenty-four L mutants, and testing these mutations for their effects on viral mRNA synthesis, cap methylation, viral genome replication and virus growth kinetics. Our analysis identified several residues required for successful cap methylation and virus replication and clearly showed the importance of the K-D-K-E tetrad and glycine-rich motif in the SeV cap methylation. This study is the first extensive sequence analysis of the L protein domain VI in the family Paramyxoviridae, and it confirms structural and functional similarity of this domain across different families of the order Mononegavirales.


Computational curation and analysis of publicly available protein sequence data from a single protein family.

  • Kyra Dougherty et al.
  • MethodsX
  • 2022

The wealth of sequence data available on public databases is increasing at an exponential rate, and while tremendous efforts are being made to make access to these resources easier, these data can be challenging for researchers to reuse because submissions are made from numerous laboratories with different biological objectives, resulting in inconsistent naming conventions and sequence content. Researchers can manually inspect each sequence and curate a dataset by hand but automating some of these steps will reduce this burden. This paper is a step-by-step guide describing how to identify all proteins containing a specific domain with the Conserved Protein Domain Architecture Retrieval Tool, download all associated amino acid sequences from NCBI Entrez, tabulate, and clean the data. I will also describe how to extract the full taxonomic information and computationally predict some physicochemical properties of the proteins based on amino acid sequence. The resulting data are applicable to a wide range of bioinformatic analyses where publicly available data are utilized.

  • Step-by-step guide to gathering, cleaning, and parsing data from publicly available databases for computational analysis, plus supplementation of taxonomic data and physicochemical characteristics from sequence data.
  • This strategy allows for reuse of existing large-scale publicly available data for different downstream applications to answer novel biological questions.


Analysis of protein missense alterations by combining sequence- and structure-based methods.

  • Aram Gyulkhandanyan et al.
  • Molecular genetics & genomic medicine
  • 2020

Different types of in silico approaches can be used to predict the phenotypic consequence of missense variants. Such algorithms are often categorized as sequence based or structure based, when they necessitate 3D structural information. In addition, many other in silico tools, not dedicated to the analysis of variants, can be used to gain additional insights about the possible mechanisms at play.


Sequence and structural analysis of 4SNc-Tudor domain protein from Takifugu Rubripes.

  • Jianzhou Zheng et al.
  • Bioinformation
  • 2009

The fugu SN4TDR protein belongs to an evolutionarily conserved family, consisting of four repeat staphylococcal nuclease-like domains (SN1-SN4) at the N-terminus followed by Tudor and SN-like domains (TSN). Sequence analysis showed that the C-terminal TSN domain is composed of a complete SN-like domain interdigitated with a Tudor domain. Despite the low level of sequence identity, the five SN-like domains have a few conserved amino acids that may play essential roles in the function of the protein. Computer modeling and secondary structural prediction of the SN-like domains revealed the presence of similar structural features of beta1-beta2-beta3-alpha1-beta4-beta5-alpha2-alpha3, which provides a structural basis for oligonucleotide binding. The loop region L(3alpha) for binding sites between beta3 and alpha1 of the SN-like domains is different from that of human p100, implying divergence in the structures of the binding sites. These results indicate that fugu SN4TDR may bind methylated ligands and/or oligonucleotides through its distant domains.


Analysis of protein sequence and interaction data for candidate disease gene prediction.

  • Richard A George et al.
  • Nucleic acids research
  • 2006

Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which makes the experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein-protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.


Sequence and Structure-Based Analysis of Specificity Determinants in Eukaryotic Protein Kinases.

  • David Bradley et al.
  • Cell reports
  • 2021

Protein kinases lie at the heart of cell-signaling processes and are often mutated in disease. Kinase target recognition at the active site is in part determined by a few amino acids around the phosphoacceptor residue. However, relatively little is known about how most preferences are encoded in the kinase sequence or how these preferences evolved. Here, we used alignment-based approaches to predict 30 specificity-determining residues (SDRs) for 16 preferences. These were studied with structural models and were validated by activity assays of mutant kinases. Cancer mutation data revealed that kinase SDRs are mutated more frequently than catalytic residues. We have observed that, throughout evolution, kinase specificity has been strongly conserved across orthologs but can diverge after gene duplication, as illustrated by the G protein-coupled receptor kinase family. The identified SDRs can be used to predict kinase specificity from sequence and aid in the interpretation of evolutionary or disease-related genomic variants.


Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms.

  • Suganthi Balasubramanian et al.
  • Nucleic acids research
  • 2005

We assessed the disease-causing potential of single nucleotide polymorphisms (SNPs) based on a simple set of sequence-based features. We focused on SNPs from the dbSNP database in G-protein-coupled receptors (GPCRs), a large class of important transmembrane (TM) proteins. Apart from the location of the SNP in the protein, we evaluated the predictive power of three major classes of features to differentiate between disease-causing mutations and neutral changes: (i) properties derived from amino-acid scales, such as volume and hydrophobicity; (ii) position-specific phylogenetic features reflecting evolutionary conservation, such as normalized site entropy, residue frequency and SIFT score; and (iii) substitution-matrix scores, such as those derived from the BLOSUM62, GRANTHAM and PHAT matrices. We validated our approach using a control dataset consisting of known disease-causing mutations and neutral variations. Logistic regression analyses indicated that position-specific phylogenetic features that describe the conservation of an amino acid at a specific site are the best discriminators of disease mutations versus neutral variations, and integration of all our features improves discrimination power. Overall, we identify 115 SNPs in GPCRs from dbSNP that are likely to be associated with disease and thus are good candidates for genotyping in association studies.

