This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

Page 1: showing papers 1 to 20 of 67,564.

ECOD: an evolutionary classification of protein domains.

  • Hua Cheng et al.
  • PLoS computational biology
  • 2014

Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.


DCMP: database of cancer mutant protein domains.

  • Isaac Arnold Emerson et al.
  • Database : the journal of biological databases and curation
  • 2021

Protein domains are functional and structural units of proteins. They are responsible for a particular function that contributes to the protein's overall role. Because of this essential role, the majority of the genetic variants occur in the domains. In this study, the somatic mutations across 21 cancer types were mapped to the individual protein domains. To map the mutations to the domains, we employed the whole human proteome to predict the domains in each protein sequence and recognized about 149 668 domains. A novel Perl-API program was developed to convert the protein domain positions into genomic positions, and users can freely access them through GitHub. We determined the distribution of protein domains across 23 chromosomes with the help of these genomic positions. Interestingly, chromosome 19 has more protein domains than the other chromosomes. Then, we mapped the cancer mutations to all the protein domains. Around 46-65% of mutations were mapped to their corresponding protein domains, and significantly mutated domains for all the cancer types were determined using the local false discovery rate (locfdr). The chromosome positions for all the protein domains can be verified using the cross-reference Ensembl database. Database URL: https://dcmp.vit.ac.in/.


The Enigmatic Origin of Papillomavirus Protein Domains.

  • Mikk Puustusmaa et al.
  • Viruses
  • 2017

Almost a century has passed since the discovery of papillomaviruses. A few decades of research have given a wealth of information on the molecular biology of papillomaviruses. Several excellent studies have been performed looking at the long- and short-term evolution of these viruses. However, when and how papillomaviruses originated is still a mystery. In this study, we systematically searched the (sequenced) biosphere to find distant homologs of papillomaviral protein domains. Our data show that, even including structural information, which allows us to find deeper evolutionary relationships compared to sequence-only based methods, only half of the protein domains in papillomaviruses have relatives in the rest of the biosphere. We show that the major capsid protein L1 and the replication protein E1 have relatives in several viral families, sharing three protein domains with Polyomaviridae and Parvoviridae. However, only the E1 replication protein has connections with cellular organisms. Most likely, the papillomavirus ancestor is of marine origin, a biotope that is not very well sequenced at the present time. Nevertheless, there is no evidence as to how papillomaviruses originated and how they became vertebrate and epithelium specific.


LenVarDB: database of length-variant protein domains.

  • Eshita Mutt et al.
  • Nucleic acids research
  • 2014

Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of length variations observed among 731 protein families after examining >2 million sequences. Indels are followed up to observe whether they are close enough to the active site to affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.


Synthetic protein-protein interaction domains created by shuffling Cys2His2 zinc-fingers.

  • Astrid V Giesecke et al.
  • Molecular systems biology
  • 2006

Cys2His2 zinc-fingers (C2H2 ZFs) mediate a wide variety of protein-DNA and protein-protein interactions. DNA-binding C2H2 ZFs can be shuffled to yield artificial proteins with different DNA binding specificities. Here we demonstrate that shuffling of C2H2 ZFs from transcription factor dimerization zinc-finger (DZF) domains can also yield two-finger DZFs with novel protein-protein interaction specificities. We show that these synthetic protein-protein interaction domains can be used to mediate activation of a single-copy reporter gene in bacterial cells and of an endogenous gene in human cells. In addition, the synthetic two-finger domains we constructed can also be linked together to create more extended, four-finger interfaces. Our results demonstrate that shuffling of C2H2 ZFs can yield artificial protein-interaction components that should be useful for applications in synthetic biology.


BACPHLIP: predicting bacteriophage lifestyle from conserved protein domains.

  • Adam J Hockenberry et al.
  • PeerJ
  • 2021

Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of the previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the classifier is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model.


Functional domains of the Drosophila bicaudal-D protein.

  • J Oh et al.
  • Genetics
  • 2000

The localization of oocyte-specific determinants in the form of mRNAs to the pro-oocyte is essential for the establishment of oocyte identity. Localization of the Bicaudal-D (Bic-D) protein to the presumptive oocyte is required for the accumulation of Bic-D and other mRNAs to the pro-oocyte. The Bic-D protein contains four well-defined heptad repeat domains characteristic of intermediate filament proteins, and several of the mutations in Bic-D map to these conserved domains. We have undertaken a structure-function analysis of Bic-D by testing the function of mutant Bic-D transgenes (Bic-D(H)) deleted for each of the heptad repeat domains in a Bic-D null background. Our transgenic studies indicate that only the C-terminal heptad repeat deletion results in a protein that has lost zygotic and ovarian functions. The three other deletions result in proteins with full zygotic function, but with affected ovarian function. The functional importance of each domain is well correlated with its conservation in evolution. The analysis of females heterozygous for Bic-D(H) and the existing alleles Bic-D(PA66) or Bic-D(R26) reveals that Bic-D(R26) as well as some of Bic-D(H) transgenes have antimorphic effects. The yeast two-hybrid interaction assay shows that Bic-D forms homodimers. Furthermore, we found that Bic-D exists as a multimeric protein complex consisting of Egl and at least two Bic-D monomers.


Protein diffusion from microwells with contrasting hydrogel domains.

  • Elaine J Su et al.
  • APL bioengineering
  • 2019

Understanding and controlling molecular transport in hydrogel materials is important for biomedical tools, including engineered tissues and drug delivery, as well as life sciences tools for single-cell analysis. Here, we scrutinize the ability of microwells, micromolded in hydrogel slabs, to compartmentalize lysate from single cells. We consider both (i) microwells that are "open" to a large fluid (i.e., liquid) reservoir and (ii) microwells that are "closed," having been capped with either a slab of high-density polyacrylamide gel or an impermeable glass slide. We use numerical modeling to gain insight into the sensitivity of time-dependent protein concentration distributions on hydrogel partition and protein diffusion coefficients and open and closed microwell configurations. We are primarily concerned with diffusion-driven protein loss from the microwell cavity. Even for closed microwells, confocal fluorescence microscopy reports that a fluid (i.e., liquid) film forms between the hydrogel slabs (median thickness of 1.7 μm). Proteins diffuse from the microwells and into the fluid (i.e., liquid) layer, yet concentration distributions are sensitive to the lid layer partition coefficients and the protein diffusion coefficient. The application of a glass lid or a dense hydrogel retains protein in the microwell, increasing the protein solute concentration in the microwell by ∼7-fold for the first 15 s. Using triggered release of Protein G from microparticles, we validate our simulations by characterizing protein diffusion in a microwell capped with a high-density polyacrylamide gel lid (p > 0.05, Kolmogorov-Smirnov test). Here, we establish and validate a numerical model useful for understanding protein transport in and losses from a hydrogel microwell across a range of boundary conditions.


PathFams: statistical detection of pathogen-associated protein domains.

  • Briallen Lobb et al.
  • BMC genomics
  • 2021

A substantial fraction of genes identified within bacterial genomes encode proteins of unknown function. Identifying which of these proteins represent potential virulence factors, and mapping their key virulence determinants, is a challenging but important goal.


Genomic scale sub-family assignment of protein domains.

  • Julian Gough
  • Nucleic acids research
  • 2006

Many classification schemes for proteins and domains are either hierarchical or semi-hierarchical yet most databases, especially those offering genome-wide analysis, only provide assignments to sequences at one level of their hierarchy. Given an established hierarchy, the problem of assigning new sequences to lower levels of that existing hierarchy is less hard (but no less important) than the initial top level assignment which requires the detection of the most distant relationships. A solution to this problem is described here in the form of a new procedure which can be thought of as a hybrid between pairwise and profile methods. The hybrid method is a general procedure that can be applied to any pre-defined hierarchy, at any level, including in principle multiple sub-levels. It has been tested on the SCOP classification via the SUPERFAMILY database and performs significantly better than either pairwise or profile methods alone. Perhaps the greatest advantage of the hybrid method over other possible approaches to the problem is that within the framework of an existing profile library, the assignments are fully automatic and come at almost no additional computational cost. Hence it has already been applied at the SCOP family level to all genomes in the SUPERFAMILY database, providing a wealth of new data to the biological and bioinformatics communities.


CDD: conserved domains and protein three-dimensional structure.

  • Aron Marchler-Bauer et al.
  • Nucleic acids research
  • 2013

CDD, the Conserved Domain Database, is part of NCBI's Entrez query and retrieval system and is also accessible via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. CDD provides annotation of protein sequences with the location of conserved domain footprints and functional sites inferred from these footprints. Pre-computed annotation is available via Entrez, and interactive search services accept single protein or nucleotide queries, as well as batch submissions of protein query sequences, utilizing RPS-BLAST to rapidly identify putative matches. CDD incorporates several protein domain and full-length protein model collections, and maintains an active curation effort that aims at providing fine grained classifications for major and well-characterized protein domain families, as supported by available protein three-dimensional (3D) structure and the published literature. To this date, the majority of protein 3D structures are represented by models tracked by CDD, and CDD curators are characterizing novel families that emerge from protein structure determination efforts.


Functional domains of the FSHD-associated DUX4 protein.

  • Hiroaki Mitsuhashi et al.
  • Biology open
  • 2018

Aberrant expression of the full-length isoform of DUX4 (DUX4-FL) appears to underlie pathogenesis in facioscapulohumeral muscular dystrophy (FSHD). DUX4-FL is a transcription factor and ectopic expression of DUX4-FL is toxic to most cells. Previous studies showed that DUX4-FL-induced pathology requires intact homeodomains and that transcriptional activation required the C-terminal region. In this study, we further examined the functional domains of DUX4 by generating mutant, deletion, and fusion variants of DUX4. We compared each construct to DUX4-FL for (i) activation of a DUX4 promoter reporter, (ii) expression of the DUX4-FL target gene ZSCAN4, (iii) effect on cell viability, (iv) activation of endogenous caspases, and (v) level of protein ubiquitination. Each construct produced a similarly sized effect (or lack of effect) in each assay. Thus, the ability to activate transcription determined the extent of change in multiple molecular and cellular properties that may be relevant to FSHD pathology. Transcriptional activity was mediated by the C-terminal 80 amino acids of DUX4-FL, with most activity located in the C-terminal 20 amino acids. We also found that non-toxic constructs with both homeodomains intact could act as inhibitors of DUX4-FL transcriptional activation, likely due to competition for promoter sites. This article has an associated First Person interview with the first author of the paper.


Functional dynamics in replication protein A DNA binding and protein recruitment domains.

  • Chris A Brosey et al.
  • Structure (London, England : 1993)
  • 2015

Replication Protein A (RPA) is an essential scaffold for many DNA processing machines; its function relies on its modular architecture. Here, we report ¹⁵N-nuclear magnetic resonance heteronuclear relaxation analysis to characterize the movements of single-stranded (ss) DNA binding and protein interaction modules in the RPA70 subunit. Our results provide direct evidence for coordination of the motion of the tandem RPA70AB ssDNA binding domains. Moreover, binding of ssDNA substrate is found to cause dramatic reorientation and full coupling of inter-domain motion. In contrast, the RPA70N protein interaction domain remains structurally and dynamically independent of RPA70AB regardless of binding of ssDNA. This autonomy of motion between the 70N and 70AB modules supports a model in which the two binding functions of RPA are mediated fully independently, but remain differentially coordinated depending on the length of their flexible tethers. A critical role for linkers between the globular domains in determining the functional dynamics of RPA is proposed.


EVEREST: automatic identification and classification of protein domains in all protein sequences.

  • Elon Portugaly et al.
  • BMC bioinformatics
  • 2006

Proteins are comprised of one or several building blocks, known as domains. Such domains can be classified into families according to their evolutionary origin. Whereas sequencing technologies have advanced immensely in recent years, there are no matching computational methodologies for large-scale determination of protein domains and their boundaries. We provide and rigorously evaluate a novel set of domain families that is automatically generated from sequence data. Our domain family identification process, called EVEREST (EVolutionary Ensembles of REcurrent SegmenTs), begins by constructing a library of protein segments that emerge in an all vs. all pairwise sequence comparison. It then proceeds to cluster these segments into putative domain families. The selection of the best putative families is done using machine learning techniques. A statistical model is then created for each of the chosen families. This procedure is then iterated: the aforementioned statistical models are used to scan all protein sequences, to recreate a library of segments and to cluster them again.


EvoProDom: evolutionary modeling of protein families by assessing translocations of protein domains.

  • Gon Carmi et al.
  • FEBS open bio
  • 2021

Here, we introduce a novel 'evolution of protein domains' (EvoProDom) model for describing the evolution of proteins based on the 'mix and merge' of protein domains. We assembled and integrated genomic and proteomic data comprising protein domain content and orthologous proteins from 109 organisms. In EvoProDom, we characterized evolutionary events, particularly translocations, as reciprocal exchanges of protein domains between orthologous proteins in different organisms. We showed that protein domains that translocate with high frequency are generated by transcripts enriched in trans-splicing events, that is, the generation of novel transcripts from the fusion of two distinct genes. In EvoProDom, we describe a general method to collate orthologous protein annotation from KEGG, and protein domain content from protein sequences using tools such as KofamKOALA and Pfam. To summarize, EvoProDom presents a novel model for protein evolution based on the 'mix and merge' of protein domains rather than DNA-based evolution models. This confers the advantage of considering chromosomal alterations as drivers of protein evolutionary events.


Peptide binding by catalytic domains of the protein disulfide isomerase-related protein ERp46.

  • Andreas Funkner et al.
  • Journal of molecular biology
  • 2013

The protein disulfide isomerase (PDI) family member ERp46/endoPDI/thioredoxin domain-containing protein 5 is preferentially expressed in a limited number of tissues, where it may function as a survival factor for nitrosative stress in vivo. It is involved in insulin production as well as in adiponectin signaling and interacts specifically with the redox-regulatory endoplasmic reticulum proteins endoplasmic oxidoreductin 1α (Ero1α) and peroxiredoxin-4. Here, we show that ERp46, although lacking a PDI-like redox-inactive b'-thioredoxin domain with its hydrophobic substrate binding site, is able to bind to a large pool of peptides containing aromatic and basic residues via all three of its catalytic domains (a(0), a and a'), though the a(0) domain may contain the primary binding site. ERp46, which shows relatively higher activity as a disulfide-reductase than as an oxidase/isomerase in vitro compared to PDI and ERp57, possesses chaperone activity in vivo, a property also shared by the C-terminal a' domain. A crystal structure of the a' domain is also presented, offering a view of possible substrate binding sites within catalytic domains of PDI proteins.


Elucidating the interacting domains of chandipura virus nucleocapsid protein.

  • Kapila Kumar et al.
  • Advances in virology
  • 2013

The nucleocapsid (N) protein of Chandipura virus (CHPV) plays a crucial role in the viral life cycle, besides being an important structural component of the virion through proper organization of its interactions with other viral proteins. In a recent study, the authors had mapped the associations among CHPV proteins and shown that N protein interacts with four of the viral proteins: N, phosphoprotein (P), matrix protein (M), and glycoprotein (G). The present study aimed to distinguish the regions of CHPV N protein responsible for its interactions with other viral proteins. In this direction, we have generated the structure of CHPV N protein by homology modeling using SWISS-MODEL workspace and Accelrys Discovery Studio client 2.55 and mapped the domains of N protein using PiSQRD. The interactions of N protein fragments with other proteins were determined by the ZDOCK rigid-body docking method and validated by yeast two-hybrid and ELISA. The study revealed a unique binding site, comprising amino acids 1-30 at the N terminus of the nucleocapsid protein (N1), that is instrumental in its interactions with N, P, M, and G proteins. It was also observed that N2 associates with N and G proteins while N3 interacts with N, P, and M proteins.


Protein domains and architectural innovation in plant-associated Proteobacteria.

  • David J Studholme et al.
  • BMC genomics
  • 2005

Evolution of new complex biological behaviour tends to arise by novel combinations of existing building blocks. The functional and evolutionary building blocks of the proteome are protein domains, the function of a protein being dependent on its constituent domains. We clustered completely-sequenced proteomes of prokaryotes on the basis of their protein domain content, as defined by Pfam (release 16.0). This revealed that, although there was a correlation between phylogeny and domain content, other factors also have an influence. This observation motivated an investigation of the relationship between an organism's lifestyle and the complement of domains and domain architectures found within its proteome.


Extending Protein Domain Boundary Predictors to Detect Discontinuous Domains.

  • Zhidong Xue et al.
  • PloS one
  • 2015

A variety of protein domain predictors were developed to predict protein domain boundaries in recent years, but most of them cannot predict discontinuous domains. Considering nearly 40% of multidomain proteins contain one or more discontinuous domains, we have developed DomEx to enable domain boundary predictors to detect discontinuous domains by assembling the continuous domain segments. Discontinuous domains are predicted by matching the sequence profile of concatenated continuous domain segments with the profiles from a single-domain library derived from SCOP, CATH, and Pfam. Then the matches are filtered by similarity to library templates, a symmetric index score and a profile-profile alignment score. DomEx recalled 32.3% of discontinuous domains with 86.5% precision when tested on 97 non-homologous protein chains containing 58 continuous and 99 discontinuous domains, in which the predicted domain segments are within ±20 residues of the boundary definitions in CATH 3.5. Compared with our recently developed predictor, ThreaDom, which is the state-of-the-art tool to detect discontinuous domains, DomEx recalled 26.7% of discontinuous domains with 72.7% precision in a benchmark with 29 discontinuous-domain chains, where ThreaDom failed to predict any discontinuous domains. Furthermore, combined with ThreaDom, the method ranked number one among 10 predictors. The source code and datasets are available at https://github.com/xuezhidong/DomEx.


Computational modelling of chromosomally clustering protein domains in bacteria.

  • Chiara E Cotroneo et al.
  • BMC bioinformatics
  • 2021

In bacteria, genes with related functions (such as those involved in the metabolism of the same compound or in infection processes) are often physically close on the genome and form groups called clusters. The enrichment of such clusters over various distantly related bacteria can be used to predict the roles of genes of unknown function that cluster with characterised genes. There is no obvious rule to define a cluster, given their variability in size and intergenic distances, and the definition of what comprises a "gene", since genes can gain and lose domains over time. Protein domains can cluster within a gene, or in adjacent genes of related function, and in both cases these are chromosomally clustered. Here, we model the distances between pairs of protein domain coding regions across a wide range of bacteria and archaea via a probabilistic two component mixture model, without imposing arbitrary thresholds in terms of gene numbers or distances.

