An analysis of disulphide bond containing proteins in the Protein Data Bank (PDB) revealed that, out of 27,209 protein structures analyzed, 12,832 proteins contain at least one intra-chain disulphide bond and 811 proteins contain at least one inter-chain disulphide bond. The intra-chain disulphide bond containing proteins can be grouped into 256 categories based on the number of disulphide bonds and the disulphide bond connectivity patterns (DBCPs) that were generated according to the position of half-cystine residues along the protein chain. The PDB entries corresponding to these 256 categories represent 509 unique SCOP superfamilies. A simple web-based computational tool is made freely available at the website that allows flexible queries to be made on the database in order to retrieve useful information on the disulphide bond containing proteins in the PDB. The database is useful to identify the different SCOP superfamilies associated with a particular disulphide bond connectivity pattern or vice versa. It is possible to define a query based either on a single field or on a combination of the following fields: PDB code, protein name, SCOP superfamily name, number of disulphide bonds, disulphide bond connectivity pattern and the number of amino acid residues in a protein chain, and to retrieve information that matches the criteria. The database may therefore be useful to select suitable protein structural templates in order to model the more distantly related protein homologs/analogs using comparative modeling methods.
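As a rough illustration of what a disulphide bond connectivity pattern is, the sketch below numbers the half-cystines by their order along the chain and writes each bond as a pair of those ranks. The function name `connectivity_pattern`, the `a-b` output notation and the residue numbers are assumptions made for this example; the abstract does not specify the notation used by the database.

```python
def connectivity_pattern(bonds):
    """Return a connectivity pattern string for one protein chain.

    bonds: list of (residue_i, residue_j) pairs, each giving the residue
    numbers of two half-cystines joined by an intra-chain disulphide bond.
    """
    # Rank every half-cystine by its position along the chain (1-based).
    positions = sorted({res for pair in bonds for res in pair})
    rank = {res: i + 1 for i, res in enumerate(positions)}

    # Rewrite each bond in terms of ranks, lower rank first, then sort bonds.
    ranked = sorted(tuple(sorted((rank[a], rank[b]))) for a, b in bonds)
    return ",".join(f"{a}-{b}" for a, b in ranked)


# Illustrative residue numbers only (hypothetical input).
example_bonds = [(26, 84), (40, 95), (58, 110), (65, 72)]
print(connectivity_pattern(example_bonds))
```

Two chains with the same number of bonds and the same pattern string would fall into the same category under this simplified scheme.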
We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper belongs to the category of template-based modeling. The algorithm performs a preselection of protein templates before constructing a lower dimensional subspace via a regularized LDA. The protein coordinates in the reduced space are sampled using a highly explorative optimization algorithm, regressive-regressive PSO (RR-PSO). The obtained structure is then projected onto a reduced space via singular value decomposition and further optimized via RR-PSO to carry out a structure refinement. The final structures are similar to those predicted by the best structure prediction tools, such as the Rosetta and Zhang servers. The main advantage of our methodology is that it alleviates the ill-posed character of protein structure prediction problems related to high dimensional optimization. It is also capable of sampling a wide range of conformational space due to the application of a regularized linear discriminant analysis, which allows us to expand the differences over a reduced basis set.
DNA structure functions as an overlapping code to the DNA sequence. Rapid progress in understanding the role of DNA structure in gene regulation, DNA damage recognition and genome stability has been made. The three-dimensional structure of both proteins and DNA plays a crucial role in their specific interaction, and proteins can recognise the chemical signature of DNA sequence ("base readout") as well as the intrinsic DNA structure ("shape recognition"). These recognition mechanisms do not exist in isolation but, depending on the individual interaction partners, are combined to various extents. The driving force for the interaction between protein and DNA remains the unique thermodynamics of each individual DNA-protein pair. In this review we focus on the structures and conformations adopted by DNA, both influenced by and influencing the specific interaction with the corresponding protein binding partner, as well as their underlying thermodynamics.
Protein structure homology modelling has become a routine technique to generate 3D models for proteins when experimental structures are not available. Fully automated servers such as SWISS-MODEL with user-friendly web interfaces generate reliable models without the need for complex software packages or downloading large databases. Here, we describe the latest version of the SWISS-MODEL expert system for protein structure modelling. The SWISS-MODEL template library provides annotation of quaternary structure and essential ligands and co-factors to allow for building of complete structural models, including their oligomeric structure. The improved SWISS-MODEL pipeline makes extensive use of model quality estimation for selection of the most suitable templates and provides estimates of the expected accuracy of the resulting models. The accuracy of the models generated by SWISS-MODEL is continuously evaluated by the CAMEO system. The new web site allows users to interactively search for templates, cluster them by sequence similarity, structurally compare alternative templates and select the ones to be used for model building. In cases where multiple alternative template structures are available for a protein of interest, a user-guided template selection step allows building models in different functional states. SWISS-MODEL is available at http://swissmodel.expasy.org/.
Transcription factors are key protein effectors in the regulation of gene transcription, and in many cases their activity is regulated via a complex network of protein-protein interactions (PPI). The chemical modulation of transcription factor activity is a long-standing goal in drug discovery but hampered by the difficulties associated with the targeting of PPIs, in particular when extended and flat protein interfaces are involved. Peptidomimetics have been applied to inhibit PPIs, however with variable success, as for certain interfaces the mimicry of a single secondary structure element is insufficient to obtain high binding affinities. Here, we describe the design and characterization of a stabilized protein tertiary structure that acts as an inhibitor of the interaction between the transcription factor TEAD and its co-repressor VGL4, both playing a central role in the Hippo signalling pathway. Modification of the inhibitor with a cell-penetrating entity yielded a cell-permeable proteomimetic that activates cell proliferation via regulation of the Hippo pathway, highlighting the potential of protein tertiary structure mimetics as an emerging class of PPI modulators.
Tertiary structure governs, to a great extent, the biological activity of a protein in the living cell and is consequently a central focus of numerous studies aiming to shed light on cellular processes central to human health. Here, we aim to elucidate the structure of the Rift Valley fever virus (RVFV) L protein using a combination of in silico techniques. Due to its large size and multiple domains, elucidation of the tertiary structure of the L protein has so far challenged both dry and wet laboratories. In this work, we leverage complementary perspectives and tools from the computational-molecular-biology and bioinformatics domains for constructing, refining, and evaluating several atomistic structural models of the L protein that are physically realistic. All computed models have very flexible termini of about 200 amino acids each, and a high proportion of helical regions. Properties such as potential energy, radius of gyration, hydrodynamic radius, flexibility coefficient, and solvent-accessible surface are reported. Structural characterization of the L protein enables our laboratories to better understand viral replication and transcription via further studies of L protein-mediated protein-protein interactions. While the results presented focus on the RVFV L protein, the workflow is a more general modeling protocol for discovering the tertiary structure of multidomain proteins consisting of thousands of amino acids.
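Of the properties listed, the radius of gyration is the simplest to compute from a set of atomic coordinates. The sketch below uses the standard mass-weighted definition, the square root of the mass-weighted mean squared distance of the atoms from their centre of mass. The function name and the toy coordinates are assumptions for illustration and are not taken from the study.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    coords: list of (x, y, z) tuples.
    masses: optional list of masses; unit masses are assumed if omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)

    # Centre of mass, one coordinate axis at a time.
    centre = tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    )

    # Mass-weighted mean squared distance from the centre of mass.
    weighted_sq = sum(
        m * math.dist(c, centre) ** 2 for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


# Toy input: two unit-mass atoms 2 units apart.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```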
To date only a handful of duplicated genes have been described in RNA viruses. This shortage can be attributed to different factors, including the high mutation rate of RNA viruses, which would make a large genome more prone to acquiring deleterious mutations. This may explain why sequence-based approaches have only found duplications in their most recent evolutionary history. To detect earlier duplications, we performed protein tertiary structure comparisons for every RNA virus family represented in the Protein Data Bank. We present a list of thirty pairs of possible paralogs with <30 per cent sequence identity. It is argued that these pairs are the outcome of six duplication events. These include the α and β subunits of the fungal toxin KP6 present in the dsRNA Ustilago maydis virus (family Totiviridae), the SARS-CoV (Coronaviridae) nsp3 domains SUD-N, SUD-M and X-domain, the Picornavirales (families Picornaviridae, Dicistroviridae, Iflaviridae and Secoviridae) capsid proteins VP1, VP2 and VP3, and the Enterovirus (family Picornaviridae) 3C and 2A cysteine-proteases. Protein tertiary structure comparisons may reveal more duplication events as more three-dimensional protein structures are determined, and they suggest that, although still rare, gene duplications may be more frequent in RNA viruses than previously thought. Keywords: gene duplications; RNA viruses.
Following our previous work on the analysis of 'structural plasticity' associated with the beta-propeller structural motifs, we have now developed a simple method that can automatically detect all the known beta-propellers in protein tertiary structure, given a list of Protein Data Bank (PDB) codes as input to the computer program. Our beta-propeller detection (BPD) method identifies the location of beta-propellers in the protein structure, specifies the beta-propeller type, the beta-sheet associated beta-strand pattern and the structurally similar beta-propellers observed in other proteins. When tested on 21,566 proteins in the PDB, the BPD method was capable of correctly identifying all the known 245 beta-propellers described in the structural classification of proteins (SCOP) with the number of false positives detected being less than 0.2%. Forty-one false positives were detected that correspond to eight known protein families. When compared with some of the popular web-based programs that can automatically detect 'structural similarities' between the query and target proteins, our method has the advantage of also being capable of detecting beta-propellers associated with 'structural plasticity' and in situations where the target and query proteins differ in amino acid sequence length.
Substantial progress in protein structure prediction has been made by utilizing deep learning and residue-residue distance prediction since CASP13. Inspired by these advances, we improve our CASP14 MULTICOM protein structure prediction system by incorporating three new components: (a) a new deep learning-based protein inter-residue distance predictor to improve template-free (ab initio) tertiary structure prediction, (b) an enhanced template-based tertiary structure prediction method, and (c) distance-based model quality assessment methods empowered by deep learning. In the 2020 CASP14 experiment, the MULTICOM predictor was ranked seventh out of 146 predictors in tertiary structure prediction and third out of 136 predictors in inter-domain structure prediction. The results demonstrate that template-free modeling based on deep learning and residue-residue distance prediction can predict the correct topology for almost all template-based modeling targets and a majority of hard targets (template-free targets or targets whose templates cannot be recognized), which is a significant improvement over the CASP13 MULTICOM predictor. Moreover, template-free modeling performs better than template-based modeling not only on hard targets but also on targets that have homologous templates. The performance of template-free modeling largely depends on the accuracy of distance prediction, which is closely related to the quality of multiple sequence alignments. The structural model quality assessment works well on targets for which enough good models can be predicted, but it may perform poorly when only a few good models are predicted for a hard target and the distribution of model quality scores is highly skewed. MULTICOM is available at https://github.com/jianlin-cheng/MULTICOM_Human_CASP14/tree/CASP14_DeepRank3 and https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.
The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template-utilizing versions of trRosetta guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires fewer computing resources, completing the entire modeling process in a median of < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest, in part due to missing inter-domain or inter-chain contacts.
The number of entries in structural databases of proteins is increasing day by day. Methods for retrieving protein tertiary structures from such large databases have turned out to be the key to comparative analysis of structures, which plays an important role in understanding proteins and their functions. In this paper, we present fast and accurate methods for the retrieval of proteins having tertiary structures similar to a query protein from a large database. Our proposed methods borrow ideas from the field of computer vision. The speed and accuracy of our methods come from two newly introduced features, the co-occurrence matrix of the oriented gradient and the pyramid histogram of oriented gradient, and from the use of Euclidean distance as the distance measure. Experimental results clearly indicate the superiority of our approach in both running time and accuracy. Our method is readily available for use from this website: http://research.buet.ac.bd:8080/Comograd/.
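The retrieval step described here, ranking database entries by Euclidean distance between feature vectors, can be sketched as follows. This is only a minimal illustration: the feature vectors are made-up placeholders rather than real co-occurrence or pyramid-histogram features, and the names `rank_by_euclidean`, `database` and `top_k` are assumptions introduced for the example.

```python
import math


def rank_by_euclidean(query, database, top_k=3):
    """Return the names of the top_k entries closest to the query vector.

    query: feature vector of the query protein (sequence of floats).
    database: dict mapping an entry name to its feature vector.
    """
    # Score every entry by its Euclidean distance to the query.
    scored = [(math.dist(query, vec), name) for name, vec in database.items()]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:top_k]]


# Hypothetical two-dimensional feature vectors, for illustration only.
database = {"entry_a": [0.0, 0.0], "entry_b": [1.0, 1.0], "entry_c": [5.0, 5.0]}
print(rank_by_euclidean([0.9, 1.0], database, top_k=2))
```

Because each structure is reduced to a fixed-length vector ahead of time, a query costs one distance computation per database entry, which is consistent with the speed claim in the abstract.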
The GTPase Center (GAC) RNA domain in bacterial 23S rRNA is directly bound by ribosomal protein L11, and this complex is essential to ribosome function. Previous cocrystal structures of the 58-nucleotide GAC RNA bound to L11 revealed the intricate tertiary fold of the RNA domain, with one monovalent and several divalent ions located in specific sites within the structure. Here, we report a new crystal structure of the free GAC that is essentially identical to the L11-bound structure, which retains many common sites of divalent ion occupation. This new structure demonstrates that RNA alone folds into its tertiary structure with bound divalent ions. In solution, we find that this tertiary structure is not static, but rather is best described as an ensemble of states. While L11 protein cannot bind to the GAC until the RNA has adopted its tertiary structure, new experimental data show that L11 binds to Mg2+-dependent folded states, which we suggest lie along the folding pathway of the RNA. We propose that L11 stabilizes a specific GAC RNA tertiary state, corresponding to the crystal structure, and that this structure reflects the functionally critical conformation of the rRNA domain in the fully assembled ribosome.
Animal mammary glands have been successfully employed to produce therapeutic recombinant human proteins. However, considerable variation in animal mammary transgene expression efficiency has been reported. We now consider whether aspects of codon usage and/or protein tertiary structure underlie this variation in mammary transgene expression.
Selenocysteine (Sec) is translationally incorporated into proteins in response to the UGA codon. The tRNA specific to Sec (tRNA(Sec)) is first ligated with serine by seryl-tRNA synthetase (SerRS). In the present study, we determined the 3.1 Å crystal structure of the tRNA(Sec) from the bacterium Aquifex aeolicus, in complex with the heterologous SerRS from the archaeon Methanopyrus kandleri. The bacterial tRNA(Sec) assumes the L-shaped structure, from which the long extra arm protrudes. Although the D-arm conformation and the extra-arm orientation are similar to those of eukaryal/archaeal tRNA(Sec)s, A. aeolicus tRNA(Sec) has unique base triples, G14:C21:U8 and C15:G20a:G48, which occupy the positions corresponding to the U8:A14 and R15:Y48 tertiary base pairs of canonical tRNAs. Methanopyrus kandleri SerRS exhibited serine ligation activity toward A. aeolicus tRNA(Sec) in vitro. The SerRS N-terminal domain interacts with the extra-arm stem and the outer corner of tRNA(Sec). Similar interactions exist in the reported tRNA(Ser) and SerRS complex structure from the bacterium Thermus thermophilus. Although the catalytic C-terminal domain of M. kandleri SerRS lacks interactions with A. aeolicus tRNA(Sec) in the present complex structure, the conformational flexibility of SerRS is likely to allow the CCA terminal region of tRNA(Sec) to enter the SerRS catalytic site.
We present a novel method of protein fold decoy discrimination using machine learning, more specifically using neural networks. Here, decoy discrimination is represented as a machine learning problem, where neural networks are used to learn the native-like features of protein structures using a set of positive and negative training examples. A set of native protein structures provides the positive training examples, while negative training examples are simulated decoy structures obtained by reversing the sequences of native structures. Various features are extracted from the training dataset of positive and negative examples and used as inputs to the neural networks.
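A minimal sketch of how the positive and negative training sets might be assembled is shown below. It covers only the sequence-reversal and labelling step. How a reversed sequence is mapped onto a structure, and which features are then extracted, is not described in the abstract and is left out. The function name and the label encoding (1 for native, 0 for decoy) are assumptions for illustration.

```python
def build_examples(native_sequences):
    """Pair each native sequence with a reversed-sequence negative example.

    Returns a list of (sequence, label) tuples, where label 1 marks a native
    (positive) example and label 0 marks a reversed (negative) example.
    """
    examples = []
    for seq in native_sequences:
        examples.append((seq, 1))
        reversed_seq = seq[::-1]
        # A palindromic sequence reversed is identical to the native one,
        # so it cannot serve as a negative example and is skipped.
        if reversed_seq != seq:
            examples.append((reversed_seq, 0))
    return examples


# Toy sequences, for illustration only.
print(build_examples(["MKV", "AGA"]))
```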
Predicting residue-residue distance relationships (e.g., contacts) has become the key direction to advance protein structure prediction since the 2014 CASP11 experiment, while deep learning has revolutionized the technology for contact and distance distribution prediction since its debut in the 2012 CASP10 experiment. During the 2018 CASP13 experiment, we enhanced our MULTICOM protein structure prediction system with three major components: contact distance prediction based on deep convolutional neural networks, distance-driven template-free (ab initio) modeling, and protein model ranking empowered by deep learning and contact prediction. Our experiment demonstrates that contact distance prediction and deep learning methods are the key reasons that MULTICOM was ranked 3rd out of all 98 predictors in both template-free and template-based structure modeling in CASP13. Deep convolutional neural networks can utilize global information in pairwise residue-residue features such as coevolution scores to substantially improve contact distance prediction, which played a decisive role in correctly folding some free modeling and hard template-based modeling targets. Deep learning also successfully integrated one-dimensional structural features, two-dimensional contact information, and three-dimensional structural quality scores to improve protein model quality assessment, where contact prediction was demonstrated for the first time to consistently enhance the ranking of protein models. The success of the MULTICOM system clearly shows that protein contact distance prediction and model selection driven by deep learning hold the key to solving the protein structure prediction problem. However, there are still challenges in accurately predicting protein contact distance when there are few homologous sequences, in folding proteins from noisy contact distances, and in ranking models of hard targets.
A number of Protein Data Bank (PDB) entries contain heteroatoms defined as HETATM. These include the atomic co-ordinates mainly for heteroatom groups, such as cofactors, coenzymes, prosthetic groups, metal ions, sugars, drugs, peptides, heavy-atom derivatives, non-standard amino acid residues/nucleotides, water molecules and so on. In order to evaluate the different heteroatom (Het) groups and their distribution in protein tertiary structure, we have extracted these from all proteins in the PDB and provided the data in an easily accessible format at the following website. The data can be queried on the PDB code, protein name/description, Het Group code or Het Group name. Further, we have also developed a web-based software application that reports neighbouring atoms evaluated by a "user-defined" distance cut-off value (in Angstrom units), either between a specific Het Group or all Het Groups in a given PDB with amino acid residues and water molecules in the corresponding protein, or neighbours for only all the amino acid residues in the given PDB with respect to Het Groups and water molecules. Together, the database and software applications are useful to gather information that can be further analyzed in order to obtain insights into the preferred interactions of heteroatom groups in proteins, study their binding mode, design novel molecules or to annotate protein function.
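The neighbour search with a user-defined distance cut-off can be illustrated with a brute-force sketch like the one below. The atom labels, coordinates and the function name `neighbours_within` are hypothetical, and parsing of HETATM records from a PDB file is deliberately left out. The web application itself may work differently.

```python
import math


def neighbours_within(het_atoms, protein_atoms, cutoff):
    """List protein atoms lying within `cutoff` Angstrom of any Het atom.

    het_atoms, protein_atoms: lists of (label, (x, y, z)) tuples.
    Returns (het_label, atom_label, distance) tuples, distance rounded to 2 dp.
    """
    hits = []
    for het_label, het_xyz in het_atoms:
        for atom_label, atom_xyz in protein_atoms:
            distance = math.dist(het_xyz, atom_xyz)
            if distance <= cutoff:
                hits.append((het_label, atom_label, round(distance, 2)))
    return hits


# Hypothetical coordinates, for illustration only.
het = [("HEM FE", (0.0, 0.0, 0.0))]
protein = [("HIS 93 NE2", (2.0, 0.0, 0.0)), ("LEU 89 CD1", (6.0, 0.0, 0.0))]
print(neighbours_within(het, protein, cutoff=4.0))
```

The all-pairs loop is quadratic in the number of atoms. For whole-PDB scans a spatial grid or k-d tree would normally replace it.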
SARS coronavirus (SCV) was recently responsible for a sudden and widespread infection that caused almost 800 victims. The limited amount of SCV protein structural information is partially responsible for the lack of specific drugs against the virus. Coronavirus helicases are highly conserved and peculiar proteins that have been proposed as suitable targets for antiviral drugs, such as bananins, which have recently been shown to inhibit the SCV helicase in vitro. Here, the quaternary structure of SCV helicase has been predicted, which will provide a solid foundation for the rational design of other antiviral helicase inhibitors.
The relationship between protein sequence, structure, and dynamics has been elusive. Here, we report a comprehensive analysis using an in-solution experimental approach to study how the conservation of tertiary structure correlates with protein dynamics. Hydrogen exchange measurements of eight processivity clamp proteins from different species revealed that, despite highly similar three-dimensional structures, clamp proteins display a wide range of dynamic behavior. Differences were apparent both for structurally similar domains within proteins and for corresponding domains of different proteins. Several of the clamps contained regions that underwent local unfolding with different half-lives. We also observed a conserved pattern of alternating dynamics of the α helices lining the inner pore of the clamps as well as a correlation between dynamics and the number of salt bridges in these α helices. Our observations reveal that tertiary structure and dynamics are not directly correlated and that primary structure plays an important role in dynamics.
Ab initio phasing of macromolecular structures, from the native intensities alone with no experimental phase information or previous particular structural knowledge, has been the object of a long quest, limited by two main barriers: structure size and resolution of the data. Current approaches to extend the scope of ab initio phasing include use of the Patterson function, density modification and data extrapolation. The authors' approach relies on the combination of locating model fragments such as polyalanine α-helices with the program PHASER and density modification with the program SHELXE. Given the difficulties in discriminating correct small substructures, many putative groups of fragments have to be tested in parallel; thus calculations are performed in a grid or supercomputer. The method has been named after the Italian painter Arcimboldo, who used to compose portraits out of fruit and vegetables. With ARCIMBOLDO, most collections of fragments remain a 'still-life', but some are correct enough for density modification and main-chain tracing to reveal the protein's true portrait. Beyond α-helices, other fragments can be exploited in an analogous way: libraries of helices with modelled side chains, β-strands, predictable fragments such as DNA-binding folds or fragments selected from distant homologues up to libraries of small local folds that are used to enforce nonspecific tertiary structure; thus restoring the ab initio nature of the method. Using these methods, a number of unknown macromolecules with a few thousand atoms and resolutions around 2 Å have been solved. In the 2014 release, use of the program has been simplified. The software mediates the use of massive computing to automate the grid access required in difficult cases but may also run on a single multicore workstation (http://chango.ibmb.csic.es/ARCIMBOLDO_LITE) to solve straightforward cases.