iRefIndex: a consolidated protein interaction database with provenance.

BMC bioinformatics | Sep 30, 2008

BACKGROUND: Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping.

RESULTS: We present a method to generate a key for a protein interaction record and a key for each participant protein. Anyone can generate these keys using only the primary sequences of the proteins, their taxonomy identifiers and the Secure Hash Algorithm. Two interaction records have identical keys if they refer to the same set of identical protein sequences and taxonomy identifiers. We define records with identical keys as a redundant group. Our method required that we map the protein database references found in interaction records to current protein sequence records. The operations performed during this mapping are described by a mapping score, which may provide valuable feedback to source interaction databases on problematic references that are malformed, deprecated, ambiguous or unfound. Keys for protein participants allow interaction information to be retrieved independently of the protein references used in the original records.

CONCLUSION: We have applied our method to protein interaction records from BIND, BioGRID, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The resulting interaction reference index is provided in PSI-MITAB 2.5 format. This index may form the basis of alternative redundant groupings based on gene identifiers or near sequence identity groupings.
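The key generation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract states only that keys are built from primary sequences, taxonomy identifiers and the Secure Hash Algorithm. The use of SHA-1 with Base64 encoding, the concatenation order, the sorting of participants, and the function names `sequence_digest`, `participant_key` and `interaction_key` are all assumptions made for this example.

```python
import base64
import hashlib


def sequence_digest(sequence: str) -> str:
    """Assumed scheme: SHA-1 of the upper-cased sequence, Base64 without '=' padding."""
    digest = hashlib.sha1(sequence.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def participant_key(sequence: str, taxid: int) -> str:
    """Hypothetical participant key: sequence digest followed by the taxonomy identifier."""
    return f"{sequence_digest(sequence)}{taxid}"


def interaction_key(participants: list[tuple[str, int]]) -> str:
    """Hypothetical interaction key: SHA-1 over the sorted, concatenated participant keys.

    Sorting makes the key independent of the order in which a source
    database lists the participants.
    """
    keys = sorted(participant_key(seq, taxid) for seq, taxid in participants)
    digest = hashlib.sha1("".join(keys).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# Two records that list the same proteins in a different order share a key.
record_a = [("MKTAYIAKQR", 9606), ("MSDNELQKLV", 9606)]
record_b = [("MSDNELQKLV", 9606), ("MKTAYIAKQR", 9606)]
same_group = interaction_key(record_a) == interaction_key(record_b)
```

Because the keys depend only on sequence and taxonomy identifier, two records that cite the same protein through different database accessions would still fall into the same redundant group once those accessions are mapped to sequences. The sequences above are made-up placeholders.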

Pubmed ID: 18823568

Mesh terms: Abstracting and Indexing as Topic | Amino Acid Sequence | Animals | Data Compression | Database Management Systems | Databases, Protein | Humans | Neural Networks (Computer) | Protein Interaction Mapping | Proteins | Proteome | Proteomics

This is a list of tools and resources that we have found mentioned in this publication.

Interaction Reference Index

An index of protein interactions available in a number of primary interaction databases, including BIND, BioGRID, CORUM, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID. The index covers multiple interaction types, including physical and genetic interactions (the latter mapped to their corresponding protein products), as determined by a multitude of methods. It allows the user to search for a protein and retrieve a non-redundant list of interactors for that protein. iRefIndex uses the Sequence Global Unique Identifier (SEGUID) to group proteins and interactions into redundant groups. This method allows users to integrate their own data with iRefIndex in a way that ensures proteins with exactly the same sequence are represented only once. The iRefIndex project has three long-term objectives: (1) to facilitate the exchange of interaction data between interaction databases; (2) to consolidate interaction data from multiple sources; and (3) to provide feedback to source interaction databases. iRefIndex is made available in a number of formats: MITAB tab-delimited text files, the iRefWeb interface, the iRefScape plugin for Cytoscape, PSICQUIC Web services, and an interface for the R programming language environment.
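The grouping and lookup described above can be illustrated with a small sketch. It assumes each record has already been reduced to a source database name and two participant keys. The record layout, the placeholder key strings, and the function names `redundant_groups` and `interactors` are hypothetical and are not part of any iRefIndex interface.

```python
from collections import defaultdict

# Hypothetical records: (source database, participant key A, participant key B).
records = [
    ("BIND", "keyP1", "keyP2"),
    ("IntAct", "keyP2", "keyP1"),  # same pair as the BIND record, reversed order
    ("MINT", "keyP1", "keyP3"),
]


def redundant_groups(records):
    """Group source databases by the unordered pair of participant keys."""
    groups = defaultdict(list)
    for source, key_a, key_b in records:
        groups[frozenset((key_a, key_b))].append(source)
    return dict(groups)


def interactors(records, query):
    """Return the sorted, non-redundant interaction partners of one participant key."""
    partners = set()
    for _, key_a, key_b in records:
        if key_a == query:
            partners.add(key_b)
        elif key_b == query:
            partners.add(key_a)
    return sorted(partners)


groups = redundant_groups(records)
partners_of_p1 = interactors(records, "keyP1")
```

Here the BIND and IntAct records collapse into one redundant group, while the list of source databases kept for each group preserves where every record came from.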


IntAct

Open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions. Direct user submissions of molecular interaction data are encouraged, and data may be deposited prior to publication in a peer-reviewed journal. As of June 2014, the IntAct database contains 447368 interactions, 33021 experiments, 12698 publications and 82745 interactors. IntAct provides a two-tiered view of the interaction data. The search interface allows the user to iteratively develop complex queries, exploiting the detailed annotation with hierarchical controlled vocabularies. Results are provided at any stage in a simplified, tabular view. Specialized views then allow "zooming in" on the full annotation of interactions, interactors and their properties. IntAct source code and data are freely available.


