One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans' biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a "systems-wide" functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.
Pubmed ID: 19402753 RIS Download
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Multi species reference database. Comprehensive plant biochemical pathway database, containing curated information from literature and computational analyses about genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism.
View all literature mentionsOpen source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions. Direct user submissions of molecular interaction data are encouraged, which may be deposited prior to publication in a peer-reviewed journal. The IntAct Database contains (Jun. 2014): * 447368 Interactions * 33021 experiments * 12698 publications * 82745 Interactors IntAct provides a two-tiered view of the interaction data. The search interface allows the user to iteratively develop complex queries, exploiting the detailed annotation with hierarchical controlled vocabularies. Results are provided at any stage in a simplified, tabular view. Specialized views then allows "zooming in" on the full annotation of interactions, interactors and their properties. IntAct source code and data are freely available.
View all literature mentions