The Protein Information Resource (PIR) is an integrated public bioinformatics resource that supports genomic, proteomic, and systems biology research. For over four decades, PIR has provided databases and protein sequence analysis tools to the scientific community, including the Protein Sequence Database (PSD), which grew out of the Atlas of Protein Sequence and Structure.
Located at both the University of Delaware and Georgetown University, PIR conducts research in biomedical text mining and ontology, computational systems biology, and bioinformatics cyberinfrastructure. The PIR web sites are freely accessible to researchers worldwide and receive over 4 million hits per month from over 100,000 unique sites.
In 2002, PIR and its international partners, the EBI (European Bioinformatics Institute) and the SIB (Swiss Institute of Bioinformatics), were awarded a grant from NIH to create UniProt, a single worldwide database of protein sequence and function, by unifying the PIR-PSD, Swiss-Prot, and TrEMBL databases.
PIR's major activities currently include: i) UniProt (Universal Protein Resource) development, ii) iProClass protein data integration and ID mapping, iii) PRO protein ontology, and iv) iProLINK protein literature mining and ontology development.
The FTP site provides free downloads of iProClass, PIRSF, and PRO.
Resource Type: Resource
annotation, genomic, mining, protein, protein bioinformatics, proteomic, research, sequence, structure, systems biology, gold standard
Additional Resource Types: topical portal, data analysis service, database
Protein Information Resource, PIR - Protein Information Resource
NLM, P41 LM05798
- Wu CH
- Nucleic Acids Res.
- 2003 1
The Protein Information Resource (PIR) is an integrated public resource of protein informatics that supports genomic and proteomic research and scientific discovery. PIR maintains the Protein Sequence Database (PSD), an annotated protein database containing over 283 000 sequences covering the entire taxonomic range. Family classification is used for sensitive identification, consistent annotation, and detection of annotation errors. The superfamily curation defines signature domain architecture and categorizes memberships to improve automated classification. To increase the amount of experimental annotation, the PIR has developed a bibliography system for literature searching, mapping, and user submission, and has conducted retrospective attribution of citations for experimental features. PIR also maintains NREF, a non-redundant reference database, and iProClass, an integrated database of protein family, function, and structure information. PIR-NREF provides a timely and comprehensive collection of protein sequences, currently consisting of more than 1 000 000 entries from PIR-PSD, SWISS-PROT, TrEMBL, RefSeq, GenPept, and PDB. The PIR web site (http://pir.georgetown.edu) connects data analysis tools to underlying databases for information retrieval and knowledge discovery, with functionalities for interactive queries, combinations of sequence and text searches, and sorting and visual exploration of search results. The FTP site provides free download for PSD and NREF biweekly releases and auxiliary databases and files.