InParanoid 7: new algorithms and tools for eukaryotic orthology analysis.

Nucleic acids research | Jan 22, 2010

The InParanoid project gathers proteomes of completely sequenced eukaryotic species plus Escherichia coli and calculates pairwise ortholog relationships among them. The new release 7.0 of the database has grown by an order of magnitude over the previous version and now includes 100 species and their collective 1.3 million proteins organized into 42.7 million pairwise ortholog groups. The InParanoid algorithm itself has been revised and is now both more specific and more sensitive. Based on results from our recent benchmarking of low-complexity filters in homology assignment, a two-pass BLAST approach was developed that makes use of high-precision compositional score matrix adjustment, but avoids the alignment truncation that sometimes follows. We have also updated the InParanoid web site. Several features have been added, the response times have been improved and the site now sports a new, clearer look. As the number of ortholog databases has grown, it has become difficult to compare among these resources due to a lack of standardized source data and incompatible representations of ortholog relationships. To facilitate data exchange and comparisons among ortholog databases, we have developed and are making available two XML schemas: SeqXML for the input sequences and OrthoXML for the output ortholog clusters.
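To make the idea of a pairwise ortholog relationship concrete, here is a minimal sketch of reciprocal best hits between two proteomes. This is a deliberately simplified stand-in and not the InParanoid algorithm described in the abstract: it omits in-paralog clustering, the two-pass BLAST step, and score matrix adjustment. The protein identifiers, scores, and function names are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical similarity scores keyed by (query, hit); all values are invented.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query protein to its single highest-scoring hit."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, hit), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (hit, score)
    return {query: hit for query, (hit, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy data: species A proteins a1..a3 searched against species B, and back.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b1"): 120.0}
b_vs_a = {("b1", "a1"): 245.0, ("b1", "a3"): 118.0,
          ("b2", "a2"): 175.0, ("b2", "a1"): 88.0}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy data, a3 has b1 as its best hit, but b1's best hit is a1, so a3 is left unpaired. Handling such additional family members is one of the things a full orthology method has to address and this sketch does not.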

Pubmed ID: 19892828

Mesh terms: Algorithms | Animals | Cluster Analysis | Computational Biology | Databases, Genetic | Databases, Nucleic Acid | Escherichia coli | Eukaryotic Cells | Genome, Bacterial | Humans | Information Storage and Retrieval | Internet | Protein Structure, Tertiary | Proteins | Proteomics | Software

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

DOE Joint Genome Institute

Institute to advance genomics in support of the DOE missions related to clean energy generation and environmental characterization and cleanup. Supported by the DOE Office of Science, the DOE JGI unites the expertise at Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, and the HudsonAlpha Institute for Biotechnology. The facility provides integrated high-throughput sequencing and computational analysis that enable systems-based scientific approaches to these challenges.


HMMER

Tool for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST.


Wellcome Trust Sanger Institute; Hinxton; United Kingdom

A not-for-profit research organization that is one of the world's leading genome centers. The Institute uses genome sequences to advance understanding of the biology of humans and pathogens in order to improve human health. Funded principally by the Wellcome Trust, its scientists conduct research at scale, engaging in bold and long-term exploratory projects that are designed to influence and empower medical science globally. They provide data which can be translated for diagnostics, treatments or therapies within the context of global health, including over 100 finished genomes, which can be downloaded. The Institute makes data publicly available on a limited basis, and provides more extensive data upon request.

