Improving HIV proteome annotation: new features of BioAfrica HIV Proteomics Resource.

Database : the journal of biological databases and curation | 2016

The Human Immunodeficiency Virus (HIV) is one of the pathogens that cause the greatest global concern, with approximately 35 million people currently infected with HIV. Extensive HIV research has been performed, generating a large amount of HIV and host genomic data. However, no effective vaccine that protects the host from HIV infection is available and HIV is still spreading at an alarming rate, despite effective antiretroviral (ARV) treatment. In order to develop effective therapies, we need to expand our knowledge of the interaction between HIV and host proteins. In contrast to virus proteins, which often rapidly evolve drug resistance mutations, the host proteins are essentially invariant within all humans. Thus, if we can identify the host proteins needed for virus replication, such as those involved in transporting viral proteins to the cell surface, we have a chance of interrupting viral replication. There is no proteome resource that summarizes this interaction, making research on this subject a difficult enterprise. In order to fill this gap in knowledge, we curated a resource that presents detailed annotation on the interaction between the HIV proteome and host proteins. Our resource was produced in collaboration with ViralZone and used manual curation techniques developed by UniProtKB/Swiss-Prot. Our new website also used previous annotations of the BioAfrica HIV-1 Proteome Resource, which has been accessed by approximately 10 000 unique users a year since its inception in 2005. The novel features include a dedicated new page for each HIV protein, a graphic display of its function and a section on its interaction with host proteins. Our new webpages also add information on the genomic location of each HIV protein and the position of ARV drug resistance mutations. Our improved BioAfrica HIV-1 Proteome Resource fills a gap in the current knowledge of biocuration.

Database URL: http://www.bioafrica.net/proteomics/HIVproteome.html

PubMed ID: 27087306

Associated grants

  • Agency: Wellcome Trust, United Kingdom
    Id: 082384/Z/07/Z
  • Agency: Medical Research Council, United Kingdom

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


UniProt (tool)

RRID:SCR_002380

Collection of data of protein sequence and functional information. Resource for protein sequence and annotation data. Consortium for preservation of the UniProt databases: UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef), and UniProt Archive (UniParc), UniProt Proteomes. Collaboration between European Bioinformatics Institute (EMBL-EBI), SIB Swiss Institute of Bioinformatics and Protein Information Resource. Swiss-Prot is a curated subset of UniProtKB.


Stanford University HIV Drug Resistance Database (tool)

RRID:SCR_006631

The Stanford University HIV Drug Resistance Database is a curated public database designed to represent, store, and analyze the different forms of data underlying HIV drug resistance. HIVDB has three main types of content: (1) Database queries and references, (2) Interactive programs, and (3) Educational resources. Database queries are designed primarily for researchers studying HIV drug resistance. The interactive programs and educational resources are designed for both researchers and those wishing to learn more about HIV drug resistance.

1. DATABASE QUERY AND REFERENCE PAGES

Genotype-Treatment Correlations: This Genotype-Treatment section of the database links to 15 interactive query pages that explore the relationship between treatment with HIV-1 antiretroviral drugs (ARVs) and mutations in HIV reverse transcriptase (RT), protease, and integrase. There are five types of interactive query pages:

  • Treatment Profiles (Protease and RT inhibitors)
  • Mutation Profiles (Protease and RT mutations)
  • Detailed Treatment Queries (Protease, RT, and integrase inhibitors)
  • Detailed Mutation Queries (Protease, RT, and integrase mutations)
  • Mutation Prevalence According to Subtype and Treatment

Genotype-Phenotype Correlations: The main page of the Genotype-Phenotype Correlations section links to four interactive query pages: three dynamically updated data summaries and one regularly updated downloadable dataset.

  • Drug Resistance Positions: Query for levels of resistance associated with known drug resistance mutations
  • Detailed Phenotype Queries: Queries for levels of resistance associated with individual mutations or mutation combinations at all positions of protease, RT, and integrase
  • Patterns of Drug Resistance Mutations
  • Downloadable Reference Dataset

Genotype-Clinical Correlations: This part of the database has two main sections:

  • Clinical Trials Datasets
  • Summaries of Clinical Studies

References: This part of the database has two main sections: one with summaries of the data from each of the references in HIVDB and one in which every primate immunodeficiency virus sequence in GenBank is annotated according to its presence or absence in HIVDB.

  • Studies in HIVDB
  • GenBank <=> HIVDB

New Submissions: Approximately every three months, the New Submissions section lists the studies that have been entered into HIVDB. The study title links to the introductory page of the study in the References section.

Database Statistics (http://hivdb.stanford.edu/pages/HIVdbStatistics.html)

2. INTERACTIVE PROGRAMS

HIVDB has seven main interactive programs.

  1. HIVdb Program: Mutation List Analysis, Sequence Analysis, HIVdb Output, Sierra Web Service, Release Notes, Algorithm Specification Interface (ASI)
  2. HIValg Program
  3. HIVseq Program
  4. Calibrated Population Resistance (CPR) tool
  5. Mutation ARV Evidence Listing (MARVEL)
  6. ART-AiDE
  7. Rega HIV-1 Subtyping tool

Three programs in the HIV Drug Resistance Database share a common code base: HIVseq, HIVdb, and HIValg.

HIVseq accepts user-submitted protease, RT, and integrase sequences, compares them to the consensus subtype B reference sequence, and uses the differences as query parameters for interrogating the HIV Drug Resistance database (Shafer, D Jung, & B Betts, Nat Med 2000; Rhee SY et al. AIDS 2006). The query result provides users with the prevalence of protease, RT and integrase mutations according to subtype and PI, nucleoside RT inhibitor (NRTI), non-nucleoside RT inhibitor (NNRTI), and integrase inhibitor (INI) exposure. This allows users to detect unusual sequence results immediately so that the person doing the sequencing can check the primary sequence output while it is still on the desktop. In addition, unexpected associations between sequences or isolates can be discovered by immediately retrieving data on isolates sharing one or more mutations with the sequence.

There are three ways in which the HIVdb program can be used: (i) entering a list of protease and RT mutations, (ii) entering a complete sequence containing protease, RT, and/or integrase, and (iii) using a Web Service. HIVdb is an expert system that accepts user-submitted HIV-1 pol sequences and returns inferred levels of resistance to 20 FDA-approved ARV drugs including 8 PIs, 7 NRTIs, 4 NNRTIs, and, with this update, one INI. In the HIVdb system, each HIV-1 drug resistance mutation is assigned a drug penalty score and a comment; the total score for a drug is derived by adding the scores of each mutation associated with resistance to that drug. Using the total drug score, the program reports one of the following levels of inferred drug resistance: susceptible, potential low-level resistance, low-level resistance, intermediate resistance, and high-level resistance.

HIValg is designed for users interested in comparing the results of different algorithms or who are interested in comparing and evaluating existing and newly developed algorithms. The ability to develop new algorithms that can be run on the HIV Drug Resistance Database depends on the Algorithm Specification Interface (ASI) compiler (Shafer & Betts JCM 2003).

Submission of Sequences and Mutations: For each of the three programs, sequences can be entered using either the Sequence Analysis Form or the Mutation List form.

3. EDUCATIONAL RESOURCES

HIVDB contains several regularly updated sections summarizing data linking RT, protease, and integrase mutations and antiretroviral drugs (ARVs). These sections include (i) tabular summaries of the major mutations associated with each ARV class, (ii) detailed summaries of the major, minor, and accessory mutations associated with each ARV, (iii) the comments used by the HIVdb program, (iv) the scores used by the HIVdb program, (v) clinical studies in which baseline drug resistance mutations have been correlated with the virological response (clinical outcome) to a specific ARV, (vi) mutations that can be used for drug resistance surveillance, and (vii) a two-page PDF handout.

  1. Drug Resistance Summaries: Tabular Drug Resistance Summaries by ARV Class, Detailed Drug Resistance Summaries by ARV, Drug Resistance Mutation Comments Used by the HIVdb Program, Drug Resistance Mutation Scores Used by the HIVdb Program, Genotype-Clinical Outcome Correlation Studies
  2. Surveillance Drug-Resistance Mutation List Section
  3. PDF Handout

Grant Support

  1. National Institute for Allergy and Infectious Diseases (NIAID, NIH): Online HIV Drug Resistance Database (PI: Robert W. Shafer, MD, 1R01AI68581-01A1), 04/01/06 - 3/31/11
  2. National Institute for Allergy and Infectious Diseases (NIAID, NIH) supplement to the grant Identification of Multidrug-Resistant HIV-1 Isolates (PI: Robert W. Shafer, MD, AI46148-01): Supplement provided 1999-2005.
  3. NIH/NIGMS Program Project on AIDS Structural Biology Program Project: Targeting Ensembles of Drug Resistant Protease Variants (PI: Celia Schiffer, PhD, University of Massachusetts): 2002-2007
  4. University-wide AIDS Research Program (CR03-ST-524). Community collaborative award: Optimizing Clinical HIV Genotypic Resistance Interpretation: Principal Investigators: Robert W. Shafer, MD and W. Jeffrey Fessel MD (Kaiser Permanente Medical Care Program): 2004-2005
  5. Stanford University Bio-X Interdisciplinary Initiative: HIV Gene Sequence Analysis for Drug Resistance Studies: A Pharmacogenetic Challenge Principal Investigators: Robert W. Shafer, MD and Daphne Koller, Ph.D. (Computer Science): 2000-2002
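
The description of HIVseq and HIVdb above amounts to two steps: list the positions where a submitted sequence differs from a reference, then add up per-mutation penalty scores for each drug and map the total to a resistance level. The Python sketch below illustrates that flow only; it is not the HIVdb code. The function names, the toy sequences, the drug names and every penalty value are hypothetical, the sequences are assumed to be already aligned, and the score cutoffs are an assumption, since the description names the five levels but not the thresholds that separate them.

```python
from typing import Dict, List

# Assumed cutoffs (total score >= cutoff -> label), highest first.
# The source text lists the five levels but does not give these numbers.
LEVELS = [
    (60, "high-level resistance"),
    (30, "intermediate resistance"),
    (15, "low-level resistance"),
    (10, "potential low-level resistance"),
]


def find_mutations(reference: str, query: str) -> List[str]:
    """List differences between two pre-aligned sequences, e.g. 'I3V'."""
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, query), start=1)
        if ref != alt
    ]


def total_score(mutations: List[str], penalties: Dict[str, int]) -> int:
    """Sum the penalty of every observed mutation that has a score."""
    return sum(penalties.get(mutation, 0) for mutation in mutations)


def resistance_level(score: int) -> str:
    """Map a total drug score to one of the five reported levels."""
    for cutoff, label in LEVELS:
        if score >= cutoff:
            return label
    return "susceptible"


def interpret(reference: str, query: str,
              penalty_table: Dict[str, Dict[str, int]]) -> Dict[str, str]:
    """Return the inferred resistance level for each drug in the table."""
    mutations = find_mutations(reference, query)
    return {
        drug: resistance_level(total_score(mutations, penalties))
        for drug, penalties in penalty_table.items()
    }


# Toy data: hypothetical sequences, drugs and penalty scores.
REFERENCE = "PQITLWKR"
QUERY = "PQVTLWNR"
PENALTY_TABLE = {
    "DRUG_A": {"I3V": 10, "K7N": 25},
    "DRUG_B": {"K7N": 5},
}

if __name__ == "__main__":
    print(find_mutations(REFERENCE, QUERY))
    print(interpret(REFERENCE, QUERY, PENALTY_TABLE))
```

With the toy data, the two differences are I3V and K7N; DRUG_A accumulates 10 + 25 = 35 and DRUG_B accumulates 5, so under the assumed cutoffs they are reported as intermediate resistance and susceptible respectively.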


Blocks (tool)

RRID:SCR_007567

Blocks is a database of highly conserved regions of proteins, or blocks. The database is no longer maintained or updated and some of its tools are no longer functional. However, Blocks does provide Block Searcher, Get Blocks and Block Maker, which aid the detection and verification of protein sequence homology. They compare a protein or DNA sequence to a database of protein blocks (current version), retrieve blocks, and create new blocks, respectively. Users can further view blocks by keyword or number, search a sequence against the database of blocks, search blocks against each other, or make blocks of their own.


Molecular Modeling DataBase (tool)

RRID:SCR_010623

The Molecular Modeling DataBase (MMDB), also known as Entrez Structure, is a database of experimentally determined structures obtained from the RCSB Protein Data Bank (PDB). MMDB is developed by the Structure Group of the NCBI Computational Biology Branch. The data processing procedure at NCBI results in the addition of a number of useful features that facilitate computation on the data and link them to many other data types in the Entrez system. The structure database is considerably smaller than Entrez's Protein or Nucleotide databases, but a large fraction of all known protein sequences have homologs in this set, and one may often learn more about a protein by examining 3-D structures of its homologs. These are accessible as Related Structures in the Links menu of Entrez Protein sequence records (illustrated example). It is then possible to align the query protein to the structure-based sequence, as shown in the illustration on this page. Additional resources can be used along with MMDB to interactively view the structures, find similar 3D structures, learn about the types of interactions and bound chemicals that have been found to exist among the similar 3D structures, and more.


PROSITE (tool)

RRID:SCR_003457

Database of protein families and domains that is based on the observation that, while there is a huge number of different proteins, most of them can be grouped, on the basis of similarities in their sequences, into a limited number of families. Proteins or protein domains belonging to a particular family generally share functional attributes and are derived from a common ancestor. It is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. ScanProsite finds matches of your protein sequences to PROSITE signatures. PROSITE currently contains patterns and profiles specific for more than a thousand protein families or domains. Each of these signatures comes with documentation providing background information on the structure and function of these proteins. The database is available via FTP.
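
To make the idea of a PROSITE pattern concrete, the following Python sketch translates a pattern into a regular expression and reports where it matches a sequence. It is an illustration under stated assumptions, not ScanProsite: the function names `prosite_to_regex` and `find_matches` and the test sequence are hypothetical, and only the common pattern elements are handled (`x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, and `<` / `>` for sequence ends outside brackets).

```python
import re
from typing import List


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.rstrip(".")  # patterns are conventionally ended by '.'
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored to the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        repeat = ""
        counted = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if counted:
            element, repeat = counted.group(1), "{" + counted.group(2) + "}"
        if element == "x":  # any residue
            core = "."
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"  # excluded residues
        else:
            core = element  # a single residue or a [..] class
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern: str, sequence: str) -> List[int]:
    """Return 1-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [match.start() + 1 for match in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-{P}-[ST]-{P} is the N-glycosylation site pattern (PS00001).
    print(prosite_to_regex("N-{P}-[ST]-{P}."))
    print(find_matches("N-{P}-[ST]-{P}.", "MKNGTAANPSQ"))
```

In the made-up sequence `MKNGTAANPSQ`, the first asparagine (position 3) is followed by G, T and A and therefore matches, while the second is followed by a proline and is rejected. Profiles and ProRule annotations carry more information than a pattern and are not covered by this sketch.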


InterPro (tool)

RRID:SCR_006695

Service providing functional analysis of proteins by classifying them into families and predicting domains and important sites. It combines protein signatures from a number of member databases into a single searchable resource, capitalizing on their individual strengths to produce a powerful integrated database and diagnostic tool. This integrated database of predictive protein signatures is used for the classification and automatic annotation of proteins and genomes. InterPro classifies sequences at superfamily, family and subfamily levels, predicting the occurrence of functional domains, repeats and important sites. InterPro adds in-depth annotation, including GO terms, to the protein signatures. You can access the data programmatically, via Web Services. The member databases use a number of approaches:

  • ProDom: provider of sequence clusters built from UniProtKB using PSI-BLAST.
  • PROSITE patterns: provider of simple regular expressions.
  • PROSITE and HAMAP profiles: providers of sequence matrices.
  • PRINTS: provider of fingerprints, which are groups of aligned, un-weighted Position Specific Sequence Matrices (PSSMs).
  • PANTHER, PIRSF, Pfam, SMART, TIGRFAMs, Gene3D and SUPERFAMILY: providers of hidden Markov models (HMMs).

Your contributions are welcome. You are encouraged to use the "Add your annotation" button on InterPro entry pages to suggest updated or improved annotation for individual InterPro entries.


BioAfrica HIV Informatics in Africa (tool)

RRID:SCR_002295

The BioAfrica HIV-1 Proteomics Resource is a website that contains detailed information about the HIV-1 proteome and protease cleavage sites, as well as data-mining tools that can be used to manipulate and query protein sequence data, a BLAST tool for initiating structural analyses of HIV-1 proteins, and a proteomics tools directory. The HIV Proteomics Resource contains information about each HIV-1 gene product with regard to expression, post-transcriptional and post-translational modifications, localization, functional activities, and potential interactions with viral and host macromolecules. The Proteome section contains extensive data on each of 19 HIV-1 proteins, including their functional properties, a sample analysis of HIV-1 HXB2, structural models and links to other online resources. The HIV-1 Protease Cleavage Sites section provides information on the position, subtype variation and genetic evolution of Gag, Gag-Pol and Nef cleavage sites.


UniProtKB (tool)

RRID:SCR_004426

Central repository for collection of functional information on proteins, with accurate and consistent annotation. In addition to capturing core data mandatory for each UniProtKB entry (mainly, the amino acid sequence, protein name or description, taxonomic data and citation information), as much annotation information as possible is added. This includes widely accepted biological ontologies, classifications and cross-references, and experimental and computational data. The UniProt Knowledgebase consists of two sections, UniProtKB/Swiss-Prot and UniProtKB/TrEMBL. UniProtKB/Swiss-Prot (reviewed) is a high quality manually annotated and non-redundant protein sequence database which brings together experimental results, computed features, and scientific conclusions. UniProtKB/TrEMBL (unreviewed) contains protein sequences associated with computationally generated annotation and large-scale functional characterization that await full manual annotation. Users may browse by taxonomy, keyword, gene ontology, enzyme class or pathway.


ModBase (tool)

RRID:SCR_004642

A database of three-dimensional protein models calculated by comparative modeling. ModBase is organized into datasets, which are either available to the public, to the academic community, or to specific users. 20 unique amidohydrolase and 41 unique enolase structures that have been determined are included in the database.


Pfam (tool)

RRID:SCR_004726

A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. They are supplemented by entries generated automatically using the ADDA database; these automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries which are related by similarity of sequence, structure or profile-HMM).


ViralZone (tool)

RRID:SCR_006563

ViralZone is a SIB Swiss Institute of Bioinformatics web resource for all viral genera and families, providing general molecular and epidemiological information, along with virion and genome figures. Each virus or family page gives easy access to UniProtKB/Swiss-Prot viral protein entries. The ViralZone project is handled by the virus program of the Swiss-Prot group. Protein popups were developed in collaboration with Prof. Christian von Mering and Andrea Franceschini, Bioinformatics Group, Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland, funded in part by the SIB Swiss Institute of Bioinformatics. All pictures in ViralZone are copyright of the SIB Swiss Institute of Bioinformatics.
