A large-scale proteogenomics study of apicomplexan pathogens-Toxoplasma gondii and Neospora caninum.

Proteomics | 2015

Proteomics data can supplement genome annotation efforts, for example being used to confirm gene models or correct gene annotation errors. Here, we present a large-scale proteogenomics study of two important apicomplexan pathogens: Toxoplasma gondii and Neospora caninum. We queried proteomics data against a panel of official and alternate gene models generated directly from RNA-Seq data, using several newly generated and some previously published MS datasets for this meta-analysis. We identified a total of 201 996 and 39 953 peptide-spectrum matches for T. gondii and N. caninum, respectively, at a 1% peptide FDR threshold. This equated to the identification of 30 494 distinct peptide sequences and 2921 proteins (matches to official gene models) for T. gondii, and 8911 peptides/1273 proteins for N. caninum following stringent protein-level thresholding. We have also identified 289 and 140 loci for T. gondii and N. caninum, respectively, which mapped to RNA-Seq-derived gene models used in our analysis and apparently absent from the official annotation (release 10 from EuPathDB) of these species. We present several examples in our study where the RNA-Seq evidence can help in correction of the current gene model and in discovery of potential new genes. The findings of this study have been integrated into EuPathDB. The data have been deposited to the ProteomeXchange with identifiers PXD000297 and PXD000298.

PubMed ID: 25867681

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

  • Agency: Biotechnology and Biological Sciences Research Council, United Kingdom
    Id: BB/G010781/1
  • Agency: Biotechnology and Biological Sciences Research Council, United Kingdom
    Id: BB/H024654/1

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Clustal Omega (tool)

RRID:SCR_001591

Software package for multiple sequence alignment that uses seeded guide trees and HMM profile-profile techniques to generate alignments between three or more sequences. Accepts nucleic acid or protein sequences in several formats: NBRF/PIR, EMBL/UniProt, Pearson (FASTA), GDE, ALN/Clustal, GCG/MSF, and RSF.

ProteomeXchange (tool)

RRID:SCR_004055

A data repository for proteomic data sets. The ProteomeXchange consortium, as a whole, aims to provide a coordinated submission of MS proteomics data to the main existing proteomics repositories, as well as to encourage optimal data dissemination. ProteomeXchange provides access to a number of public databases, and users can access and submit data sets to the consortium's PRIDE database and PASSEL/PeptideAtlas.

GSNAP (tool)

RRID:SCR_005483

Software to align single- and paired-end reads as short as 14 nt and of arbitrary length. Can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads, using probabilistic models or a database of known splice sites. Permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite-treated DNA for study of methylation state.

InterProScan (tool)

RRID:SCR_005829

Software package for functional analysis of sequences by classifying them into families and predicting the presence of domains and sites. Scans sequences against InterPro's signatures. Characterizes nucleotide or protein function by matching it with models from several different databases. Used in large-scale analysis of whole proteomes, genomes, and metagenomes. Available as a web-based version, a standalone Perl version, and a SOAP web service.
