The PeptideMapper Web-Service provides alignments of peptide sequences to protein, mRNA, EST, and HTC sequences from GenBank, RefSeq, UniProt, IPI, VEGA, EMBL, and HInvDb. The mapping is supported, in part, by the compressed peptide sequence database infrastructure (Edwards, 2007), which enables a fast, suffix-tree-based mapping of peptide sequences to gene identifiers and a gene-focused, detailed mapping of peptide sequences to source sequence evidence. The PeptideMapper Web-Service can be used interactively or programmatically through HTTP or SOAP requests. Results of HTTP requests can be returned in a variety of formats, including XML, JSON, CSV, TSV, or XLS, and in some cases GFF or BED; results of SOAP requests are returned as SOAP responses. Each request can map at most 20 peptides, and each peptide must be between 5 and 30 amino acids long. By default, at most 10 alignments are returned per peptide, gene, and sequence type. This default can be changed on the interactive alignments search form or with the max web-service parameter.
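The request limits described above (at most 20 peptides per request, each 5 to 30 amino acids long, and a default of 10 alignments) can be enforced on the client side before any request is sent. The following is a minimal sketch under stated assumptions, not the service's documented interface: only the max parameter is named in the description, so the "peptides" and "format" parameter names, the comma-separated encoding, and the letters-only validity check are hypothetical and should be checked against the service's own documentation.

```python
from urllib.parse import urlencode

# Limits taken from the service description.
MAX_PEPTIDES_PER_REQUEST = 20
MIN_LENGTH, MAX_LENGTH = 5, 30
DEFAULT_MAX_ALIGNMENTS = 10


def is_valid_peptide(peptide):
    """Return True if the peptide length is within the stated 5-30 range.

    Assumption: peptides are plain ASCII letters (one-letter amino-acid codes).
    """
    return (
        peptide.isascii()
        and peptide.isalpha()
        and MIN_LENGTH <= len(peptide) <= MAX_LENGTH
    )


def batch_peptides(peptides):
    """Drop invalid peptides and split the rest into batches of at most 20."""
    valid = [p.upper() for p in peptides if is_valid_peptide(p)]
    return [
        valid[i:i + MAX_PEPTIDES_PER_REQUEST]
        for i in range(0, len(valid), MAX_PEPTIDES_PER_REQUEST)
    ]


def build_query(batch, max_alignments=DEFAULT_MAX_ALIGNMENTS, fmt="JSON"):
    """Build a URL query string for one batch.

    Hypothetical: "peptides" and "format" are illustrative parameter names;
    only "max" appears in the service description.
    """
    if len(batch) > MAX_PEPTIDES_PER_REQUEST:
        raise ValueError("a request may contain at most 20 peptides")
    return urlencode(
        {"peptides": ",".join(batch), "max": max_alignments, "format": fmt}
    )
```

A caller would pass each batch from batch_peptides to build_query and append the resulting query string to the service's HTTP endpoint, which is not given here.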
Resource Type: Resource
Version: Latest Version
peptide, sequence, protein, alignment, expressed sequence tag, mrna, est, htc, genbank, refseq, uniprot, ipi, vega, embl, hinvdb
Additional Resource Types
PeptideMapper Web-Service, Peptide Mapper
Created 2 weeks ago by Christie Wang
Created 3 years ago by Anonymous
- Edwards NJ
- Mol. Syst. Biol.
- 2007 17
Peptide identification by tandem mass spectrometry is the dominant proteomics workflow for protein characterization in complex samples. Traditional search engines, which match peptide sequences with tandem mass spectra to identify the samples' proteins, use protein sequence databases to suggest peptide candidates for consideration. Although the acquisition of tandem mass spectra is not biased toward well-understood protein isoforms, this computational strategy is failing to identify peptides from alternative splicing and coding SNP protein isoforms despite the acquisition of good-quality tandem mass spectra. We propose, instead, that expressed sequence tags (ESTs) be searched. Ordinarily, such a strategy would be computationally infeasible due to the size of EST sequence databases; however, we show that a sophisticated sequence database compression strategy, applied to human ESTs, reduces the sequence database size approximately 35-fold. Once compressed, our EST sequence database is comparable in size to other commonly used protein sequence databases, making routine EST searching feasible. We demonstrate that our EST sequence database enables the discovery of novel peptides in a variety of public data sets.