The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes.

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.

PubMed ID: 19498102

Authors

  • Pruitt KD
  • Harrow J
  • Harte RA
  • Wallin C
  • Diekhans M
  • Maglott DR
  • Searle S
  • Farrell CM
  • Loveland JE
  • Ruef BJ
  • Hart E
  • Suner MM
  • Landrum MJ
  • Aken B
  • Ayling S
  • Baertsch R
  • Fernandez-Banet J
  • Cherry JL
  • Curwen V
  • Dicuccio M
  • Kellis M
  • Lee J
  • Lin MF
  • Schuster M
  • Shkeda A
  • Amid C
  • Brown G
  • Dukhanina O
  • Frankish A
  • Hart J
  • Maidak BL
  • Mudge J
  • Murphy MR
  • Murphy T
  • Rajan J
  • Rajput B
  • Riddick LD
  • Snow C
  • Steward C
  • Webb D
  • Weber JA
  • Wilming L
  • Wu W
  • Birney E
  • Haussler D
  • Hubbard T
  • Ostell J
  • Durbin R
  • Lipman D

Journal

Genome Research

Publication Data

July 2, 2009

Associated Grants

  • Agency: Wellcome Trust, Id: 062023
  • Agency: Wellcome Trust, Id: 077198
  • Agency: NHGRI NIH HHS, Id: 1U54HG004555-01
  • Agency: NHGRI NIH HHS, Id: U54 HG004555
  • Agency: Wellcome Trust, Id: WT062023
  • Agency: Wellcome Trust, Id: WT077198
  • Agency: Intramural NIH HHS, Id:

MeSH Terms

  • Animals
  • Consensus Sequence
  • Genome
  • Humans
  • Mice
  • Open Reading Frames
  • Sequence Alignment