The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.
PubMed ID: 24217909
A web search tool that finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. It is used to identify homologous sequences.
A portal to biomedical and genomic information. NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information for the better understanding of molecular processes affecting human health and disease.
Database (available by anonymous FTP) resulting from a collaborative effort to identify a core set of human and mouse protein-coding regions that are consistently annotated and of high quality. The long-term goal is to support convergence towards a standard set of gene annotations. The collaborators are EBI, NCBI, UCSC and WTSI, and the initial results are also available from the participants' genome browser Web sites. In addition, CCDS identifiers are indicated on the relevant NCBI RefSeq and Entrez Gene records and in Map Viewer displays of RNA (RefSeq) and Gene annotations on the reference assembly.