Tracking and coordinating an international curation effort for the CCDS Project.

The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a 'gold standard' definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines. DATABASE URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi.
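The abstract's central step, identifying CDS annotations that are identical across independent annotation builds, can be illustrated with a minimal sketch. The data model here (chromosome, strand, and a list of exon coordinate pairs), the function names, and all identifiers and coordinates are assumptions made for illustration; the sketch does not reproduce the CCDS project's actual pipeline or its quality assurance checks.

```python
from typing import Dict, List, Tuple

# A CDS structure: (chromosome, strand, ((start, end), ...)).
# This representation is an assumption for illustration, not the
# CCDS project's actual data model.
CdsKey = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def cds_key(chrom: str, strand: str, exons: List[Tuple[int, int]]) -> CdsKey:
    """Normalize a CDS annotation into a hashable key with sorted exons."""
    return (chrom, strand, tuple(sorted(exons)))


def identical_cds(
    set_a: Dict[str, CdsKey], set_b: Dict[str, CdsKey]
) -> List[Tuple[str, str]]:
    """Return (id_a, id_b) pairs whose CDS structures match exactly."""
    # Index the second build by structure so each lookup is a dict access.
    by_structure: Dict[CdsKey, List[str]] = {}
    for id_b, key in set_b.items():
        by_structure.setdefault(key, []).append(id_b)

    matches: List[Tuple[str, str]] = []
    for id_a, key in set_a.items():
        for id_b in by_structure.get(key, []):
            matches.append((id_a, id_b))
    return sorted(matches)


# Hypothetical identifiers and coordinates from two annotation builds.
build_a = {
    "A1": cds_key("chr1", "+", [(100, 200), (300, 400)]),
    "A2": cds_key("chr2", "-", [(50, 150)]),
}
build_b = {
    "B1": cds_key("chr1", "+", [(300, 400), (100, 200)]),  # same exons, other order
    "B2": cds_key("chr2", "-", [(50, 151)]),               # differs by one base
}

pairs = identical_cds(build_a, build_b)
print(pairs)
```

In this sketch only exact coordinate matches count as identical, so a one-base difference in a CDS boundary keeps two annotations apart; that mirrors the conservative intent described in the abstract, under the stated assumptions.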

PubMed ID: 22434842

MeSH terms: Animals | Consensus Sequence | Database Management Systems | Databases, Genetic | Genomics | Humans | Mice | Molecular Sequence Annotation

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


NCBI

A portal to biomedical and genomic information. NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information for the better understanding of molecular processes affecting human health and disease.

Genome Reference Consortium

Consortium that puts sequences into a chromosome context and provides the best possible reference assembly for human, mouse, and zebrafish via FTP. The consortium does this both by generating multiple representations (alternate loci) for regions that are too complex to be represented by a single path and by releasing regional fixes known as patches. This allows users who are interested in a specific locus to get an improved representation without affecting users who need chromosome coordinate stability. This resource additionally provides mechanisms by which the scientific community can report loci in need of further review.

The GRC has built tools to facilitate the curation of genome assemblies based on the sequence overlaps of long, high-quality sequences (clones and PCR products, not short sequence reads). The GRC currently supports production of assemblies for human, mouse, and zebrafish. If your assembly data fits this model and you are interested in using these tools, please contact the GRC through its "Contact Us" page.

The human genome assembly was produced as part of the Human Genome Project (HGP). The previous assembly (NCBI36) was the last one produced by the HGP and was described in 2004 (PMID: 15496913); this was the starting point for the GRC. The assembly is based largely on assembling overlapping clone sequences.

For mouse, the GRC has produced an updated assembly (GRCm38). This is an update of the last MGSC assembly (MGSCv37), which was described in 2009 (PMID: 19468303). The primary assembly is based on assembling overlapping BAC clones derived from the C57BL/6J strain, and several loci have sequence available from other strains.

The zebrafish genome assembly was produced at the Sanger Institute. The last assembly produced from the original project was Zv9 and will be described in 2010. This assembly is the starting point for the GRC. The assembly is based on assembling overlapping BAC clones and integrating these sequences with the whole genome shotgun assembly.

A set of tiling path files (TPFs) is maintained for each assembled chromosome and partial assembly. These files are stored in a central database that manages TPF tracking and validation. Sequences (also known as components) that are adjacent on the TPF are expected to have a specific type of sequence alignment known as a full dovetail. A program called "find_overlaps" assesses all adjacent component sequences to determine whether they have an appropriate overlap.

INSDC

International collaboration among the International Nucleotide Sequence Databases (INSD): DDBJ, ENA, and GenBank. The collaboration has been maintained for over 18 years. Individuals submitting data to the international sequence databases should be aware of INSDC policy.

GWAS: Catalog of Published Genome-Wide Association Studies

Database of genome-wide association study (GWAS) publications including only those attempting to assay at least 100,000 single nucleotide polymorphisms (SNPs). Publications are organized from most to least recent date of publication. Studies focusing only on candidate genes are excluded from this catalog. Studies are identified through weekly PubMed literature searches, daily NIH-distributed compilations of news and media reports, and occasional comparisons with an existing database of GWAS literature (HuGE Navigator).
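The inclusion rules described above (at least 100,000 SNPs assayed, no candidate-gene-only studies, newest publications first) can be expressed as a small filter. This is a sketch only: the `Study` record, its field names, and the sample entries are hypothetical and do not reflect the catalog's real schema or contents.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Threshold taken from the catalog description above.
MIN_SNPS_ASSAYED = 100_000


@dataclass
class Study:
    # Hypothetical record layout, assumed for illustration.
    title: str
    published: date
    snps_assayed: int
    candidate_gene_only: bool


def catalog_entries(studies: List[Study]) -> List[Study]:
    """Keep eligible studies and order them from most to least recent."""
    eligible = [
        s
        for s in studies
        if s.snps_assayed >= MIN_SNPS_ASSAYED and not s.candidate_gene_only
    ]
    return sorted(eligible, key=lambda s: s.published, reverse=True)


# Made-up sample entries.
studies = [
    Study("Study A", date(2010, 3, 1), 500_000, False),
    Study("Study B", date(2011, 6, 15), 90_000, False),   # too few SNPs
    Study("Study C", date(2012, 1, 10), 300_000, True),   # candidate genes only
    Study("Study D", date(2011, 9, 30), 100_000, False),
]

titles = [s.title for s in catalog_entries(studies)]
print(titles)
```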
