The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

PLoS Biology | Nov 2003

The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.
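The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is a back-of-envelope calculation, not an analysis from the paper: it uses only the rounded numbers given above, and the variable names are chosen here for illustration.

```python
# Back-of-envelope check of the rounded figures quoted in the abstract.
# Variable names are illustrative; all values are approximate.

# Gene categories for C. briggsae.
orthologs = 12_200   # clear C. elegans orthologs
homologs = 6_500     # one or more detectable C. elegans homologs
unmatched = 800      # no detectable match in C. elegans
total_genes = orthologs + homologs + unmatched

# Genome sizes (Mbp) and repeat fractions.
briggsae_mbp, elegans_mbp = 104.0, 100.3
briggsae_repeat_frac, elegans_repeat_frac = 0.224, 0.165

briggsae_repeat_mbp = briggsae_mbp * briggsae_repeat_frac
elegans_repeat_mbp = elegans_mbp * elegans_repeat_frac

size_gap = briggsae_mbp - elegans_mbp                   # difference in genome size
repeat_gap = briggsae_repeat_mbp - elegans_repeat_mbp   # difference in repeat content

print(f"Total predicted genes: {total_genes:,}")
print(f"Repeat content: {briggsae_repeat_mbp:.1f} vs {elegans_repeat_mbp:.1f} Mbp")
print(f"Genome size gap: {size_gap:.1f} Mbp, repeat gap: {repeat_gap:.1f} Mbp")
```

The three gene categories sum to the roughly 19,500 predicted genes. With these rounded inputs, the extra repeat content in C. briggsae (about 6.7 Mbp) is larger than the overall size difference (about 3.7 Mbp), which is in line with the statement that repeats account for the difference in genome size.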

PubMed ID: 14624247

Mesh terms: Animals | Biological Evolution | Caenorhabditis | Caenorhabditis elegans | Chromosome Mapping | Chromosomes, Artificial, Bacterial | Cluster Analysis | Codon | Conserved Sequence | Evolution, Molecular | Exons | Gene Library | Genome | Genomics | Interspersed Repetitive Sequences | Introns | MicroRNAs | Models, Genetic | Models, Statistical | Molecular Sequence Data | Multigene Family | Open Reading Frames | Physical Chromosome Mapping | Plasmids | Protein Structure, Tertiary | Proteins | RNA | RNA, Ribosomal | RNA, Spliced Leader | RNA, Transfer | Sequence Analysis, DNA | Species Specificity

Associated grants

  • Agency: NIGMS NIH HHS, Id: T32 GM007754
  • Agency: NHGRI NIH HHS, Id: 5U01 HG02042
  • Agency: NHGRI NIH HHS, Id: P41 HG002223
  • Agency: NHGRI NIH HHS, Id: P01 HG000956
  • Agency: NHGRI NIH HHS, Id: 5P01 HG00956
  • Agency: NHGRI NIH HHS, Id: P41 HG02223
  • Agency: NIGMS NIH HHS, Id: R01 GM42432
  • Agency: NIGMS NIH HHS, Id: R01 GM042432
  • Agency: NIGMS NIH HHS, Id: T32 GM07754-22

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Ensembl

A collection of genome databases for vertebrates and other eukaryotic species with DNA and protein sequence search capabilities. The goal of Ensembl is to automatically annotate the genome, integrate this annotation with other available biological data and make the data publicly available via the web. The range of available data has also expanded to include comparative genomics, variation and regulatory data. Ensembl allows users to: upload and analyze data and save it to an Ensembl account; search for a DNA or protein sequence using BLAST or BLAT; fetch desired data from the public database, using the Perl API; download the databases via FTP in FASTA, MySQL and other formats; and mine Ensembl with BioMart and export sequences or tables in text, HTML, or Excel format. The DNA sequences and assemblies used in the Ensembl genebuild are provided by various projects around the world. Ensembl has entered into an agreement with UCSC and NCBI with regard to sequence identifiers in order to improve consistency between the data provided by different genome browsers. The site also links to the Ensembl blog with updates on new species and sequences as they are added to the database.

tool

InterPro

Service providing functional analysis of proteins by classifying them into families and predicting domains and important sites. It combines protein signatures from a number of member databases into a single searchable resource, capitalizing on their individual strengths to produce a powerful integrated database and diagnostic tool. This integrated database of predictive protein signatures is used for the classification and automatic annotation of proteins and genomes. InterPro classifies sequences at superfamily, family and subfamily levels, predicting the occurrence of functional domains, repeats and important sites. InterPro adds in-depth annotation, including GO terms, to the protein signatures. The data can be accessed programmatically via Web Services. The member databases use a number of approaches:

  • ProDom: provider of sequence clusters built from UniProtKB using PSI-BLAST.
  • PROSITE patterns: provider of simple regular expressions.
  • PROSITE and HAMAP profiles: providers of sequence matrices.
  • PRINTS: provider of fingerprints, which are groups of aligned, unweighted Position Specific Sequence Matrices (PSSMs).
  • PANTHER, PIRSF, Pfam, SMART, TIGRFAMs, Gene3D and SUPERFAMILY: providers of hidden Markov models (HMMs).

Your contributions are welcome. You are encouraged to use the "Add your annotation" button on InterPro entry pages to suggest updated or improved annotation for individual InterPro entries.
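The description above notes that PROSITE patterns are simple regular expressions. As a rough illustration of what that means, the sketch below translates a PROSITE-style pattern into a Python regular expression. It is not code from InterPro or PROSITE: the function name `prosite_to_regex` is hypothetical, and only the basic syntax is handled (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` repeat counts, and `<` / `>` anchors outside brackets).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string.

    Hypothetical helper for illustration; anchors inside square brackets
    and other less common syntax are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # '<' anchors the match to the N-terminus, '>' to the C-terminus.
        anchor_start = element.startswith("<")
        anchor_end = element.endswith(">")
        element = element.strip("<>")

        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if m is None:
            raise ValueError(f"Unsupported pattern element: {element!r}")
        core, lo, hi = m.groups()

        if core == "x":                                    # any residue
            token = "."
        elif core.startswith("[") and core.endswith("]"):  # allowed residues
            token = core
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            token = "[^" + core[1:-1] + "]"
        elif len(core) == 1 and core.isupper():            # single residue
            token = core
        else:
            raise ValueError(f"Unsupported pattern element: {element!r}")

        if lo:
            token += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(("^" if anchor_start else "") + token + ("$" if anchor_end else ""))
    return "".join(parts)


# The widely cited N-glycosylation motif: N, not P, S or T, not P.
regex = prosite_to_regex("N-{P}-[ST]-{P}")
print(regex)
print(bool(re.search(regex, "AANGSA")))  # contains N-G-S-A
print(bool(re.search(regex, "AANPSA")))  # P after N is excluded
```

Translating to a standard regex keeps the example short and lets the `re` module do the matching; a production scanner would also need to handle overlapping matches and the remaining PROSITE syntax.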

tool