The Capsaspora genome reveals a complex unicellular prehistory of animals.

Nature Communications | 2013

To reconstruct the evolutionary origin of multicellular animals from their unicellular ancestors, the genome sequences of diverse unicellular relatives are essential. However, only the genome of the choanoflagellate Monosiga brevicollis has been reported to date. Here we completely sequence the genome of the filasterean Capsaspora owczarzaki, the closest known unicellular relative of metazoans besides choanoflagellates. Analyses of this genome alter our understanding of the molecular complexity of metazoans' unicellular ancestors showing that they had a richer repertoire of proteins involved in cell adhesion and transcriptional regulation than previously inferred only with the choanoflagellate genome. Some of these proteins were secondarily lost in choanoflagellates. In contrast, most intercellular signalling systems controlling development evolved later concomitant with the emergence of the first metazoans. We propose that the acquisition of these metazoan-specific developmental systems and the co-option of pre-existing genes drove the evolutionary transition from unicellular protists to metazoans.

PubMed ID: 23942320

Associated grants

  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-05
  • Agency: NHGRI NIH HHS, United States
    Id: U54 HG003067
  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-09
  • Agency: European Research Council, International
    Id: 206883
  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-07
  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-06
  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-10
  • Agency: NHGRI NIH HHS, United States
    Id: HG003067-08

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


UniProt (tool)

RRID:SCR_002380

Collection of protein sequence and functional information. Resource for protein sequence and annotation data. Consortium for preservation of the UniProt databases: UniProt Knowledgebase (UniProtKB), UniProt Reference Clusters (UniRef), UniProt Archive (UniParc), and UniProt Proteomes. Collaboration between the European Bioinformatics Institute (EMBL-EBI), the SIB Swiss Institute of Bioinformatics, and the Protein Information Resource. Swiss-Prot is a curated subset of UniProtKB.

InterPro (tool)

RRID:SCR_006695

Service providing functional analysis of proteins by classifying them into families and predicting domains and important sites. It combines protein signatures from a number of member databases into a single searchable resource, capitalizing on their individual strengths to produce a powerful integrated database and diagnostic tool. This integrated database of predictive protein signatures is used for the classification and automatic annotation of proteins and genomes. InterPro classifies sequences at superfamily, family and subfamily levels, predicting the occurrence of functional domains, repeats and important sites. InterPro adds in-depth annotation, including GO terms, to the protein signatures. You can access the data programmatically via Web Services. The member databases use a number of approaches:

  • ProDom: provider of sequence clusters built from UniProtKB using PSI-BLAST.
  • PROSITE patterns: provider of simple regular expressions.
  • PROSITE and HAMAP profiles: providers of sequence matrices.
  • PRINTS: provider of fingerprints, which are groups of aligned, unweighted Position Specific Sequence Matrices (PSSMs).
  • PANTHER, PIRSF, Pfam, SMART, TIGRFAMs, Gene3D and SUPERFAMILY: providers of hidden Markov models (HMMs).

Your contributions are welcome. You are encouraged to use the "Add your annotation" button on InterPro entry pages to suggest updated or improved annotation for individual InterPro entries.

NCBI BioProject (tool)

RRID:SCR_004801

Database of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project. It is a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. Submissions are supported by a web-based Submission Portal. The database facilitates organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives, and serves as a central portal by which to inform users of data availability. BioProject records link to corresponding data stored in archival repositories. The BioProject resource is a redesigned, expanded replacement of the NCBI Genome Project resource. The redesign adds tracking of several data elements, including more precise information about a project's scope, material, and objectives. Genome Project identifiers are retained in the BioProject as the ID value for a record, and an Accession number has been added. Database content is exchanged with other members of the International Nucleotide Sequence Database Collaboration (INSDC). BioProject is accessible via FTP.

Hmmer (tool)

RRID:SCR_005305

Tool for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST.

Blast2GO (tool)

RRID:SCR_005828

An all-in-one tool for functional annotation of (novel) sequences and the analysis of annotation data. Blast2GO (B2G) joins in one universal application similarity-search-based GO annotation and functional analysis. B2G offers the possibility of direct statistical analysis on gene function information and visualization of relevant functional features on a highlighted GO directed acyclic graph (DAG). Furthermore, B2G includes various statistics charts summarizing the results obtained at BLASTing, GO-mapping, annotation and enrichment analysis (Fisher's Exact Test). All analysis process steps are configurable, and data import and export are supported at any stage. The application also accepts pre-existing BLAST or annotation files and takes them to subsequent steps. The tool offers a very suitable platform for high-throughput functional genomics research in non-model species. B2G is a species-independent, intuitive and interactive desktop application which allows monitoring and comprehending the whole annotation and analysis process, supported by additional features like GO Slim integration, evidence code (EC) consideration, a Batch-Mode or GO-Multilevel-Pies. Platform: Windows compatible, Mac OS X compatible, Linux compatible, Unix compatible

RAxML (tool)

RRID:SCR_006086

Software program for phylogenetic analyses of large datasets under maximum likelihood.

MAFFT (tool)

RRID:SCR_011811

Multiple alignment software package for amino acid or nucleotide sequences. Can align up to 500 sequences or a maximum file size of 1 MB. The first version of MAFFT used an algorithm based on progressive alignment, in which sequences were clustered with the help of the Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.

Trace Archive (tool)

RRID:SCR_013788

An online repository which houses sequencing data from gel and capillary platforms (such as Applied Biosystems ABI 3730®). Most sequences are derived from Whole Genome Shotgun sequencing. Large data sets as well as only a few sequences can be obtained.

PASA (tool)

RRID:SCR_014656

Gene structure annotation and analysis tool that uses spliced alignments of expressed transcript sequences to automatically model gene structures. It also incorporates gene structures based on transcript alignments into existing gene structure annotations. It is one component of a larger eukaryotic annotation pipeline implemented at the Broad Institute.

EVidenceModeler (tool)

RRID:SCR_014659

Software tool for automated eukaryotic gene structure annotation that reports eukaryotic gene structures as a weighted consensus of all available evidence. Used to combine ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. Inputs include genome sequence, gene predictions, and alignment data (in GFF3 format).

GeneWise (tool)

RRID:SCR_015054

Gene alignment tool from the EBI which predicts gene structure using similar protein sequences. See also the associated GenomeWise tool.
