Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome.

eLife | 2016

The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome have shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.

PubMed ID: 27892853

Research resources used in this publication

Antibodies used in this publication

None found

Associated grants

  • Agency: Medical Research Council, United Kingdom
    Id: MC_U105178788
  • Agency: NIGMS NIH HHS, United States
    Id: R01 GM077582
  • Agency: NHGRI NIH HHS, United States
    Id: U54 HG003067

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


NCBI BioProject (tool)

RRID:SCR_004801

Database of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project. It is a searchable collection of complete and incomplete (in-progress) large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. Submissions are supported by a web-based Submission Portal. The database facilitates organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives, and serves as a central portal by which to inform users of data availability. BioProject records link to corresponding data stored in archival repositories. The BioProject resource is a redesigned, expanded replacement of the NCBI Genome Project resource. The redesign adds tracking of several data elements, including more precise information about a project's scope, material, and objectives. Genome Project identifiers are retained in the BioProject as the ID value for a record, and an Accession number has been added. Database content is exchanged with other members of the International Nucleotide Sequence Database Collaboration (INSDC). BioProject is accessible via FTP.

NCBI Sequence Read Archive (SRA) (tool)

RRID:SCR_004891

Repository of raw sequencing data from next-generation sequencing platforms including Roche 454 GS System, Illumina Genome Analyzer, Applied Biosystems SOLiD System, Helicos Heliscope, Complete Genomics, and Pacific Biosciences SMRT. In addition to raw sequence data, SRA now stores alignment information in the form of read placements on a reference sequence. Data submissions are welcome. Archive of high-throughput sequencing data, part of the international partnership of archives (INSDC) at NCBI, European Bioinformatics Institute and DNA Database of Japan. Data submitted to any of these three organizations are shared among them.

Tetrahymena Stock Center (tool)

RRID:SCR_008362

Centralized repository and distribution site for variety of Tetrahymena strains and species. Maintains diverse array of wild type, mutant, and genetically engineered strains of T. thermophila, the most commonly used laboratory species, and variety of other species derived from both laboratory maintained stocks and wild isolates. All stocks are stored in liquid nitrogen to maintain genetic integrity and prevent senescence. In addition to providing worldwide access to strains currently in collection, TSC continually upgrades collection by accepting deposition of newly developed laboratory strains and well characterized wild isolates collected from clearly defined natural sites.

Ambion Inc. (tool)

RRID:SCR_008406

A division of Applied Biosystems selling products for the isolation, detection, quantification, amplification, and characterization of RNA.

BWA (software resource)

RRID:SCR_010910

Software for aligning sequencing reads against a large reference genome. Consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first is designed for sequence reads up to 100 bp, and the other two for longer sequences ranging from 70 bp to 1 Mbp.

MUMmerGPU (data processing software)

RRID:SCR_001200

High-throughput DNA sequence alignment program that runs on nVidia G80-class GPUs. Aligns sequences in parallel on the video card to accelerate the widely used serial CPU program MUMmer.

RepeatMasker (software resource)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam ( profile HMM library ) and RepBase ( consensus sequence library ).
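
The masking step described above can be illustrated with a minimal Python sketch. It is not RepeatMasker's implementation: the function name `mask_intervals` and the input format (0-based, half-open `(start, end)` pairs) are assumptions made for illustration, and the sketch only shows how already-annotated repeat intervals might be replaced by Ns.

```python
def mask_intervals(seq, repeats, mask_char="N"):
    """Return seq with every annotated interval replaced by mask_char.

    repeats is a list of (start, end) pairs, 0-based and half-open
    (a hypothetical input format chosen for this sketch).
    """
    chars = list(seq)
    for start, end in repeats:
        # Clamp to the sequence bounds so out-of-range annotations are harmless.
        for i in range(max(0, start), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


masked = mask_intervals("ACGTACGTAC", [(2, 5), (8, 20)])
print(masked)
```

Masking with a fixed character keeps sequence coordinates unchanged, so downstream annotations still line up with the original sequence.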

BLASTN (data analysis service)

RRID:SCR_001598

Web application to search nucleotide databases using a nucleotide query. Algorithms: blastn, megablast, discontiguous megablast.

FASTA (software resource)

RRID:SCR_011819

Software package for DNA and protein sequence alignment to find regions of local or global similarity between Protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence.

EMBOSS (software resource)

RRID:SCR_008493

Software analysis package for molecular biology community. Automatically copes with data in variety of formats and allows transparent retrieval of sequence data from web. Libraries are provided with package. Provides toolkit for creating bioinformatics applications or workflows. Provides set of sequence analysis programs. Provided programs cover areas such as sequence alignment, rapid database searching with sequence patterns, protein motif identification, nucleotide sequence pattern analysis, codon usage analysis for small genomes, rapid identification of sequence patterns in large scale sequence sets, and presentation tools for publication.

TIGRFAMS (data or information resource)

RRID:SCR_005493

Consists of curated multiple sequence alignments, Hidden Markov Models (HMMs) for protein sequence classification, and associated information designed to support automated annotation of (mostly prokaryotic) proteins. Starting with release 10.0, TIGRFAMs models use HMMER3, which provides excellent search speed as well as exquisite search sensitivity. See the "TIGRFAMs Complete Listing" page to review the accession, protein name, model type, and EC number (if assigned) of all models. TIGRFAMs is a member database in InterPro. The HMM libraries and supporting files are available to download and use for free from our FTP site.

SB1969 (organism)

RRID:TSC_SD00701

Tetrahymena thermophila with name SB1969 from TSC.

EVidenceModeler (software resource)

RRID:SCR_014659

Software tool for automated eukaryotic gene structure annotation that reports eukaryotic gene structures as a weighted consensus of all available evidence. Used to combine ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. Inputs include genome sequence, gene predictions, and alignment data (in GFF3 format).

Analysis and Annotation Tool (AAT) Package (software resource)

RRID:SCR_014658

Genome tool for analyzing and annotating large genomic sequences containing introns. It includes a program for comparing the query sequence with a protein database and another for comparing the query with a cDNA database. The database search program identifies regions of the query sequence that are similar to a database sequence. Then the alignment program constructs an optimal alignment for each region and the database sequence, as well as reports the coordinates of exons in the query sequence. Pairwise alignments of the query sequence with protein and cDNA database sequences are combined into multiple sequence alignments, which provide a view of all protein and cDNA sequences matching a query region.

Augustus (software resource)

RRID:SCR_008417

Software for gene prediction in eukaryotic genomic sequences. Serves as a basis for further steps in the analysis of sequenced and assembled eukaryotic genomes.

PASA (software resource)

RRID:SCR_014656

Gene structure annotation and analysis tool that uses spliced alignments of expressed transcript sequences to automatically model gene structures. It also incorporates gene structures based on transcript alignments into existing gene structure annotations. It is one component of a larger eukaryotic annotation pipeline implemented at the Broad Institute.

Genezilla (software resource)

RRID:SCR_014657

Reconfigurable eukaryotic gene finder based on the Generalized Hidden Markov Model framework. The run time and memory requirements are linear in the sequence length. Genezilla utilizes Interpolated Markov Models (IMMs), Maximal Dependence Decomposition (MDD), and includes states for signal peptides, branch points, TATA boxes, and CAP sites.

Trinity (software resource)

RRID:SCR_013048

Software for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.

TBLASTX (web application)

RRID:SCR_011823

A web-based tool used to search translated nucleotide databases using a translated nucleotide query.

Taxonomer (software resource)

RRID:SCR_014655

Genome sequence classification tool based on kmer. The code allows users to build nucleotide databases and protein databases, classify reads, construct binner databases, and create taxonomic relationship files. The website provides information on how to perform these actions with code, though users can access a web-based Taxonomer if they need/want to.

REPCLASS (software resource)

RRID:SCR_014654

Tool for the classification of known transposable elements in eukaryotic genomes. It can be combined with ab initio repeat finding in order to recover contrasting transposable element landscapes between species.

Sequencher (software resource)

RRID:SCR_001528

Software for Next-Generation DNA sequencing, Sanger DNA analysis, and RNA sequencing. It contains sequence analysis tools which include reference-guided alignments, de novo assembly, variant calling, and SNP analyses. It has integrated the Cufflinks suite for in-depth transcript analysis and differential gene expression of RNA-Seq data.

RepeatScout (software resource)

RRID:SCR_014653

Algorithm used to identify de novo repeat families in newly sequenced genomes. Repeat libraries for C. briggsae, M. musculus (X chromosome), R. norvegicus (X chromosome), armadillo, H. sapiens (X chromosome), and various other mammals created using RepeatScout are available on the main site.

SAMTOOLS (software resource)

RRID:SCR_002105

The original SAMTOOLS package has been split into three separate repositories: Samtools, BCFtools and HTSlib. Samtools manipulates next-generation sequencing data and is used for reading, writing, editing, indexing, and viewing nucleotide alignments in SAM, BAM, and CRAM formats. BCFtools is used for reading and writing BCF2, VCF, and gVCF files and for calling, filtering, and summarising SNP and short indel sequence variants. HTSlib is used for reading and writing high-throughput sequencing data.

BEDTools (software resource)

RRID:SCR_006646

A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
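
The interval intersection mentioned above can be sketched in a few lines of Python. This is a conceptual illustration only, not BEDTools code: the function name `intersect` and the tuple layout `(chrom, start, end)` are assumptions, with coordinates treated as 0-based and half-open as in the BED format.

```python
def intersect(a, b):
    """Return the overlapping segments between two lists of intervals.

    Each interval is a (chrom, start, end) tuple with 0-based,
    half-open coordinates (hypothetical layout for this sketch).
    """
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # non-empty overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


features = [("chr1", 100, 200), ("chr1", 300, 400)]
regions = [("chr1", 150, 350), ("chr2", 100, 200)]
print(intersect(features, regions))
```

The nested loop is quadratic in the number of intervals; it is chosen here for readability, and production tools use sorted sweeps or indexing instead.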

BLASTX (data analysis service)

RRID:SCR_001653

Web application to search protein databases using a translated nucleotide query. Translated BLAST services are useful when trying to find homologous proteins to a nucleotide coding region. Blastx compares translational products of the nucleotide query sequence to a protein database. Because blastx translates the query sequence in all six reading frames and provides combined significance statistics for hits to different frames, it is particularly useful when the reading frame of the query sequence is unknown or it contains errors that may lead to frame shifts or other coding errors. Thus blastx is often the first analysis performed with a newly determined nucleotide sequence and is used extensively in analyzing EST sequences. This search is more sensitive than nucleotide blast since the comparison is performed at the protein level.
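
The six-frame translation described above can be illustrated with a short Python sketch. It is not BLASTX code: the function names are hypothetical, and the standard genetic code is assumed here, which is a simplification for illustration.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1..+3 and -1..-3."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


for name, protein in six_frame_translations("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six protein strings would then be compared against the protein database, which is why the search still works when the reading frame of the query is unknown.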

T. borealis X4H2 (organism)

RRID:TSC_SD01609

Tetrahymena borealis with name T. borealis X4H2 from TSC.

T. elliotti 4EA (organism)

RRID:TSC_SD01607

Tetrahymena elliotti with name T. elliotti 4EA from TSC.

T. malaccensis 23b (organism)

RRID:TSC_SD01730

Tetrahymena malaccensis with name T. malaccensis 23b from TSC.

MUSCLE (software resource)

RRID:SCR_011812

Multiple sequence alignment method with reduced time and space complexity. Multiple sequence alignment with high accuracy and high throughput. Data analysis service for multiple sequence comparison by log-expectation.

PhyML (web application)

RRID:SCR_014629

Web phylogeny server based on the maximum-likelihood principle.

NCBI BLAST (software resource)

RRID:SCR_004870

Web search tool to find regions of similarity between biological sequences. Program compares nucleotide or protein sequences to sequence databases and calculates statistical significance. Used for identifying homologous sequences.

JBrowse (software resource)

RRID:SCR_001004

A high-performance visualization tool for interactive exploration of large, integrated genomic datasets written primarily in JavaScript. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

CU427.4 (organism)

RRID:TSC_SD00715

Tetrahymena thermophila with name CU427.4 from TSC.

CU428.2 (organism)

RRID:TSC_SD00178

Tetrahymena thermophila with name CU428.2 from TSC.

ALLPATHS-LG (software resource)

RRID:SCR_010742

Whole-genome shotgun assembler that can generate high-quality genome assemblies using short reads (~100 bp) such as those produced by the new generation of sequencers.

SB210-E (organism)

RRID:TSC_SD01539

Tetrahymena thermophila with name SB210-E from TSC.
