Regulation of gene expression at the level of transcription is a major control point in many biological processes. Transcription factors (TFs) can activate or repress the transcription of target genes, and vascular plant genomes devote approximately 7% of their coding capacity to TFs. Global analysis of TFs has been performed for only three complete higher plant genomes: Arabidopsis (Arabidopsis thaliana), poplar (Populus trichocarpa) and rice (Oryza sativa). To date, no large-scale analysis of TFs has been made for a member of the Solanaceae, one of the most important families of vascular plants. To fill this void, we have analysed tobacco (Nicotiana tabacum) TFs using a dataset of 1,159,022 gene-space sequence reads (GSRs) obtained by methylation filtering of the tobacco genome. An analytical pipeline was developed to isolate TF sequences from the GSR dataset. This involved multiple (typically 10-15) independent searches with different versions of the TF family-defining domain(s) (normally the DNA-binding domain), followed by assembly into contigs and verification. Our analysis revealed that tobacco contains a minimum of 2,513 TFs representing all of the 64 well-characterised plant TF families. The number of TFs in tobacco is higher than previously reported for Arabidopsis and rice.
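The abstract above describes running several independent searches per TF family and combining the results before contig assembly. As a rough illustration only, the pooling step might be sketched as follows. The function name `pool_hits`, the query and read identifiers, and the E-value cutoff of 1e-5 are all assumptions made for this example; the source does not state the search tool settings or thresholds that were actually used.

```python
from collections import defaultdict


def pool_hits(searches, max_evalue=1e-5):
    """Merge hits from several independent searches into one candidate set.

    searches: mapping of query name -> list of (read_id, e_value) pairs.
    max_evalue: hypothetical cutoff; hits with a larger E-value are dropped.
    Returns a mapping of read_id -> sorted list of the queries that hit it.
    """
    pooled = defaultdict(set)
    for query, hits in searches.items():
        for read_id, evalue in hits:
            if evalue <= max_evalue:
                pooled[read_id].add(query)
    return {read_id: sorted(queries) for read_id, queries in pooled.items()}


# Hypothetical hit lists for two versions of one family-defining domain.
searches = {
    "WRKY_domain_v1": [("GSR_0001", 1e-20), ("GSR_0002", 1e-3)],
    "WRKY_domain_v2": [("GSR_0001", 1e-18), ("GSR_0003", 1e-9)],
}

candidates = pool_hits(searches)
for read_id in sorted(candidates):
    print(read_id, candidates[read_id])
```

Each read appears once in the pooled set regardless of how many queries found it, so the candidate list passed on to assembly contains no duplicates. Recording which queries hit each read is a design choice for this sketch, since it makes weakly supported candidates easy to spot during verification.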
PubMed ID: 18221524
Database of transcription factor sequences from a single plant species, tobacco (over 2,500 genes). The following can be searched:
- 1,159,022 gene-space sequence reads (GSRs) obtained by methylation filtering from the Tobacco Genome Initiative (TGI).
- The DFCI Tobacco Gene Index (Release 4.0, July 5, 2008), which contains 163,524 tobacco EST sequences and 2,288 expressed transcripts (ETs).
- The complete TOBFAC database of tobacco transcription factors.
Multiple libraries can also be searched in a single query. The developers have incorporated tools for downloading all of the sequences from the BLAST results, as well as a contig tool to assemble any or all of the resulting sequences. They are also improving the TOBFAC sequences by extending the original contigs using a contig extension tool designed by Ryan Thompson, which has allowed them to refine the predicted genes. The refined sequences will be updated on a gene family basis as the improved data become available.