Transcriptome Assembly and Systematic Identification of Novel Cytochrome P450s in Taxus chinensis.

Frontiers in plant science | 2017

Taxus spp. is a highly valuable medicinal plant with multiple pharmacological effects on various cancers. Cytochrome P450s (CYP450s) play important roles in the biosynthesis of active compounds in Taxus spp., such as the famous diterpenoid, Taxol. However, some specific CYP450 enzymes involved in the biosynthesis of Taxol remain unknown, and the systematic identification of CYP450s in Taxus has not been reported. In this study, 118 full-length and 175 partial CYP450 genes were identified in Taxus chinensis transcriptomes. The 118 full-length genes were divided into 8 clans and 29 families. The CYP71 clan included all A-type genes (52) belonging to 11 families. The other seven clans possessed 18 families containing 66 non-A-type genes. Two new gymnosperm-specific families were discovered, and were named CYP864 and CYP947 respectively. Protein sequence alignments revealed that all of the T. chinensis CYP450s hold distinct conserved domains. The expression patterns of all 118 CYP450 genes during the long-time subculture and MeJA elicitation were analyzed. Additionally, the expression levels of 15 novel CYP725 genes in different Taxus species were explored. Considering all the evidence, 6 CYP725s were identified to be candidates for Taxol biosynthesis. The cis-regulatory elements involved in the transcriptional regulation were also identified in the promoter regions of CYP725s. This study presents a comprehensive overview of the CYP450 gene family in T. chinensis and can provide important insights into the functional gene studies of Taxol biosynthesis.

PubMed ID: 28878800

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


GSNAP (tool)

RRID:SCR_005483

Software to align single- and paired-end reads as short as 14 nt and of arbitrarily long length. Can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads, using probabilistic models or a database of known splice sites. Permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite-treated DNA for study of methylation state.

Pfam (tool)

RRID:SCR_004726

A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high-quality, manually curated families. They are supplemented by entries generated automatically using the ADDA database; these automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries that are related by similarity of sequence, structure or profile HMM).

Hmmer (tool)

RRID:SCR_005305

Tool for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST.

CD-HIT (tool)

RRID:SCR_007105

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software program for clustering biological sequences, with applications such as making non-redundant databases, finding duplicates, identifying protein families, filtering sequence errors and improving sequence assembly. It is very fast and can handle extremely large databases. CD-HIT helps to significantly reduce the computational and manual effort in many sequence analysis tasks, and aids in understanding the data structure and correcting the bias within a dataset. The CD-HIT package has CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, CD-HIT-OTU and over a dozen scripts.

* CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold.
* CD-HIT-2D (CD-HIT-EST-2D) compares 2 datasets and identifies the sequences in db2 that are similar to db1 above a threshold.
* CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads.
* CD-HIT-OTU clusters rRNA tags into OTUs.

The usage of other programs and scripts can be found in the CD-HIT user's guide. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's Lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute).

GMAP (tool)

RRID:SCR_008992

THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. A software program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.

KEGG (tool)

RRID:SCR_012773

Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of the cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as a reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

ExPASy Bioinformatics Resource Portal (tool)

RRID:SCR_012880

Portal that provides access to scientific databases and software tools (i.e., resources) in different areas of the life sciences, including proteomics, genomics, phylogeny, systems biology, population genetics, and transcriptomics. It contains resources from many different SIB groups as well as external institutions.
