WebGestalt: an integrated system for exploring gene sets in various biological contexts.

Nucleic acids research | Jul 1, 2005

High-throughput technologies have led to the rapid generation of large-scale datasets about genes and gene products. These technologies have also shifted our research focus from 'single genes' to 'gene sets'. We have developed a web-based integrated data mining system, WebGestalt (http://genereg.ornl.gov/webgestalt/), to help biologists in exploring large sets of genes. WebGestalt is composed of four modules: gene set management, information retrieval, organization/visualization, and statistics. The management module uploads, saves, retrieves and deletes gene sets, as well as performs Boolean operations to generate the unions, intersections or differences between different gene sets. The information retrieval module currently retrieves information for up to 20 attributes for all genes in a gene set. The organization/visualization module organizes and visualizes gene sets in various biological contexts, including Gene Ontology, tissue expression pattern, chromosome distribution, metabolic and signaling pathways, protein domain information and publications. The statistics module recommends and performs statistical tests to suggest biological areas that are important to a gene set and warrant further investigation. In order to demonstrate the use of WebGestalt, we have generated 48 gene sets with genes over-represented in various human tissue types. Exploration of all the 48 gene sets using WebGestalt is available for the public at http://genereg.ornl.gov/webgestalt/wg_enrich.php.
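The Boolean operations of the management module and the kind of over-representation test the statistics module might run can be sketched as follows. This is a minimal illustration, not WebGestalt's implementation: the abstract does not say which statistical tests are used, so the hypergeometric tail probability here is an assumed stand-in, and the function names `boolean_ops` and `hypergeom_tail` are hypothetical.

```python
from math import comb


def boolean_ops(set_a, set_b):
    """Return the union, intersection and difference (A minus B) of two gene sets."""
    a, b = set(set_a), set(set_b)
    return {"union": a | b, "intersection": a & b, "difference": a - b}


def hypergeom_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw (assumed test, not named in the abstract).

    N: genes in the reference set
    K: reference genes annotated to the category of interest
    n: genes in the uploaded gene set
    k: genes in the uploaded set annotated to the category
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Toy gene symbols chosen only for illustration.
    ops = boolean_ops({"TP53", "BRCA1", "EGFR"}, {"EGFR", "MYC"})
    print(sorted(ops["union"]), sorted(ops["intersection"]), sorted(ops["difference"]))
    # A small p-value would suggest the category is over-represented in the gene set.
    print(hypergeom_tail(k=8, n=50, K=100, N=20000))
```

A real analysis would also need a correction for testing many categories at once, which this sketch leaves out.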

Pubmed ID: 15980575

Mesh terms: Computer Graphics | Data Interpretation, Statistical | Databases, Genetic | Gene Expression | Genes | Genomics | Humans | Internet | Proteomics | Software | Systems Integration | Tissue Distribution | User-Computer Interface

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Plant enzymes and biochemical pathways database

The Plant Metabolic Network (PMN) is a collaborative project among databases and biochemists with the common goal of building a broad network of plant metabolic pathway databases. A central feature of the PMN is PlantCyc, a comprehensive plant biochemical pathway database containing curated information from the literature and computational analyses about the genes, enzymes, compounds, reactions, and pathways involved in primary and secondary metabolism. The PMN aims to bring together biochemical pathway databases and research communities focused on plant metabolism, and to generate an infrastructure for drawing together diverse sources of plant metabolism information.

tool

Ensembl

A collection of genome databases for vertebrates and other eukaryotic species with DNA and protein sequence search capabilities. The goal of Ensembl is to automatically annotate the genome, integrate this annotation with other available biological data and make the data publicly available via the web. The range of available data has also expanded to include comparative genomics, variation and regulatory data. Ensembl allows users to: upload and analyze data and save it to an Ensembl account; search for a DNA or protein sequence using BLAST or BLAT; fetch desired data from the public database, using the Perl API; download the databases via FTP in FASTA, MySQL and other formats; and mine Ensembl with BioMart and export sequences or tables in text, HTML, or Excel format. The DNA sequences and assemblies used in the Ensembl genebuild are provided by various projects around the world. Ensembl has entered into an agreement with UCSC and NCBI with regard to sequence identifiers in order to improve consistency between the data provided by different genome browsers. The site also links to the Ensembl blog with updates on new species and sequences as they are added to the database.

tool

GO

A community-based bioinformatics resource consisting of three structured controlled vocabularies (ontologies) for the annotation of gene products with respect to their molecular function, cellular component, and biological process in a species-independent manner. This initiative to standardize the representation of gene and gene product attributes across species and databases addresses the need for consistent descriptions of gene products in different databases. The Gene Ontology project encourages input from the community into both the content of the GO and annotation using GO. There are three separate aspects to this effort: first, project members write and maintain the ontologies themselves; second, they make cross-links between the ontologies and the genes and gene products in the collaborating databases; and third, they develop tools that facilitate the creation, maintenance and use of ontologies. The controlled vocabularies are structured so that users can query them at different levels: for example, users can use GO to find all the gene products in the mouse genome that are involved in signal transduction, or they can zoom in on all the receptor tyrosine kinases. This structure also allows annotators to assign properties to gene products at different levels, depending on how much is known about a gene product.
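Querying at different levels can be illustrated with a toy hierarchy: a query at a broad term collects gene products annotated to that term or to any more specific term beneath it. The term names, gene names and the helpers `descendants` and `genes_for` below are all made up for illustration; they are not real GO identifiers or part of any GO tool.

```python
def descendants(term, children):
    """Collect every term reachable below `term` in a parent -> children mapping."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def genes_for(term, children, annotations):
    """Return genes annotated to `term` or to any of its descendant terms."""
    terms = {term} | descendants(term, children)
    return {gene for gene, gene_terms in annotations.items() if terms & set(gene_terms)}


# Hypothetical toy hierarchy and annotations, not real GO data.
CHILDREN = {
    "signal_transduction": ["rtk_signaling", "gpcr_signaling"],
}
ANNOTATIONS = {
    "geneA": ["rtk_signaling"],
    "geneB": ["gpcr_signaling"],
    "geneC": ["metabolism"],
}

if __name__ == "__main__":
    print(sorted(genes_for("signal_transduction", CHILDREN, ANNOTATIONS)))  # broad query
    print(sorted(genes_for("rtk_signaling", CHILDREN, ANNOTATIONS)))        # narrow query
```

The same traversal handles terms with several parents, since each term is visited only once.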

tool

AT&T Labs Research - Software Tools

Software tools that have been developed by AT&T Labs researchers. In addition to the software tools available through Open Source and Non-Commercial licenses as listed on this page, AT&T has additional software and technology solutions available for licensing. Please reference the individual project web pages for specific license agreements. If an available license agreement does not meet your needs, please contact attip (at) att.com for assistance with a customized license.

Open Source Licenses
* AST: Advanced Software Technologies Open Source Collection
* Cdt: Container Data Types Library
* ECharts: A state machine-based programming language
* GGobi: Data visualization for high-dimensional data
* GSDjVu/DjVuDigital: Ghostscript driver to convert PS and PDF files to DjVu files
* Graphviz: Tools for viewing and interacting with graph diagrams
* PADS: Processing Arbitrary Data Streams
* Sfio: Portable library for performing I/O
* UWIN: Unix on Windows 95 and NT Machines
* Vcodex: Software package for data transformation
* WSP: Web Scraping Proxy
* Yoix: The Yoix Scripting Language and Interpreter
* iPlots: Interactive graphics for data analysis in R
* vmalloc: Region Memory Allocator

Non-Commercial Binary Licenses
* BoosTexter: A general purpose machine-learning program
* Hancock: A language for processing large-scale data

Non-Commercial Source Licenses
* ASDT: The AT&T Statistical Dialog Toolkit (ASDT)
* Hancock: A language for processing large-scale data

tool

Cancer Genome Anatomy Project

Project to determine the gene expression profiles of normal, precancer, and cancer cells; the resources it generates are available to the cancer community. Interconnected modules provide access to all CGAP data, bioinformatic analysis tools, and biological resources, allowing the user to find in silico answers to biological questions in a fraction of the time it once took in the laboratory.

* Genes
* Tissues
* Pathways
* RNAi
* Chromosomes
* SAGE Genie
* Tools

tool

RefSeq

Database that provides a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins. It provides a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis (especially RefSeqGene records), expression studies, and comparative analyses. Included are sequences from plasmids, organelles, viruses, archaea, bacteria, and eukaryotes. Each RefSeq is constructed wholly from sequence data submitted to the International Nucleotide Sequence Database Collaboration (INSDC). It is a unique resource because it provides a large, multi-species, curated sequence database representing separate but explicitly linked records from genomes to transcripts and translation products, as appropriate. In contrast to the sequence redundancy found in the public sequence repositories that comprise the INSDC (i.e., NCBI's GenBank, the European Nucleotide Archive, and the DNA Data Bank of Japan), the RefSeq collection aims to provide, for each included species, a complete set of non-redundant, extensively cross-linked, and richly annotated nucleic acid and protein records. It is recognized, however, that the coverage and finishing of public sequence data vary from organism to organism, so intermediate genomic records are provided in some circumstances. The RefSeq collection is available without restriction and can be retrieved in several ways: by searching or following links in NCBI resources (including PubMed, Nucleotide, Protein, Gene, and Map Viewer), by searching with a sequence via BLAST, and by downloading from the RefSeq FTP site.

tool
