iBBiG: iterative binary bi-clustering of gene sets.

MOTIVATION: Meta-analysis of genomics data seeks to identify genes associated with a biological phenotype across multiple datasets; however, merging data from different platforms by their features (genes) is challenging. Meta-analysis using functionally or biologically characterized gene sets simplifies data integration, is biologically intuitive, and is seen as having great potential, but it is an emerging field with few established statistical methods.

RESULTS: We transform gene expression profiles into binary gene set profiles by discretizing the results of gene set enrichment analyses, and apply a new iterative bi-clustering algorithm (iBBiG) to identify groups of gene sets that are coordinately associated with groups of phenotypes across multiple studies. iBBiG is optimized for meta-analysis of large numbers of diverse genomics datasets that may have unmatched samples. It does not require prior knowledge of the number or size of clusters. When applied to simulated data, it outperforms commonly used clustering methods, discovers overlapping clusters of diverse sizes, and is robust in the presence of noise. We apply it to a meta-analysis of breast cancer studies, where iBBiG extracted novel gene set-phenotype associations that predicted tumor metastases within tumor subtypes.

AVAILABILITY: Implemented in the Bioconductor package iBBiG.

CONTACT: aedin@jimmy.harvard.edu
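The discretization step described in the abstract can be illustrated with a minimal sketch. It is not the iBBiG implementation: the 0.05 cutoff, the toy p-values, and the helper names `binarize_enrichment` and `module_density` are all assumptions made for illustration, and the abstract does not state which threshold or scoring rule the authors use. The sketch only shows how a gene-set-by-phenotype matrix of enrichment p-values might become a binary matrix, and how one could check how densely a candidate group of gene sets and phenotypes is filled with associations.

```python
def binarize_enrichment(p_values, alpha=0.05):
    """Convert a gene-set-by-phenotype matrix of p-values into 0/1 calls.

    alpha is an assumed cutoff; the abstract does not specify one.
    """
    return [[1 if p < alpha else 0 for p in row] for row in p_values]


def module_density(binary, rows, cols):
    """Fraction of 1s in the submatrix selected by rows and cols."""
    cells = [binary[r][c] for r in rows for c in cols]
    return sum(cells) / len(cells)


# Toy data: rows are gene sets, columns are phenotype comparisons.
p_values = [
    [0.001, 0.20, 0.03],
    [0.040, 0.50, 0.01],
    [0.700, 0.90, 0.60],
]

binary = binarize_enrichment(p_values)
print(binary)

# Candidate group: gene sets 0 and 1 across phenotypes 0 and 2.
print(module_density(binary, rows=[0, 1], cols=[0, 2]))
```

A bi-clustering method searches for row and column subsets like the one above where associations are concentrated. The real package performs that search iteratively and can return overlapping groups, which this sketch does not attempt.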

PubMed ID: 22789589

MeSH terms: Algorithms | Breast Neoplasms | Cluster Analysis | Computational Biology | Computer Simulation | Female | Gene Expression Profiling | Genomics | Humans | Phenotype

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Bioconductor

A catalog of tools and software packages for the analysis and comprehension of high-throughput genomic data that uses the R statistical programming language. Bioconductor has a development version to which new features and packages are added prior to incorporation in the release. A large number of meta-data packages provide pathway, organism, microarray and other annotations. The broad goals of the Bioconductor project are: to provide widespread access to a broad range of powerful statistical and graphical methods for the analysis of genomic data; to facilitate the inclusion of biological metadata in the analysis of genomic data; to provide a common software platform that enables the rapid development and deployment of extensible, scalable, and interoperable software; and to train researchers on computational and statistical methods for the analysis of genomic data.
