
Predictive Power Estimation Algorithm (PPEA)--a new algorithm to reduce overfitting for genomic biomarker discovery.

PloS one | 2011

Toxicogenomics promises to aid in predicting adverse effects, understanding the mechanisms of drug action or toxicity, and uncovering unexpected or secondary pharmacology. However, modeling adverse effects using high-dimensional, high-noise genomic data is prone to overfitting. Models constructed from such data sets often consist of a large number of genes with no obvious functional relevance to the biological effect the model is intended to predict, which can make the modeling results challenging to interpret. To address these issues, we developed a novel algorithm, the Predictive Power Estimation Algorithm (PPEA), which estimates the predictive power of each individual transcript through an iterative two-way bootstrapping procedure. By enforcing, in each iteration of modeling and testing, that the sample number is larger than the transcript number, PPEA reduces the potential risk of overfitting. We show with three different case studies that: (1) PPEA can quickly derive a reliable rank order of predictive power of individual transcripts in a relatively small number of iterations, (2) the top-ranked transcripts tend to be functionally related to the phenotype they are intended to predict, (3) using only the most predictive top-ranked transcripts greatly facilitates development of a multiplex assay, such as qRT-PCR, as a biomarker, and (4) more importantly, a small number of genes identified from the top-ranked transcripts are highly predictive of phenotype, as their expression changes distinguished adverse from nonadverse effects of compounds in completely independent tests. Thus, we believe that PPEA effectively addresses the overfitting problem and can be used to facilitate genomic biomarker discovery for predictive toxicology and drug responses.
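The abstract describes the procedure only at a high level, so the following Python sketch is an illustration of the general idea and not the authors' implementation. It resamples along both axes (samples and transcripts), keeps each transcript subset smaller than the training sample count, and credits every transcript in a subset with the held-out accuracy of a model built on that subset. The nearest-centroid classifier, the subsampling scheme, the scoring rule, all parameter values, and the names `estimate_predictive_power`, `nearest_centroid_accuracy`, and `make_toy_data` are assumptions made for this example.

```python
import random
from statistics import mean


def nearest_centroid_accuracy(train, test, features):
    """Fit per-class centroids on `train` and return accuracy on `test`.

    Stand-in classifier chosen for brevity; the abstract does not say
    which model PPEA uses.
    """
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    correct = 0
    for x, y in test:
        point = [x[f] for f in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == y
    return correct / len(test)


def estimate_predictive_power(data, n_transcripts, iterations=500,
                              subset_size=3, train_fraction=0.7, seed=0):
    """Return one averaged held-out accuracy score per transcript.

    `data` is a list of (expression_vector, label) pairs.
    """
    rng = random.Random(seed)
    n_train = int(len(data) * train_fraction)
    # Keep more samples than transcripts in every iteration.
    assert subset_size < n_train
    scores = [[] for _ in range(n_transcripts)]
    for _ in range(iterations):
        # Resample the sample axis: random train/test split.
        shuffled = rng.sample(data, len(data))
        train, test = shuffled[:n_train], shuffled[n_train:]
        if len({y for _, y in train}) < 2:
            continue  # cannot fit a two-class model on one class
        # Resample the transcript axis: small random subset.
        features = rng.sample(range(n_transcripts), subset_size)
        accuracy = nearest_centroid_accuracy(train, test, features)
        for f in features:
            scores[f].append(accuracy)
    return [mean(s) if s else 0.0 for s in scores]


def make_toy_data(n_samples=30, n_transcripts=20, seed=1):
    """Hypothetical synthetic data: only transcript 0 tracks the label."""
    rng = random.Random(seed)
    data = []
    for i in range(n_samples):
        label = i % 2
        x = [rng.gauss(0, 1) for _ in range(n_transcripts)]
        x[0] += 5.0 * label
        data.append((x, label))
    return data
```

Calling `estimate_predictive_power(make_toy_data(), n_transcripts=20)` returns one score per transcript; sorting transcripts by score gives a rank order in the spirit of the one the abstract describes. On real expression data the subset size, iteration count, and classifier would all need to be chosen and validated separately.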

PubMed ID: 21935387

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


R Project for Statistical Computing (tool)

RRID:SCR_001905

Software environment and programming language for statistical computing and graphics. R is an integrated suite of software facilities for data manipulation, calculation, and graphical display. It can be extended via packages. Some packages are supplied with the R distribution, and more are available through the CRAN family. It compiles and runs on a wide variety of UNIX platforms, Windows, and macOS.


Ingenuity Pathways Knowledge Base (tool)

RRID:SCR_008117

A horizontally and vertically structured database that pulls scientific and medical information and describes it consistently using the Ingenuity Ontology. The Knowledge Base pulls information from journals, public molecular content databases, and textbooks. Data is curated and integrated into the Knowledge Base.


Affymetrix (tool)

RRID:SCR_010231

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 17, 2023. Affymetrix is a partially commercial resource that provides DNA Analysis Arrays, Expression Analysis Arrays, Gene Regulation Analysis, and Microarrays. It also provides reagents and assays, instruments, software, and services for a fee. Information is provided for Rats, Humans, and Mice. Affymetrix is now Applied Biosystems, a brand of DNA microarray products sold by Thermo Fisher Scientific that originated with an American biotechnology research, development, and manufacturing company of the same name.


Primer Express (tool)

RRID:SCR_014326

Software that allows users to manually or automatically design custom primers and probes for gene quantitation and allelic discrimination (SNP) real-time PCR applications. It supports assays based on TaqMan and SYBR Green I dye chemistries.


Affymetrix (tool)

RRID:SCR_007817

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 17, 2023. Affymetrix is a partially commercial resource that provides DNA Analysis Arrays, Expression Analysis Arrays, Gene Regulation Analysis, and Microarrays. It also provides reagents and assays, instruments, software, and services for a fee. Information is provided for Rats, Humans, and Mice. Affymetrix is now Applied Biosystems, a brand of DNA microarray products sold by Thermo Fisher Scientific that originated with an American biotechnology research, development, and manufacturing company of the same name.
