MOTIVATION: RNA-seq is replacing microarrays as the primary tool for gene expression studies. Many RNA-seq studies have used insufficient biological replicates, resulting in low statistical power and inefficient use of sequencing resources. RESULTS: We show the explicit trade-off between more biological replicates and deeper sequencing in increasing power to detect differentially expressed (DE) genes. In the human cell line MCF7, adding more sequencing depth after 10 M reads gives diminishing returns on power to detect DE genes, whereas adding biological replicates improves power significantly regardless of sequencing depth. We also propose a cost-effectiveness metric for guiding the design of large-scale RNA-seq DE studies. Our analysis showed that sequencing less reads and performing more biological replication is an effective strategy to increase power and accuracy in large-scale differential expression RNA-seq studies, and provided new insights into efficient experiment design of RNA-seq studies. AVAILABILITY AND IMPLEMENTATION: The code used in this paper is provided on: http://home.uchicago.edu/∼jiezhou/replication/. The expression data is deposited in the Gene Expression Omnibus under the accession ID GSE51403.
We have not found any resources mentioned in this publication.
SciCrunch is a data sharing and display platform. Anyone can create a custom portal where they can select searchable subsets of hundreds of data sources, brand their web pages and create their community. SciCrunch will push data updates automatically to all portals on a weekly basis. User communities can also add their own data to SciCrunch, however this is not currently a free service.