Highly parallel sequencing technologies can detect small RNAs at unprecedented depth, which suggests their value in systematically identifying microRNAs (miRNAs). However, identifying miRNAs within the large pool of transcripts sequenced in a single deep sequencing run remains a major challenge. Here, we present an algorithm, miRDeep, which uses a probabilistic model of miRNA biogenesis to score the compatibility of the position and frequency of sequenced RNA with the secondary structure of the miRNA precursor. We demonstrate its accuracy and robustness using published Caenorhabditis elegans data and data we generated by deep sequencing human and dog RNAs. In total, miRDeep reports approximately 230 previously unannotated miRNAs, of which four novel C. elegans miRNAs are validated by northern blot analysis.
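The scoring idea can be illustrated with a toy log-odds calculation. This is a minimal sketch under stated assumptions, not miRDeep's actual model: the names `Read`, `Hairpin`, `consistent_fraction` and `log_odds_score`, the 2-nucleotide slack, and the probabilities `p_real` and `p_background` are all hypothetical placeholders chosen for illustration. It assumes reads have already been mapped to a candidate precursor whose mature, loop and star regions were derived from a predicted secondary structure, and it asks whether the read positions and counts look more like Dicer processing products or like background.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) on the precursor


@dataclass(frozen=True)
class Read:
    start: int  # 0-based start position on the precursor
    end: int    # exclusive end position
    count: int  # number of times this sequence was observed


@dataclass(frozen=True)
class Hairpin:
    # Hypothetical annotation derived from a predicted secondary structure.
    mature: Region
    loop: Region
    star: Region


def _inside(read: Read, region: Region, slack: int = 2) -> bool:
    """True if the read lies within the region, allowing a few nt of slack."""
    lo, hi = region
    return read.start >= lo - slack and read.end <= hi + slack


def _counts(reads: Iterable[Read], hairpin: Hairpin) -> Tuple[int, int]:
    """Return (consistent, inconsistent) read counts for this hairpin."""
    regions = (hairpin.mature, hairpin.loop, hairpin.star)
    ok = bad = 0
    for r in reads:
        if any(_inside(r, reg) for reg in regions):
            ok += r.count
        else:
            bad += r.count
    return ok, bad


def consistent_fraction(reads: Iterable[Read], hairpin: Hairpin) -> float:
    """Fraction of sequenced reads that fit the expected processing products."""
    ok, bad = _counts(reads, hairpin)
    total = ok + bad
    return ok / total if total else 0.0


def log_odds_score(reads: Iterable[Read], hairpin: Hairpin,
                   p_real: float = 0.9, p_background: float = 0.3) -> float:
    """Toy log-odds: real precursor versus background hairpin.

    Assumption: each read is independently 'consistent' with probability
    p_real for a real precursor and p_background for a background hairpin.
    Both values are illustrative, not fitted parameters.
    """
    ok, bad = _counts(reads, hairpin)
    return (ok * math.log(p_real / p_background)
            + bad * math.log((1 - p_real) / (1 - p_background)))


if __name__ == "__main__":
    hp = Hairpin(mature=(0, 22), loop=(22, 40), star=(40, 62))
    precise = [Read(0, 22, 100), Read(41, 62, 5), Read(10, 35, 1)]
    scattered = [Read(5, 30, 10), Read(15, 45, 10)]
    print("precise:", log_odds_score(precise, hp))
    print("scattered:", log_odds_score(scattered, hp))
```

In this sketch, a candidate whose reads stack precisely on the mature and star arms receives a positive score, while a candidate covered by reads that straddle region boundaries, as expected for degradation fragments, receives a negative one. A realistic scorer would also need terms for the stability of the folded hairpin and for sequence conservation, which are omitted here.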