MOTIVATION: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. RESULTS: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. AVAILABILITY: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net.
SciCrunch is a data sharing and display platform. Anyone can create a custom portal where they can select searchable subsets of hundreds of data sources, brand their web pages and create their community. SciCrunch will push data updates automatically to all portals on a weekly basis. User communities can also add their own data to scicrunch, however this is not currently a free service.