EVEREST - EVolutionary Ensembles of REcurrent SegmenTs

EVEREST is an automated process for identifying and classifying protein domains. Users can search for specific proteins by protein ID or name, browse protein families, and upload or download protein sequence data. EVEREST combines methods from finite metric spaces, machine learning, and statistical modeling, and achieves state-of-the-art results. The process begins by constructing a database of protein segments that emerge from an all-vs-all pairwise sequence comparison. It then clusters these segments into putative domain families, selects the best putative families using machine learning techniques, and creates a statistical model for each selected family. This procedure is then iterated: the statistical models are used to scan all protein sequences, rebuild the segment database, and cluster the segments again. Performance was evaluated by comparison with Pfam and SCOP.
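
The iterative loop described above can be illustrated with a toy sketch. This is not EVEREST's implementation: every function name and parameter here is hypothetical, and each stage is replaced by a much simpler stand-in. Shared fixed-length substrings stand in for the pairwise sequence comparison, greedy Hamming-distance grouping stands in for the clustering, a minimum cluster size stands in for the machine-learning selection, and a consensus string stands in for the statistical model.

```python
from collections import Counter
from itertools import combinations

K = 4  # hypothetical fixed segment length (toy assumption, not from EVEREST)


def kmers(seq, k=K):
    """All length-k windows of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def initial_segments(seqs):
    """All-vs-all comparison: keep windows shared exactly by some pair of sequences."""
    found = set()  # (sequence index, start offset)
    for (i, a), (j, b) in combinations(enumerate(seqs), 2):
        shared = set(kmers(a)) & set(kmers(b))
        for idx, seq in ((i, a), (j, b)):
            for pos, window in enumerate(kmers(seq)):
                if window in shared:
                    found.add((idx, pos))
    return [seqs[i][p:p + K] for i, p in sorted(found)]


def cluster(segments, max_dist=1):
    """Greedy grouping: join the first cluster whose seed is within max_dist."""
    clusters = []
    for seg in segments:
        for members in clusters:
            if hamming(seg, members[0]) <= max_dist:
                members.append(seg)
                break
        else:
            clusters.append([seg])
    return clusters


def consensus(members):
    """Toy 'model' of a family: the most common residue in each column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*members))


def scan(seqs, models, max_dist=1):
    """Rebuild the segment database: windows within max_dist of any model."""
    found = set()
    for i, seq in enumerate(seqs):
        for pos, window in enumerate(kmers(seq)):
            if any(hamming(window, m) <= max_dist for m in models):
                found.add((i, pos))
    return [seqs[i][p:p + K] for i, p in sorted(found)]


def run(seqs, rounds=2, min_size=2):
    """Segment -> cluster -> select -> model -> rescan, repeated for `rounds`."""
    segments = initial_segments(seqs)
    models = []
    for _ in range(rounds):
        families = [c for c in cluster(segments) if len(c) >= min_size]
        models = [consensus(c) for c in families]
        segments = scan(seqs, models)
    return models, segments


if __name__ == "__main__":
    print(run(["AAAACDEF", "GGCDEFGG", "TTCDEATT"]))
```

With these three made-up sequences, the initial comparison finds only the exact match CDEF in the first two. Rescanning with the consensus model then recruits the near-match CDEA from the third sequence, which shows in miniature why the model-then-rescan step can enlarge a family on each iteration.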

URL: http://www.everest.cs.huji.ac.il

protein domain, protein domain classification, protein domain identification



12:00am September 21, 2010

  Description was changed
  Additional Resource Types was changed

