Publication

Text mining for the biocuration workflow.

Database : the journal of biological databases and curation | 2012

Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed considerable variability. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provides a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

PubMed ID: 22513129

Research resources used in this publication

None found

Additional research tools detected in this publication

WormBase (RRID:SCR_003098)

Antibodies used in this publication

None found

Associated grants

Agency: NLM NIH HHS, United States
Id: 1G08LM10720-01
Agency: NCRR NIH HHS, United States
Id: 1R01RR024031
Agency: NIEHS NIH HHS, United States
Id: R01ES014065-04S1
Agency: NCRR NIH HHS, United States
Id: P20RR016463
Agency: NIGMS NIH HHS, United States
Id: R01 GM083871
Agency: NLM NIH HHS, United States
Id: G08 LM010720
Agency: Biotechnology and Biological Sciences Research Council, United Kingdom
Id: BB/F010486/1
Agency: NIGMS NIH HHS, United States
Id: R01-GM083871
Agency: NHGRI NIH HHS, United States
Id: 2U01HG02712-04
Agency: NHGRI NIH HHS, United States
Id: HG001315
Agency: NIEHS NIH HHS, United States
Id: R01ES014065

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

WormBase (tool)

RRID:SCR_003098

Central data repository for nematode biology, including complete genomic sequence, gene predictions and orthology assignments from a range of related nematodes. Data concerning genetics, genomics and biology of C. elegans and related nematodes. Derived from the initial ACeDB database of C. elegans genetic and sequence information, WormBase includes genomic, anatomical and functional information on C. elegans, other Caenorhabditis species and other nematodes. Maintains a public FTP site where researchers can find many commonly requested files and datasets, WormBase software and prepackaged databases.

About

The SciCrunch Infrastructure was developed as a cooperative data platform to be used by diverse communities in making data more FAIR.

Contact Us

FAIR Data Informatics Lab

University of California, San Diego

9500 Gilman Drive, Mail Code 0608

La Jolla, CA 92093-0608

United States

info

scicrunch.org
