Resource Name
RRID:SCR_008737
Textpresso (RRID:SCR_008737)
Resource Information

URL: http://www.textpresso.org/

Proper Citation: Textpresso (RRID:SCR_008737)

Description: An information extraction and processing package for biological literature that can be used online or installed locally from a downloadable software package (http://www.textpresso.org/downloads.html). Textpresso has two major elements: (1) access to full text, so that entire articles can be searched, and (2) categories of biological concepts and classes that relate two objects (e.g., association, regulation) or describe one (e.g., methods). A search engine lets the user search for one or a combination of these categories and/or keywords within an entire literature. The Textpresso project serves the biological and biomedical research community by providing:
* Full text literature searches of model organism research and subject-specific articles at individual sites. The search engines are flexible: users can query the entire literature using keywords, one or more categories, or a combination of keywords and categories. Categories either relate two objects (e.g., association, regulation) or identify one (e.g., cell, gene, allele).
* Text classification and mining of biomedical literature for database curation. These tools help database curators identify and extract biological entities and facts from the full text of research articles. Examples of entity identification and extraction include new allele and gene names and human disease gene orthologs; examples of fact identification and extraction include sentence retrieval for curating gene-gene regulation, Gene Ontology (GO) cellular component annotations, and GO molecular function annotations. The tools also classify papers according to curation needs. They employ a variety of methods, such as hidden Markov models, support vector machines, conditional random fields, and pattern matching. Collaborators include WormBase, FlyBase, SGD, TAIR, dictyBase, and the Neuroscience Information Framework, and the project looks forward to collaborating with more model organism databases and projects.
* Linking biological entities in PDF and online journal articles to online databases. The project has established a journal article markup pipeline that links select content of Genetics journal articles to model organism databases such as WormBase and SGD. The pipeline links over nine classes of objects, including genes, proteins, alleles, phenotypes, and anatomical terms, to the appropriate page at each database. The first article published with online and PDF-embedded hyperlinks to WormBase appeared in the September 2009 issue of Genetics. As of January 2011, around 70 articles had been processed, with processing to continue indefinitely. Extension of this pipeline to other journals and model organism databases is planned.
Textpresso is useful both as a search engine for researchers and as a curation tool. It was developed as part of WormBase and is used extensively by C. elegans curators. Textpresso is currently implemented for 24 different literatures, among them Neuroscience, and can readily be extended to other corpora of text.

Abbreviations: Textpresso

Synonyms: Text presso, Textpresso - literature search engine

Resource Type: text-mining software, database, data or information resource, software application, software resource

Defining Citation: PMID:18949581, PMID:15383839

Keywords: literature, extract, process, bibliographic resource, database application, linux, macos, pdf, perl, posix/unix-like, sh, bash, unix shell, web service, search engine, curation tool, dicty, neuroscience, regulon db, ecoliwiki, ecocyc, curation, text-mining

Ratings and Alerts

No rating or validation information has been found for Textpresso.

No alerts have been found for Textpresso.

Data and Source Information