Functional gene clustering via gene annotation sentences, MeSH and GO keywords from biomedical literature.

Bioinformation | 2007

Gene function annotation remains a key challenge in modern biology. This is especially true for high-throughput techniques such as gene expression experiments. Vital information about genes is available electronically from biomedical literature in the form of full texts and abstracts. In addition, various publicly available databases (such as GenBank, Gene Ontology and Entrez) provide access to gene-related information at different levels of biological organization, granularity and data format. This information is being used to assess and interpret the results from high-throughput experiments. To improve keyword extraction for annotational clustering and other types of analyses, we have developed a novel text mining approach, which is based on keywords identified at the level of gene annotation sentences (in particular sentences characterizing biological function) instead of entire abstracts. Further, to improve the expressiveness and usefulness of gene annotation terms, we investigated the combination of sentence-level keywords with terms from the Medical Subject Headings (MeSH) and Gene Ontology (GO) resources. We find that sentence-level keywords combined with MeSH terms outperform the typical 'baseline' set-up (term frequencies at the level of abstracts) by a significant margin, whereas the addition of GO terms improves matters only marginally. We validated our approach using a manually annotated corpus of 200 abstracts generated from 2 cancer categories and 10 genes per category. We applied the method to three sets of differentially expressed genes obtained from pediatric brain tumor samples. This analysis suggests novel interpretations of discovered gene expression patterns.
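The contrast the abstract draws between abstract-level term frequencies and sentence-level keywords can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the stopword list, the cue-word heuristic for spotting a "gene annotation sentence", and the example text are all assumptions made for illustration. The sketch only shows the general idea of counting terms from sentences that mention a gene together with a function-related cue word, then merging the result with curated terms such as MeSH headings.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"in", "at", "from", "were", "the", "of", "and", "a", "to", "is"}


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens that are not stopwords."""
    tokens = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def split_sentences(abstract):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", abstract) if s.strip()]


def abstract_level_keywords(abstract, top_n=5):
    """Baseline: term frequencies over the whole abstract."""
    return Counter(tokenize(abstract)).most_common(top_n)


def sentence_level_keywords(abstract, gene, cue_words, top_n=5):
    """Count terms only from sentences that mention the gene AND a cue word.

    The cue-word test is a stand-in (assumption) for whatever classifier
    identifies sentences that characterize biological function.
    """
    gene = gene.lower()
    counts = Counter()
    for sentence in split_sentences(abstract):
        tokens = tokenize(sentence)
        if gene in tokens and cue_words & set(tokens):
            counts.update(t for t in tokens if t != gene)
    return counts.most_common(top_n)


def combine_terms(keywords, curated_terms):
    """Merge extracted keywords with curated terms (e.g. MeSH headings)."""
    extracted = {term for term, _ in keywords}
    return sorted(extracted | {t.lower() for t in curated_terms})


# Invented example text, not taken from the validation corpus.
example = (
    "TP53 regulates apoptosis in glioma cells. "
    "Samples were collected from patients at two hospitals. "
    "Patients gave informed consent."
)
cues = {"regulates", "inhibits", "activates"}

baseline = abstract_level_keywords(example, top_n=1)
focused = sentence_level_keywords(example, "TP53", cues)
merged = combine_terms(focused, ["Glioma", "Brain Neoplasms"])
```

In this toy example the baseline is dominated by a term from the methods sentences, while the sentence-level variant keeps only terms from the sentence that describes the gene's function. A real system would need a proper sentence classifier and gene-name normalization, which this sketch leaves out.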

PubMed ID: 18305827

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


HUGO (tool)

RRID:SCR_012800

The Human Genome Organisation (HUGO) is the international organization of scientists involved in human genetics. HUGO was conceived in 1988, at the first meeting on genome mapping and sequencing at Cold Spring Harbor. From an initial membership of 42 scientists from 17 countries, HUGO has grown over two decades to more than 1,200 members, both established and aspiring scientists, from 69 countries. Over the years, HUGO has played an essential role behind the scenes of the human genome project. With its mission to promote international collaborative effort to study the human genome and the myriad issues raised by knowledge of the genome, HUGO has had noteworthy successes in some of the less glamorous, but nonetheless vital, aspects of the human genome project. As a truly international organization, HUGO is entering its 20th year by shifting its direction toward seeking the biological meaning of the genome's information content. To this end, HUGO is focusing on the medical implications of genomic knowledge. Moving forward, HUGO is also working to enhance genomic capabilities in the emerging countries of the world. The excitement and interest in genomic sciences in Asia, the Middle East, South America and Africa are palpable, and the hope is that these technologies will help in national development and health.
