FDI Lab - SciCrunch.org | Searching in Literature

Prediction of drug gene associations via ontological profile similarity with application to drug repositioning.

Maria Kissa et al.
Methods (San Diego, Calif.)
2015

The amount of biomedical literature has been increasing rapidly during the last decade. Text mining techniques can harness this large-scale data, shed light on complex drug mechanisms, and extract relation information that can support computational polypharmacology. In this work, we introduce a fully corpus-based and unsupervised method that uses MEDLINE-indexed titles and abstracts to infer drug gene associations and assist drug repositioning. The method measures the Pointwise Mutual Information (PMI) between biomedical terms derived from the Gene Ontology and the Medical Subject Headings. Based on the PMI scores, drug and gene profiles are generated, and candidate drug gene associations are inferred by computing the relatedness of their profiles. Results show that an Area Under the Curve (AUC) of up to 0.88 can be achieved. The method can successfully identify direct drug gene associations with high precision and prioritize them. Validation shows that the profiles statistically derived from literature perform as well as manually curated profiles. In addition, we examine the potential application of our approach to drug repositioning. For all FDA-approved drugs repositioned over the last 5 years, we generate profiles from publications before 2009 and show that new indications rank high in the profiles. In summary, literature-mined profiles can accurately predict drug gene associations and provide insights into potential repositioning cases.
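
The profile idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the entity and term names, the positive-PMI cutoff, and the use of cosine similarity as the relatedness measure are all assumptions made for illustration. PMI is computed here as log2(p(a,b) / (p(a) p(b))) from document-level co-occurrence.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_table(docs):
    """Return {(a, b): PMI} for every term pair (a < b) that co-occurs in a document."""
    n = len(docs)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in docs:
        term_counts.update(terms)
        pair_counts.update(combinations(sorted(terms), 2))
    return {
        (a, b): math.log2((c / n) / ((term_counts[a] / n) * (term_counts[b] / n)))
        for (a, b), c in pair_counts.items()
    }

def profile(entity, pmi):
    """Profile of an entity: every term with positive PMI, mapped to its score."""
    prof = {}
    for (a, b), score in pmi.items():
        if score > 0:
            if a == entity:
                prof[b] = score
            elif b == entity:
                prof[a] = score
    return prof

def cosine(p, q):
    """Cosine similarity between two sparse profiles (dicts of term -> score)."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

# Hypothetical toy corpus: each document is the set of terms annotated on it.
docs = [
    {"drugA", "apoptosis", "geneX"},
    {"drugA", "apoptosis"},
    {"geneX", "apoptosis"},
    {"drugB", "glycolysis"},
]
pmi = pmi_table(docs)
sim_a = cosine(profile("drugA", pmi), profile("geneX", pmi))
sim_b = cosine(profile("drugB", pmi), profile("geneX", pmi))
print(sim_a, sim_b)
```

In this toy corpus, drugA and geneX share the term "apoptosis" in their profiles, so drugA ranks above drugB as a candidate association for geneX.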

A Maximum-Entropy approach for accurate document annotation in the biomedical domain.

George Tsatsaronis et al.
Journal of biomedical semantics
2012

The growing volume of scientific literature on the Web and the absence of efficient tools for classifying and searching documents are the two most important factors that influence the speed of the search and the quality of the results. Previous studies have shown that the use of ontologies makes it possible to process document and query information at the semantic level, which greatly improves the search for relevant information and is a step towards the Semantic Web. A fundamental step in these approaches is the annotation of documents with ontology concepts, which can also be seen as a classification task. In this paper we address this issue for the biomedical domain and present a new automated and robust method, based on a Maximum Entropy approach, for annotating biomedical literature documents with terms from the Medical Subject Headings (MeSH). The experimental evaluation shows that the suggested Maximum Entropy approach for annotating biomedical documents with MeSH terms is highly accurate, robust to the ambiguity of terms, and can provide very good performance even when a very small number of training documents is used. More precisely, we show that the proposed algorithm obtained an average F-measure of 92.4% (precision 99.41%, recall 86.77%) for the full range of the explored terms (4,078 MeSH terms), and that the algorithm's performance is resilient to terms' ambiguity, achieving an average F-measure of 92.42% (precision 99.32%, recall 86.87%) on the explored MeSH terms that were found to be ambiguous according to the Unified Medical Language System (UMLS) thesaurus. Finally, we compared the results of the suggested methodology with a Naive Bayes and a Decision Trees classification approach, and we show that the Maximum Entropy based approach achieved a higher F-measure on both ambiguous and monosemous MeSH terms.
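
For readers relating the reported precision, recall, and F-measure figures, the sketch below shows the standard F-measure formula (the weighted harmonic mean of precision and recall). The function name and the `beta` parameter are illustrative choices, not taken from the paper.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the balanced harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Applying the formula directly to the averaged precision and recall quoted above.
f_from_averages = f_measure(99.41, 86.77)
print(round(f_from_averages, 2))
```

Applied to the averaged precision and recall, the formula gives about 92.66, slightly above the reported average of 92.4%. One plausible reading, assuming the reported figures are averages over per-term scores, is that the mean of per-term F-measures generally differs from the F-measure of the mean precision and recall.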

CRISPR-Cas9 gRNA efficiency prediction: an overview of predictive tools and the role of deep learning.

Vasileios Konstantakos et al.
Nucleic acids research
2022

The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system has become a successful and promising technology for gene editing. To facilitate its effective application, various computational tools have been developed. These tools can assist researchers in the guide RNA (gRNA) design process by predicting cleavage efficiency and specificity and excluding undesirable targets. However, while many tools are available, assessments of their application scenarios and performance benchmarks are limited. Moreover, new deep learning tools have been explored lately for gRNA efficiency prediction, but have not been systematically evaluated. Here, we discuss the approaches that pertain to the on-target activity problem, focusing mainly on the features and computational methods they utilize. Furthermore, we evaluate these tools on independent datasets and give some suggestions for their usage. We conclude with some challenges and perspectives about future directions for CRISPR-Cas9 guide design.

Knowledge4COVID-19: A semantic-based approach for constructing a COVID-19 related knowledge graph from various sources and analyzing treatments' toxicities.

Ahmad Sakor et al.
Web semantics (Online)
2023

In this paper, we present Knowledge4COVID-19, a framework that aims to showcase the power of integrating disparate sources of knowledge to discover adverse drug effects caused by drug-drug interactions among COVID-19 treatments and pre-existing condition drugs. Initially, we focus on constructing the Knowledge4COVID-19 knowledge graph (KG) from the declarative definition of mapping rules using the RDF Mapping Language. Since valuable information about drug treatments, drug-drug interactions, and side effects is present in textual descriptions in scientific databases (e.g., DrugBank) or in scientific literature (e.g., CORD-19, the Covid-19 Open Research Dataset), the Knowledge4COVID-19 framework implements Natural Language Processing. The framework extracts relevant entities and predicates that enable the fine-grained description of COVID-19 treatments and the potential adverse events that may occur when these treatments are combined with treatments of common comorbidities, e.g., hypertension, diabetes, or asthma. Moreover, on top of the KG, several techniques for the discovery and prediction of interactions and potential adverse effects of drugs have been developed with the aim of suggesting more accurate treatments for the virus. We provide services to traverse the KG and visualize the effects that a group of drugs may have on a treatment outcome. Knowledge4COVID-19 was part of the Pan-European hackathon #EUvsVirus in April 2020 and is publicly available as a resource through a GitHub repository and a DOI.

Formalizing biomedical concepts from textual definitions.

Alina Petrova et al.
Journal of biomedical semantics
2015

Ontologies play a major role in life sciences, enabling a number of applications, from new data integration to knowledge verification. SNOMED CT is a large medical ontology that is formally defined, which ensures global consistency and supports complex reasoning tasks. Most biomedical ontologies and taxonomies, on the other hand, define concepts only textually, without the use of logic. Here, we investigate how to automatically generate formal concept definitions from textual ones. We develop a method that uses machine learning in combination with several types of lexical and semantic features and outputs formal definitions that follow the structure of SNOMED CT concept definitions.

Reference intervals for plasma concentrations of adrenal steroids measured by LC-MS/MS: Impact of gender, age, oral contraceptives, body mass index and blood pressure status.

Graeme Eisenhofer et al.
Clinica chimica acta; international journal of clinical chemistry
2017

Mass spectrometric-based measurements of the steroid metabolome have been introduced to diagnose disorders featuring abnormal steroidogenesis. Defined reference intervals are important for interpreting such data.

CRISPRedict: a CRISPR-Cas9 web tool for interpretable efficiency predictions.

Vasileios Konstantakos et al.
Nucleic acids research
2022

The development of the CRISPR-Cas9 technology has provided a simple yet powerful system for genome editing. Current gRNA design tools serve as an important platform for the efficient application of CRISPR systems. However, most of the existing tools are black-box models that suffer from limitations, such as variable performance and unclear decision-making mechanisms. Here, we introduce CRISPRedict, an interpretable gRNA efficiency prediction model for CRISPR-Cas9 gene editing. Its strength lies in the fact that it can accurately predict efficient guide RNAs, with performance equivalent to state-of-the-art tools, while being a simple linear model. Implemented as a user-friendly web server, CRISPRedict offers (i) quick and accurate predictions across various experimental conditions (e.g. U6/T7 transcription); (ii) regression and classification models for scoring gRNAs and (iii) multiple visualizations to explain the obtained results. Given its performance, interpretability, and versatility, we expect that it will assist researchers in the gRNA design process and facilitate genome editing research. CRISPRedict is available for use at http://www.crispredict.org/.

An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition.

George Tsatsaronis et al.
BMC bioinformatics
2015

This article provides an overview of the first BIOASQ challenge, a competition on large-scale biomedical semantic indexing and question answering (QA), which took place between March and September 2013. BIOASQ assesses the ability of systems to semantically index very large numbers of biomedical scientific articles, and to return concise and user-understandable answers to given natural language questions by combining information from biomedical articles and ontologies.

Discovering relations between indirectly connected biomedical concepts.

Dirk Weissenborn et al.
Journal of biomedical semantics
2015

The complexity and scale of the knowledge in the biomedical domain have motivated research on mining heterogeneous data from both structured and unstructured knowledge bases. To this end, it is necessary to combine facts in order to formulate hypotheses or draw conclusions about the domain concepts. This work addresses this problem by using indirect knowledge connecting two concepts in a knowledge graph to discover hidden relations between them. The graph represents concepts as vertices and relations as edges, stemming from structured (ontologies) and unstructured (textual) data. In this graph, path patterns (i.e. sequences of relations) that potentially characterize a biomedical relation are mined using distant supervision.
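
As a rough illustration of what a path pattern is, the sketch below enumerates the relation sequences along simple paths between two concepts in a small labeled graph. It is a sketch under assumptions, not the method of the paper: the triples, relation names, and the `max_len` cutoff are hypothetical, and the distant-supervision step that decides which patterns characterize a relation is not shown.

```python
from collections import defaultdict

def build_graph(triples):
    """Adjacency list: concept -> list of (relation, neighbouring concept)."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph

def path_patterns(graph, source, target, max_len=3):
    """Collect the relation sequence of every simple path from source to target."""
    patterns = []
    stack = [(source, (), frozenset([source]))]
    while stack:
        node, relations, seen = stack.pop()
        if node == target and relations:
            patterns.append(relations)
            continue
        if len(relations) == max_len:
            continue
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                stack.append((neighbour, relations + (relation,), seen | {neighbour}))
    return patterns

# Hypothetical triples mixing ontology-style and text-derived relations.
triples = [
    ("drugA", "inhibits", "proteinP"),
    ("proteinP", "regulates", "geneG"),
    ("drugA", "mentioned_with", "geneG"),
]
graph = build_graph(triples)
patterns = sorted(path_patterns(graph, "drugA", "geneG"))
print(patterns)
```

In a distant-supervision setting, patterns such as ("inhibits", "regulates") would be counted over concept pairs already known to be related, and the frequent ones treated as evidence for that relation between other, indirectly connected pairs.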

BioASQ-QA: A manually curated corpus for Biomedical Question Answering.

Anastasia Krithara et al.
Scientific data
2023

The BioASQ question answering (QA) benchmark dataset contains questions in English, along with gold standard (reference) answers and related material. The dataset has been designed to reflect real information needs of biomedical experts and is therefore more realistic and challenging than most existing datasets. Furthermore, unlike most previous QA benchmarks that contain only exact answers, the BioASQ-QA dataset also includes ideal answers (in effect summaries), which are particularly useful for research on multi-document summarization. The dataset combines structured and unstructured data. The materials linked with each question comprise documents and snippets, which are useful for Information Retrieval and Passage Retrieval experiments, as well as concepts that are useful in concept-to-text Natural Language Generation. Researchers working on paraphrasing and textual entailment can also measure the degree to which their methods improve the performance of biomedical QA systems. Last but not least, the dataset is continuously extended, as the BioASQ challenge is ongoing and new data are generated.
