Bioactivity descriptors for uncharacterized chemical compounds.

Nature Communications | 2021

Chemical descriptors encode the physicochemical and structural properties of small molecules, and they are at the core of chemoinformatics. The broad release of bioactivity data has prompted enriched representations of compounds, reaching beyond chemical structures and capturing their known biological properties. Unfortunately, bioactivity descriptors are not available for most small molecules, which limits their applicability to a few thousand well characterized compounds. Here we present a collection of deep neural networks able to infer bioactivity signatures for any compound of interest, even when little or no experimental information is available for them. Our signaturizers relate to bioactivities of 25 different types (including target profiles, cellular response and clinical outcomes) and can be used as drop-in replacements for chemical descriptors in day-to-day chemoinformatics tasks. Indeed, we illustrate how inferred bioactivity signatures are useful to navigate the chemical space in a biologically relevant manner, unveiling higher-order organization in natural product collections, and to enrich mostly uncharacterized chemical libraries for activity against the drug-orphan target Snail1. Moreover, we implement a battery of signature-activity relationship (SigAR) models and show a substantial improvement in performance, with respect to chemistry-based classifiers, across a series of biophysics and physiology activity prediction benchmarks.

PubMed ID: 34168145

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


HMDB (tool)

RRID:SCR_007712

Curated collection of human metabolite and human metabolism data which contains records for endogenous metabolites, with each metabolite entry containing detailed chemical, physical, biochemical, concentration, and disease information. This is further supplemented with thousands of NMR and MS spectra collected on purified reference metabolites.

MetaboLights (tool)

RRID:SCR_014663

A cross-species, cross-technique database for metabolomics experiments, data, and derived information. It includes metabolite structures and their reference spectra, their biological roles, locations and concentrations, and experimental data from metabolic experiments.

PubChem (tool)

RRID:SCR_004284

Collection of information about chemical structures and biological properties of small molecules and siRNA reagents hosted by the National Center for Biotechnology Information (NCBI).

MDA-MB-231 (cell line)

RRID:CVCL_0062

MDA-MB-231 is a cancer cell line of human (Homo sapiens) origin.
