FDI Lab - SciCrunch.org | Searching in Literature

FABIA: factor analysis for bicluster acquisition.

Sepp Hochreiter et al.
Bioinformatics (Oxford, England)
2010

Biclustering of transcriptomic data groups genes and samples simultaneously. It is emerging as a standard tool for extracting knowledge from gene expression measurements. We propose a novel generative approach for biclustering called 'FABIA: Factor Analysis for Bicluster Acquisition'. FABIA is based on a multiplicative model, which accounts for linear dependencies between gene expression and conditions, and also captures heavy-tailed distributions as observed in real-world transcriptomic data. The generative framework makes it possible to use well-founded model selection methods and to apply Bayesian techniques.
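As a rough illustration of the multiplicative model mentioned in this abstract, a single bicluster can be sketched as the outer product of a sparse gene vector and a sparse sample vector, plus additive noise. The function name, the uniform placeholder values, and the Gaussian noise are assumptions made for this sketch; it does not reproduce the heavy-tailed priors of the paper or the API of the FABIA software.

```python
import random

def simulate_bicluster(n_genes, n_samples, gene_idx, sample_idx,
                       noise_sd=0.0, seed=0):
    """Return an n_genes x n_samples matrix X = lam * z^T + noise (toy sketch)."""
    rng = random.Random(seed)
    # Sparse vector over genes: nonzero only for genes inside the bicluster.
    lam = [0.0] * n_genes
    for g in gene_idx:
        lam[g] = rng.uniform(1.0, 2.0)
    # Sparse vector over samples: nonzero only for samples inside the bicluster.
    z = [0.0] * n_samples
    for s in sample_idx:
        z[s] = rng.uniform(1.0, 2.0)
    # Outer product of the two vectors plus additive Gaussian noise.
    return [[lam[g] * z[s] + rng.gauss(0.0, noise_sd)
             for s in range(n_samples)]
            for g in range(n_genes)]

# Genes 0 and 2 together with samples 1 and 3 form the bicluster.
X = simulate_bicluster(6, 5, gene_idx=[0, 2], sample_idx=[1, 3])
```

With `noise_sd=0.0`, every entry outside the chosen gene and sample sets is exactly zero, which makes the block structure easy to see.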

A One Pot Synthesis of Novel Bioactive Tri-Substitute-Condensed-Imidazopyridines that Targets Snake Venom Phospholipase A2.

Nirvanappa C Anilkumar et al.
PloS one
2015

Drugs such as necopidem, saripidem, alpidem, zolpidem, and olprinone contain nitrogen-containing bicyclic, condensed-imidazo[1,2-α]pyridines as bioactive scaffolds. In this work, we report a high-yield one pot synthesis of 1-(2-methyl-8-aryl-substituted-imidazo[1,2-α]pyridin-3-yl)ethan-1-one for the first time. Subsequently, we performed in silico mode-of-action analysis and predicted that the synthesized imidazopyridines target Phospholipase A2 (PLA2). In vitro analysis confirmed the predicted target PLA2 for the novel imidazopyridine derivative 1-(2-Methyl-8-naphthalen-1-yl-imidazo[1,2-α]pyridine-3-yl)-ethanone (compound 3f), which showed significant inhibitory activity towards snake venom PLA2 with an IC50 value of 14.3 μM. The molecular docking analysis suggested that the imidazopyridine compound was able to bind to the active site of PLA2 with strong affinity, with affinity values comparable to those of nimesulide. Furthermore, we estimated the potential for oral bioavailability by Lipinski's Rule of Five. Hence, it is concluded that compound 3f could be a lead molecule against snake venom PLA2.
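The Rule of Five check mentioned in this abstract can be sketched as a simple threshold test on four descriptors. The sketch below assumes the descriptors have already been computed elsewhere; the function names are hypothetical, and the example values are made up for illustration and are not the properties of compound 3f.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """List which of Lipinski's four criteria a molecule violates."""
    violations = []
    if mol_weight > 500:      # molecular weight above 500 Da
        violations.append("molecular weight")
    if logp > 5:              # octanol-water partition coefficient above 5
        violations.append("logP")
    if h_donors > 5:          # more than 5 hydrogen-bond donors
        violations.append("H-bond donors")
    if h_acceptors > 10:      # more than 10 hydrogen-bond acceptors
        violations.append("H-bond acceptors")
    return violations

def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors):
    """A molecule is commonly accepted with at most one violation."""
    return len(lipinski_violations(mol_weight, logp, h_donors, h_acceptors)) <= 1

# Hypothetical descriptor values, for illustration only.
ok = passes_rule_of_five(mol_weight=300.4, logp=4.2, h_donors=0, h_acceptors=3)
```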

Prediction of the potency of mammalian cyclooxygenase inhibitors with ensemble proteochemometric modeling.

Isidro Cortes-Ciriano et al.
Journal of cheminformatics
2015

Cyclooxygenases (COX) are present in the body in two isoforms, namely: COX-1, constitutively expressed, and COX-2, induced in physiopathological conditions such as cancer or chronic inflammation. The inhibition of COX with non-steroidal anti-inflammatory drugs (NSAIDs) is the most widely used treatment for chronic inflammation despite the adverse effects associated with prolonged NSAID intake. Although selective COX-2 inhibition has been shown not to palliate all adverse effects (e.g. cardiotoxicity), there are still niche populations which can benefit from selective COX-2 inhibition. Thus, capitalizing on bioactivity data from both isoforms simultaneously would contribute to developing COX inhibitors with better safety profiles. We applied ensemble proteochemometric modeling (PCM) for the prediction of the potency of 3,228 distinct COX inhibitors on 11 mammalian cyclooxygenases. Ensemble PCM models ([Formula: see text], and RMSEtest = 0.71) outperformed models exclusively trained on compound ([Formula: see text], and RMSEtest = 1.09) or protein descriptors ([Formula: see text] and RMSEtest = 1.10) on the test set. Moreover, PCM predicted COX potency for 1,086 selective and non-selective COX inhibitors with [Formula: see text] and RMSEtest = 0.76. These values are in agreement with the maximum and minimum achievable [Formula: see text] and RMSEtest values of approximately 0.68 for both metrics. Confidence intervals for individual predictions were calculated from the standard deviation of the predictions from the individual models composing the ensembles. Finally, two substructure analysis pipelines singled out chemical substructures implicated in both potency and selectivity in agreement with the literature. Graphical Abstract: Prediction of uncorrelated bioactivity profiles for mammalian COX inhibitors with Ensemble Proteochemometric Modeling.
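The idea of deriving a confidence interval from the spread of the ensemble members can be sketched as follows. The abstract only states that the standard deviation of the individual model predictions was used; the sample standard deviation, the multiplier `k`, and the function name are assumptions made for this sketch.

```python
from statistics import mean, stdev

def ensemble_interval(predictions, k=1.96):
    """Return (center, lower, upper) from the predictions of individual models.

    `k` scales the standard deviation into an interval half-width; the value
    1.96 is an assumption here, not a figure taken from the paper.
    """
    center = mean(predictions)
    spread = stdev(predictions)  # sample standard deviation across models
    return center, center - k * spread, center + k * spread

# Toy potency predictions for one compound from three ensemble members.
center, lower, upper = ensemble_interval([6.0, 6.5, 7.0])
```

A wide interval signals that the ensemble members disagree on that compound, so the prediction deserves less trust.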

HapFABIA: identification of very short segments of identity by descent characterized by rare variants in large sequencing data.

Sepp Hochreiter
Nucleic acids research
2013

Identity by descent (IBD) can be reliably detected for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies biclustering to identify very short IBD segments characterized by rare variants. HapFABIA is designed to detect short IBD segments in genotype data that were obtained from next-generation sequencing, but can also be applied to DNA microarray data. Especially in next-generation sequencing data, HapFABIA exploits rare variants for IBD detection. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. HapFABIA identified 160 588 different short IBD segments characterized by rare variants with a median length of 23 kb (mean 24 kb) in data for chromosome 1 of the 1000 Genomes Project. These short IBD segments contain 752 000 single nucleotide variants (SNVs), which account for 39% of the rare variants and 23.5% of all variants. The vast majority (152 000 IBD segments) are shared by Africans, while only 19 000 and 11 000 are shared by Europeans and Asians, respectively.
IBD segments that match the Denisova or the Neandertal genome are found significantly more often in Asians and Europeans but also, in some cases exclusively, in Africans. The lengths of IBD segments and their sharing between continental populations indicate that many short IBD segments from chromosome 1 existed before humans migrated out of Africa. Thus, rare variants that tag these short IBD segments predate human migration from Africa. The software package HapFABIA is available from Bioconductor. All data sets, result files and programs for data simulation, preprocessing and evaluation are supplied at http://www.bioinf.jku.at/research/short-IBD.

Genome-wide chromatin remodeling identified at GC-rich long nucleosome-free regions.

Karin Schwarzbauer et al.
PloS one
2012

To gain deeper insights into principles of cell biology, it is essential to understand how cells reorganize their genomes by chromatin remodeling. We analyzed chromatin remodeling on next generation sequencing data from resting and activated T cells to determine a whole-genome chromatin remodeling landscape. We consider chromatin remodeling in terms of nucleosome repositioning which can be observed most robustly in long nucleosome-free regions (LNFRs) that are occupied by nucleosomes in another cell state. We found that LNFR sequences are either AT-rich or GC-rich, where nucleosome repositioning was observed much more prominently in GC-rich LNFRs, a considerable proportion of them outside promoter regions. Using support vector machines with string kernels, we identified a GC-rich DNA sequence pattern indicating loci of nucleosome repositioning in resting T cells. This pattern also appears to be typical for CpG islands. We found that nucleosome repositioning in GC-rich LNFRs is indeed associated with CpG islands and with binding sites of the CpG-island-binding ZF-CXXC proteins KDM2A and CFP1. That this association occurs prominently inside and also prominently outside of promoter regions hints at a mechanism governing nucleosome repositioning that acts on a whole-genome scale.

Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets.

Gerard Jp van Westen et al.
Journal of cheminformatics
2013

While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability to establish bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, and a novel protein descriptor set (termed ProtFP (4 variants)); in addition, we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants.

Are phylogenetic trees suitable for chemogenomics analyses of bioactivity data sets: the importance of shared active compounds and choosing a suitable data embedding method, as exemplified on Kinases.

Shardul Paricharak et al.
Journal of cheminformatics
2013

'Phylogenetic trees' are commonly used for the analysis of chemogenomics datasets and to relate protein targets to each other, based on the (shared) bioactivities of their ligands. However, no real assessment as to the suitability of this representation has been performed yet in this area. We aimed to address this shortcoming in the current work, as exemplified by a kinase data set, given the importance of kinases in many diseases as well as the availability of large-scale datasets for analysis. In this work, we analyzed a dataset comprising 157 compounds, which have been tested at concentrations of 1 μM and 10 μM against a panel of 225 human protein kinases in full-matrix experiments, aiming to explain kinase promiscuity and selectivity against inhibitors. Compounds were described by chemical features, which were used to represent kinases (i.e. each kinase had an active set of features and an inactive set).

Computational prediction of metabolism: sites, products, SAR, P450 enzyme dynamics, and mechanisms.

Johannes Kirchmair et al.
Journal of chemical information and modeling
2012

Metabolism of xenobiotics remains a central challenge for the discovery and development of drugs, cosmetics, nutritional supplements, and agrochemicals. Metabolic transformations are frequently related to the incidence of toxic effects that may result from the emergence of reactive species, the systemic accumulation of metabolites, or by induction of metabolic pathways. Experimental investigation of the metabolism of small organic molecules is particularly resource demanding; hence, computational methods are of considerable interest to complement experimental approaches. This review provides a broad overview of structure- and ligand-based computational methods for the prediction of xenobiotic metabolism. Current computational approaches to address xenobiotic metabolism are discussed from three major perspectives: (i) prediction of sites of metabolism (SOMs), (ii) elucidation of potential metabolites and their chemical structures, and (iii) prediction of direct and indirect effects of xenobiotics on metabolizing enzymes, where the focus is on the cytochrome P450 (CYP) superfamily of enzymes, the cardinal xenobiotics metabolizing enzymes. For each of these domains, a variety of approaches and their applications are systematically reviewed, including expert systems, data mining approaches, quantitative structure-activity relationships (QSARs), and machine learning-based methods, pharmacophore-based algorithms, shape-focused techniques, molecular interaction fields (MIFs), reactivity-focused techniques, protein-ligand docking, molecular dynamics (MD) simulations, and combinations of methods. Predictive metabolism is a developing area, and there is still enormous potential for improvement. 
However, it is clear that the combination of rapidly increasing amounts of available ligand- and structure-related experimental data (in particular, quantitative data) with novel and diverse simulation and modeling approaches is accelerating the development of effective tools for prediction of in vivo metabolism, which is reflected by the diverse and comprehensive data sources and methods for metabolism prediction reviewed here. This review attempts to survey the range and scope of computational methods applied to metabolism prediction and also to compare and contrast their applicability and performance.

Integrating high-content screening and ligand-target prediction to identify mechanism of action.

Daniel W Young et al.
Nature chemical biology
2008

High-content screening is transforming drug discovery by enabling simultaneous measurement of multiple features of cellular phenotype that are relevant to therapeutic and toxic activities of compounds. High-content screening studies typically generate immense datasets of image-based phenotypic information, and how best to mine relevant phenotypic data is an unsolved challenge. Here, we introduce factor analysis as a data-driven tool for defining cell phenotypes and profiling compound activities. This method allows a large data reduction while retaining relevant information, and the data-derived factors used to quantify phenotype have discernable biological meaning. We used factor analysis of cells stained with fluorescent markers of cell cycle state to profile a compound library and cluster the hits into seven phenotypic categories. We then compared phenotypic profiles, chemical similarity and predicted protein binding activities of active compounds. By integrating these different descriptors of measured and potential biological activity, we can effectively draw mechanism-of-action inferences.

KekuleScope: prediction of cancer cell line sensitivity and compound potency using convolutional neural networks trained on compound images.

Isidro Cortés-Ciriano et al.
Journal of cheminformatics
2019

The application of convolutional neural networks (ConvNets) to harness high-content screening images or 2D compound representations is gaining increasing attention in drug discovery. However, existing applications often require large data sets for training, or sophisticated pretraining schemes. Here, we show using 33 IC50 data sets from ChEMBL 23 that the in vitro activity of compounds on cancer cell lines and protein targets can be accurately predicted on a continuous scale from their Kekulé structure representations alone by extending existing architectures (AlexNet, DenseNet-201, ResNet152 and VGG-19), which were pretrained on unrelated image data sets. We show that the predictive power of the generated models, which just require standard 2D compound representations as input, is comparable to that of Random Forest (RF) models and fully-connected Deep Neural Networks trained on circular (Morgan) fingerprints. Notably, including additional fully-connected layers further increases the predictive power of the ConvNets by up to 10%. Analysis of the predictions generated by RF models and ConvNets shows that simply averaging the output of the RF models and ConvNets yields significantly lower errors in prediction for multiple data sets than those obtained with either model alone, although the effect size is small. This indicates that the features extracted by the convolutional layers of the ConvNets provide complementary predictive signal to Morgan fingerprints. Lastly, we show that multi-task ConvNets trained on compound images make it possible to model COX isoform selectivity on a continuous scale with errors in prediction comparable to the uncertainty of the data. Overall, in this work we present a set of ConvNet architectures for the prediction of compound activity from their Kekulé structure representations with state-of-the-art performance, that require no generation of compound descriptors or use of sophisticated image processing techniques.
The code needed to reproduce the results presented in this study and all the data sets are provided at https://github.com/isidroc/kekulescope.
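The averaging of RF and ConvNet outputs described in this abstract can be sketched with two lists of predictions and a root-mean-square error. All numbers below are toy values chosen so that the two models err in opposite directions; they are not results from the study, and the function names are hypothetical.

```python
from math import sqrt

def rmse(predicted, observed):
    """Root-mean-square error between two equally long lists."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                / len(observed))

def average_models(preds_a, preds_b):
    """Element-wise mean of the predictions of two models."""
    return [(a + b) / 2.0 for a, b in zip(preds_a, preds_b)]

observed = [5.0, 6.0, 7.0]      # toy measured activities
rf_preds = [5.4, 6.4, 7.4]      # toy model that over-predicts
cnn_preds = [4.8, 5.6, 6.8]     # toy model that under-predicts
combined = average_models(rf_preds, cnn_preds)

errors = {
    "rf": rmse(rf_preds, observed),
    "convnet": rmse(cnn_preds, observed),
    "average": rmse(combined, observed),
}
```

Averaging helps here only because the toy errors partly cancel; when two models make the same mistakes, the average gains nothing.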

DeepSynergy: predicting anti-cancer drug synergy with Deep Learning.

Kristina Preuer et al.
Bioinformatics (Oxford, England)
2018

While drug combination therapies are a well-established concept in cancer treatment, identifying novel synergistic combinations is challenging due to the size of combinatorial space. However, computational approaches have emerged as a time- and cost-efficient way to prioritize combinations to test, based on recently available large-scale combination screening data. Recently, Deep Learning has had an impact in many research areas by achieving new state-of-the-art model performance. However, Deep Learning has not yet been applied to drug synergy prediction, which is the approach we present here, termed DeepSynergy. DeepSynergy uses chemical and genomic information as input, a normalization strategy to account for input data heterogeneity, and conical layers to model drug synergies.

Fast, Quantitative and Variant Enabled Mapping of Peptides to Genomes.

Christoph N Schlaffner et al.
Cell systems
2017

Current tools for visualization and integration of proteomics with other omics datasets are inadequate for large-scale studies and capture only basic sequence identity information. Furthermore, the frequent reformatting of annotations for reference genomes required by these tools is known to be highly error prone. We developed PoGo for mapping peptides identified through mass spectrometry to overcome these limitations. PoGo reduced runtime and memory usage by 85% and 20%, respectively, and exhibited overall superior performance over other tools on benchmarking with large-scale human tissue and cancer phosphoproteome datasets comprising ∼3 million peptides. In addition, extended functionality enables representation of single-nucleotide variants, post-translational modifications, and quantitative features. PoGo has been integrated in established frameworks such as the PRIDE tool suite and OpenMS, and is also available as a standalone tool with a user-friendly graphical interface. With the rapid increase of quantitative high-resolution datasets capturing proteomes and global modifications to complement orthogonal genomics platforms, PoGo provides a central utility enabling large-scale visualization and interpretation of transomics datasets.

Rectified factor networks for biclustering of omics data.

Djork-Arné Clevert et al.
Bioinformatics (Oxford, England)
2017

Biclustering has become a major tool for analyzing large datasets given as a matrix of samples times features and has been successfully applied in life sciences and e-commerce for drug design and recommender systems, respectively. Factor Analysis for Bicluster Acquisition (FABIA), one of the most successful biclustering methods, is a generative model that represents each bicluster by two sparse membership vectors: one for the samples and one for the features. However, FABIA is restricted to about 20 code units because of the high computational complexity of computing the posterior. Furthermore, code units are sometimes insufficiently decorrelated and sample membership is difficult to determine. We propose to use the recently introduced unsupervised Deep Learning approach Rectified Factor Networks (RFNs) to overcome the drawbacks of existing biclustering methods. RFNs efficiently construct very sparse, non-linear, high-dimensional representations of the input via their posterior means. RFN learning is a generalized alternating minimization algorithm based on the posterior regularization method which enforces non-negative and normalized posterior means. Each code unit represents a bicluster, where samples for which the code unit is active belong to the bicluster and features that have activating weights to the code unit belong to the bicluster.

Novel Adamantanyl-Based Thiadiazolyl Pyrazoles Targeting EGFR in Triple-Negative Breast Cancer.

Anusha Sebastian et al.
ACS omega
2016

The epidermal growth factor receptor (EGFR) is a validated therapeutic target for triple-negative breast cancer (TNBC). In the present study, we synthesize novel adamantanyl-based thiadiazolyl pyrazoles by introducing the adamantane ring to thiazolopyrazoline. On the basis of loss of cell viability in TNBC cells, 4-(adamantan-1-yl)-2-(3-(2,4-dichlorophenyl)-5-phenyl-4,5-dihydro-1H-pyrazol-1-yl)thiazole (APP) was identified as a lead compound. Using a Parzen-Rosenblatt Window classifier, APP was predicted to target the EGFR protein, and the same was confirmed by surface plasmon resonance. Further analysis revealed that APP suppressed the phosphorylation of EGFR at Y992, Y1045, Y1068, Y1086, Y1148, and Y1173 in TNBC cells. APP also inhibited the phosphorylation of ERK at Y204 and of STAT3 at Y705, implying that APP downregulates the activity of EGFR downstream effectors. Small interfering RNA mediated depletion of EGFR expression prevented the effect of APP in BT549 and MDA-MB-231 cells, indicating that APP specifically targets the EGFR. Furthermore, APP modulated the expression of the proteins involved in cell proliferation and survival. In addition, APP altered the expression of epithelial-mesenchymal transition related proteins and suppressed the invasion of TNBC cells. Hence, we report a novel and specific inhibitor of the EGFR signaling cascade.

Discovery of a non-toxic [1,2,4]triazolo[1,5-a]pyrimidin-7-one (WS-10) that modulates ABCB1-mediated multidrug resistance (MDR).

Liming Chang et al.
Bioorganic & medicinal chemistry
2018

Multidrug resistance (MDR) has been shown to reduce the effectiveness of chemotherapy. Strategies for overcoming MDR have been widely explored in the last decades, leading to the generation of numerous small molecules targeting ABC and MRP transporters. Among the ABC family, ABCB1 plays key roles in the development of drug resistance and is the most well studied. In this work, we report the discovery of non-toxic [1,2,4]triazolo[1,5-a]pyrimidin-7-one (WS-10) from our structurally diverse in-house compound collection that selectively modulates ABCB1-mediated multidrug resistance. WS-10 enhanced the intracellular accumulation of paclitaxel in SW620/Ad300 cells, but did not affect the expression of ABCB1 protein or ABCB1 localization. The cellular thermal shift assay (CETSA) showed that WS-10 was able to bind to ABCB1, which could be responsible for the reversal effect of WS-10 toward paclitaxel and doxorubicin in SW620/Ad300 cells. Docking simulations were performed to show the possible binding modes of WS-10 within the ABCB1 transporter. To conclude, WS-10 could be used as a template for designing new ABCB1 modulators to overcome ABCB1-mediated multidrug resistance.

Combination of Ginsenosides Rb2 and Rg3 Promotes Angiogenic Phenotype of Human Endothelial Cells via PI3K/Akt and MAPK/ERK Pathways.

Ran Joo Choi et al.
Frontiers in pharmacology
2021

Shexiang Baoxin Pill (SBP) is an oral formulation of Chinese materia medica for the treatment of angina pectoris. It displays pleiotropic roles in protecting the cardiovascular system. However, the mode of action of SBP in promoting angiogenesis, and in particular the synergy between its constituents, is currently not fully understood. The combination of ginsenosides Rb2 and Rg3 was studied in human umbilical vein endothelial cells (HUVECs) for its proangiogenic effects. To understand the mode of action of the combination in more mechanistic detail, RNA-Seq analysis was conducted, and differentially expressed genes (DEGs), pathway analysis and Weighted Gene Correlation Network Analysis (WGCNA) were applied to further identify important genes that play a pivotal role in the combination treatment. The effects of pathway-specific inhibitors were observed to provide further support for the hypothesized mode of action of the combination. Ginsenosides Rb2 and Rg3 synergistically promoted HUVEC proliferation and tube formation under defined culture conditions. Also, the combination of Rb2/Rg3 rescued cells from homocysteine-induced damage. mRNA expression of CXCL8, CYR61, FGF16 and FGFRL1 was significantly elevated by the Rb2/Rg3 treatment, and representative signaling pathways induced by these genes were found. The increase of protein levels of phosphorylated-Akt and ERK42/44 by the Rb2/Rg3 combination supports the notion that it promotes endothelial cell proliferation via the PI3K/Akt and MAPK/ERK signaling pathways. The present study provides the hypothesis that SBP, via ginsenosides Rb2 and Rg3, involves the CXCR1/2 CXCL8 (IL8)-mediated PI3K/Akt and MAPK/ERK signaling pathways in achieving its proangiogenic effects.

Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty.

Lewis H Mervin et al.
Journal of cheminformatics
2021

Measurements of protein-ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have such unavoidable errors influencing its performance, which should ideally be factored into modelling and output predictions, such as the actual standard deviation of experimental measurements (σ) or the associated comparability of activity values between the aggregated heterogeneous activity units (i.e., Ki versus IC50 values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein-ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit in incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, in cases when σ ranged between 0.4 and 0.6 log units and when ideal probability estimates were between 0.4 and 0.6, the PRF outperformed RF with a median absolute error margin of ~17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, the PRF models trained with putative inactives decreased the performance compared to PRF models without putative inactives and this could be because putative inactives were not assigned an experimental pXC50 value, and therefore they were considered inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models in particular for data where class boundaries overlap with the measurement uncertainty, and where a substantial part of the training data is located close to the classification threshold.
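One way to picture how experimental uncertainty can soften a binary activity label is to treat each measurement as normally distributed around its reported value and ask how much of that distribution lies above the activity threshold. The sketch below makes exactly that assumption; the function name and the example values are hypothetical, and it is not the PRF implementation itself.

```python
from math import erf, sqrt

def active_probability(pxc50, threshold, sigma):
    """Probability that the true activity exceeds `threshold`.

    Assumes the measurement error is Gaussian with standard deviation
    `sigma` (in log units) around the reported `pxc50` value.
    """
    z = (pxc50 - threshold) / (sigma * sqrt(2.0))
    return 0.5 * (1.0 + erf(z))

# A compound measured exactly at the threshold is a coin flip ...
p_boundary = active_probability(pxc50=5.0, threshold=5.0, sigma=0.3)
# ... while one measured far above it is active with near certainty.
p_far = active_probability(pxc50=7.0, threshold=5.0, sigma=0.3)
# Close to the threshold with a large sigma, the label stays uncertain.
p_near = active_probability(pxc50=5.2, threshold=5.0, sigma=0.6)
```

Labels of this kind stay near 0.5 for borderline compounds, which is the regime where the abstract reports the largest benefit over a standard RF.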

Prediction and identification of synergistic compound combinations against pancreatic cancer cells.

Yasaman KalantarMotamedi et al.
iScience
2021

Resistance to current therapies is common for pancreatic cancer and hence novel treatment options are urgently needed. In this work, we developed and validated a computational method to select synergistic compound combinations based on transcriptomic profiles from both the disease and compound side, combined with a pathway scoring system, which was then validated prospectively by testing 30 compounds (and their combinations) on PANC-1 cells. Some compounds selected as single agents showed lower GI50 values than the standard of care, gemcitabine. Compounds suggested as combination agents with standard therapy gemcitabine based on the best performing scoring system showed on average 2.82-5.18 times higher synergies compared to compounds that were predicted to be active as single agents. Examples of highly synergistic in vitro validated compound pairs include gemcitabine combined with Entinostat, thioridazine, loperamide, scriptaid and Saracatinib. Hence, the computational approach presented here was able to identify synergistic compound combinations against pancreatic cancer cells.

Deriving time-concordant event cascades from gene expression data: A case study for Drug-Induced Liver Injury (DILI).

Anika Liu et al.
PLoS computational biology
2022

Adverse event pathogenesis is often a complex process which comprises multiple events ranging from the molecular to the phenotypic level. In toxicology, Adverse Outcome Pathways (AOPs) aim to formalize this as temporal sequences of events, in which event relationships should be supported by causal evidence according to the tailored Bradford-Hill criteria. One of the criteria is whether events are consistently observed in a certain temporal order and, in this work, we study this time concordance using the concept of "first activation" as a data-driven means to generate hypotheses on potentially causal mechanisms. As a case study, we analysed liver data from repeat-dose studies in rats from the TG-GATEs database which comprises measurements across eight timepoints, ranging from 3 hours to 4 weeks post-treatment. We identified time-concordant gene expression-derived events preceding adverse histopathology, which serves as a surrogate readout for Drug-Induced Liver Injury (DILI). We find known mechanisms in DILI to be time-concordant, and show further that significance, frequency and log fold change (logFC) of differential expression are metrics which can additionally prioritize events, although they are not necessarily mechanistically relevant. Moreover, we used the temporal order of transcription factor (TF) expression and regulon activity to identify transcriptionally regulated TFs and subsequently combined this with prior knowledge on functional interactions to derive detailed gene-regulatory mechanisms, such as reduced Hnf4a activity leading to decreased expression and activity of Cebpa. At the same time, potentially novel events are also identified, such as Sox13, which is highly significantly time-concordant and shows sustained activation over time.
Overall, we demonstrate how time-resolved transcriptomics can derive and support mechanistic hypotheses by quantifying time concordance and how this can be combined with prior causal knowledge, with the aim of both understanding mechanisms of toxicity, as well as potential applications to the AOP framework. We make our results available in the form of a Shiny app (https://anikaliu.shinyapps.io/dili_cascades), which allows users to query events of interest in more detail.

DOCKSTRING: Easy Molecular Docking Yields Better Benchmarks for Ligand Design.

Miguel García-Ortegón et al.
Journal of chemical information and modeling
2022

The field of machine learning for drug discovery is witnessing an explosion of novel methods. These methods are often benchmarked on simple physicochemical properties such as solubility or general druglikeness, which can be readily computed. However, these properties are poor representatives of objective functions in drug design, mainly because they do not depend on the candidate compound's interaction with the target. By contrast, molecular docking is a widely applied method in drug discovery to estimate binding affinities. However, docking studies require a significant amount of domain knowledge to set up correctly, which hampers adoption. Here, we present dockstring, a bundle for meaningful and robust comparison of ML models using docking scores. dockstring consists of three components: (1) an open-source Python package for straightforward computation of docking scores, (2) an extensive dataset of docking scores and poses of more than 260,000 molecules for 58 medically relevant targets, and (3) a set of pharmaceutically relevant benchmark tasks such as virtual screening or de novo design of selective kinase inhibitors. The Python package implements a robust ligand and target preparation protocol that allows nonexperts to obtain meaningful docking scores. Our dataset is the first to include docking poses, as well as the first of its size that is a full matrix, thus facilitating experiments in multiobjective optimization and transfer learning. Overall, our results indicate that docking scores are a more realistic evaluation objective than simple physicochemical properties, yielding benchmark tasks that are more challenging and more closely related to real problems in drug discovery.
