Our aim was to compare non-linear and linear mathematical model responses for backstroke start performance prediction. Ten swimmers randomly completed eight 15 m backstroke starts with their feet over the wedge: four with the hands on the highest horizontal handgrip and four on the vertical handgrip. Swimmers were videotaped using a dual-media camera set-up, with the starts performed over an instrumented block with four force plates. Artificial neural networks were applied to predict the 5 m start time from kinematic and kinetic variables, and prediction accuracy was assessed with the mean absolute percentage error. Artificial neural networks predicted start time more robustly than the linear model when moving from the training to the validation dataset for the vertical handgrip (3.95 ± 1.67 vs. 5.92 ± 3.27%). Artificial neural networks obtained a smaller mean absolute percentage error than the linear model for the horizontal (0.43 ± 0.19 vs. 0.98 ± 0.19%) and vertical handgrips (0.45 ± 0.19 vs. 1.38 ± 0.30%) using all input data. The best artificial neural network validation revealed a smaller mean absolute error than the linear model for the horizontal (0.007 vs. 0.04 s) and vertical handgrips (0.01 vs. 0.03 s). Artificial neural networks should be used for backstroke 5 m start time prediction, given the quite small performance differences at the elite level.
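The mean absolute percentage error used as the accuracy measure above can be sketched as follows; the start times below are invented for illustration and are not data from the study.

```python
# Hypothetical illustration of the mean absolute percentage error (MAPE)
# used to compare predicted and observed 5 m start times.
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(o - p) / o for o, p in zip(observed, predicted)
    ) / len(observed)

observed_times = [2.61, 2.55, 2.70, 2.48]    # 5 m start times (s), invented
predicted_times = [2.59, 2.57, 2.66, 2.50]   # model predictions, invented

error = mape(observed_times, predicted_times)
```

A MAPE below 1%, as reported for the networks above, corresponds to a few hundredths of a second on a ~2.5 s start time.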
Multiple alternative diagnostic tests for one disease are commonly available to clinicians. It is important to use all informative diagnostic predictors simultaneously to establish a new predictor with higher statistical utility. Under the generalized linear model for binary outcomes, the linear combination of multiple predictors in the link function is proven to be optimal in the sense that the area under the receiver operating characteristic (ROC) curve of this combination is the largest among all possible linear combinations. The result was applied to analysis of data from the Study of Osteoporotic Fractures (SOF), with comparison to Su and Liu's approach.
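The AUC criterion being maximized can be computed directly from case and control scores as the probability that a random case outscores a random control. The sketch below uses invented scores and equal 0.5/0.5 weights, not the coefficients derived in the paper.

```python
# Rank-based AUC of a linear combination of two diagnostic predictors.
def auc(case_scores, control_scores):
    """P(case score > control score), with ties counting 0.5."""
    pairs = [(c, d) for c in case_scores for d in control_scores]
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0 for c, d in pairs)
    return wins / len(pairs)

# Two predictors per subject (invented values); combine them linearly.
cases = [(1.2, 0.8), (0.9, 1.1), (1.5, 0.7)]
controls = [(0.4, 0.6), (0.8, 0.3), (0.2, 0.9)]
combine = lambda x: 0.5 * x[0] + 0.5 * x[1]
combined_auc = auc([combine(x) for x in cases],
                   [combine(x) for x in controls])
```

The paper's result says that, under the generalized linear model, the combination weights in the link function maximize this quantity over all linear combinations.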
The dynamics of complex systems generally include high-dimensional, nonstationary, and nonlinear behavior, all of which pose fundamental challenges to quantitative understanding. To address these difficulties, we detail an approach based on local linear models within windows determined adaptively from data. While the dynamics within each window are simple, consisting of exponential decay, growth, and oscillations, the collection of local parameters across all windows provides a principled characterization of the full time series. To explore the resulting model space, we develop a likelihood-based hierarchical clustering, and we examine the eigenvalues of the linear dynamics. We demonstrate our analysis with the Lorenz system undergoing stable spiral dynamics and in the standard chaotic regime. Applied to the posture dynamics of the nematode Caenorhabditis elegans, our approach identifies fine-grained behavioral states and model dynamics which fluctuate about an instability boundary, and we detail a bifurcation in a transition from forward to backward crawling. We analyze whole-brain imaging in C. elegans and show that global brain dynamics is damped away from the instability boundary by a decrease in oxygen concentration. We provide additional evidence for such near-critical dynamics from the analysis of electrocorticography in monkey and the imaging of a neural population from mouse visual cortex at single-cell resolution.
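The core of the approach is a linear fit within each window, whose eigenvalues classify the local dynamics as decay, growth, or oscillation. A minimal scalar sketch is below; the fixed window and synthetic series are illustrative, not the adaptive windowing scheme of the paper.

```python
# Local linear fit within one window: for a scalar series x, fit
# x[t+1] ≈ a * x[t] by least squares. |a| < 1 corresponds to exponential
# decay, |a| > 1 to growth (complex eigenvalues, in the multivariate
# case, indicate oscillation).
def local_linear_coefficient(window):
    num = sum(window[t] * window[t + 1] for t in range(len(window) - 1))
    den = sum(window[t] ** 2 for t in range(len(window) - 1))
    return num / den

# Synthetic decaying series x[t] = 0.9**t; the recovered coefficient
# should match the true decay factor 0.9.
series = [0.9 ** t for t in range(20)]
a = local_linear_coefficient(series)
```

Collecting such coefficients across all windows yields the model space that the paper clusters and examines for proximity to the instability boundary |a| = 1.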
Multiple hypothesis testing is a major issue in genome-wide association studies (GWAS), which often analyze millions of markers. The permutation test is considered to be the gold standard in multiple testing correction, as it accurately takes into account the correlation structure of the genome. Recently, the linear mixed model (LMM) has become standard practice in GWAS, addressing issues of population structure and insufficient power. However, none of the current multiple testing approaches is applicable to LMMs.
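The permutation-based correction mentioned above controls the family-wise error rate by comparing the observed statistics against the null distribution of the maximum statistic across markers. A toy sketch follows; the data and the simple mean-difference statistic are invented, and a real GWAS would use a regression or LMM-based statistic.

```python
# Toy permutation test for family-wise error control across markers:
# permute phenotypes, record the maximum association statistic over all
# markers, and compare the observed maximum to that null distribution.
import random

def max_stat(phenotype, genotypes):
    stats = []
    for marker in genotypes:
        g1 = [p for p, g in zip(phenotype, marker) if g == 1]
        g0 = [p for p, g in zip(phenotype, marker) if g == 0]
        stats.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return max(stats)

random.seed(0)
phenotype = [0.1, 1.9, 0.3, 2.2, 0.2, 2.1, 0.0, 1.8]
genotypes = [[0, 1, 0, 1, 0, 1, 0, 1],   # strongly associated marker
             [0, 0, 1, 1, 0, 0, 1, 1]]   # unassociated marker
observed = max_stat(phenotype, genotypes)

null = []
for _ in range(1000):
    perm = phenotype[:]
    random.shuffle(perm)
    null.append(max_stat(perm, genotypes))

p_value = sum(s >= observed for s in null) / len(null)
```

Because the maximum is taken over all markers in each permutation, the resulting p-value is automatically adjusted for the correlation structure among markers, which is the property the gold-standard label refers to.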
The aim of the present paper is to analyse the differences between tube-based models, which are widely used for predicting the linear viscoelasticity of monodisperse linear polymers, by comparing them against a large set of experimental data. The following models are examined: Milner-McLeish, Likhtman-McLeish, the Hierarchical model proposed by the group of Larson, the BoB model of Das and Read, and the TMA model proposed by the group of van Ruymbeke. This comparison allows us to highlight and discuss important questions related to the relaxation of entangled polymers, such as the importance of the contour-length fluctuations (CLF) process and how it affects the reptation mechanism, or the contribution of the constraint release (CR) process to the motion of the chains. In particular, it allows us to point out important approximations, inherent in some models, which result in an overestimation of the effect of CLF on the reptation time. Conversely, by validating the TMA model against experimental data, we show that this effect is underestimated in TMA. Therefore, in order to obtain accurate predictions, a novel modification to the TMA model is proposed. Our current work is a continuation of earlier research (Shchetnikava et al., 2014), where a similar analysis was performed on well-defined star polymers.
Linear mixed models (LMMs) and their extensions have recently become the method of choice in phenotype prediction for complex traits. However, LMM use to date has typically been limited by assuming simple genetic architectures. Here, we present multikernel linear mixed model (MKLMM), a predictive modeling framework that extends the standard LMM using multiple-kernel machine learning approaches. MKLMM can model genetic interactions and is particularly suitable for modeling complex local interactions between nearby variants. We additionally present MKLMM-Adapt, which automatically infers interaction types across multiple genomic regions. In an analysis of eight case-control data sets from the Wellcome Trust Case Control Consortium and more than a hundred mouse phenotypes, MKLMM-Adapt consistently outperforms competing methods in phenotype prediction. MKLMM is as computationally efficient as standard LMMs and does not require storage of genotypes, thus achieving state-of-the-art predictive power without compromising computational feasibility or genomic privacy.
Both linear mixed models (LMMs) and sparse regression models are widely used in genetics applications, including, recently, polygenic modeling in genome-wide association studies. These two approaches make very different assumptions, so are expected to perform well in different situations. However, in practice, for a given dataset one typically does not know which assumptions will be more accurate. Motivated by this, we consider a hybrid of the two, which we refer to as a "Bayesian sparse linear mixed model" (BSLMM) that includes both these models as special cases. We address several key computational and statistical issues that arise when applying BSLMM, including appropriate prior specification for the hyper-parameters and a novel Markov chain Monte Carlo algorithm for posterior inference. We apply BSLMM and compare it with other methods for two polygenic modeling applications: estimating the proportion of variance in phenotypes explained (PVE) by available genotypes, and phenotype (or breeding value) prediction. For PVE estimation, we demonstrate that BSLMM combines the advantages of both standard LMMs and sparse regression modeling. For phenotype prediction it considerably outperforms either of the other two methods, as well as several other large-scale regression methods previously suggested for this problem. Software implementing our method is freely available from http://stephenslab.uchicago.edu/software.html.
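The PVE quantity estimated above is the fraction of phenotypic variance attributable to the genetic contribution. The simulation below is a schematic of that definition only; the effect sizes and genotypes are random toys, not BSLMM output.

```python
# Schematic of "proportion of variance explained" (PVE): variance of the
# genetic contribution divided by total phenotypic variance.
import random

random.seed(1)
n, p = 500, 50
effects = [random.gauss(0, 0.1) for _ in range(p)]                 # per-SNP effects
genotypes = [[random.choice([0, 1, 2]) for _ in range(p)]          # allele counts
             for _ in range(n)]
genetic = [sum(b * g for b, g in zip(effects, row)) for row in genotypes]
noise = [random.gauss(0, 0.5) for _ in range(n)]
phenotype = [g + e for g, e in zip(genetic, noise)]

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

pve = var(genetic) / var(phenotype)
```

In real data the genetic values are unobserved, which is why PVE must be estimated from a model such as an LMM, sparse regression, or the BSLMM hybrid.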
The analysis of longitudinal, heterogeneous or unbalanced clustered data is of primary importance to a wide range of applications. The linear mixed model (LMM) is a popular and flexible extension of the linear model specifically designed for such purposes. Historically, a large proportion of material published on the LMM concerns the application of popular numerical optimization algorithms, such as Newton-Raphson, Fisher Scoring and expectation maximization to single-factor LMMs (i.e. LMMs that only contain one "factor" by which observations are grouped). However, in recent years, the focus of the LMM literature has moved towards the development of estimation and inference methods for more complex, multi-factored designs. In this paper, we present and derive new expressions for the extension of an algorithm classically used for single-factor LMM parameter estimation, Fisher Scoring, to multiple, crossed-factor designs. Through simulation and real data examples, we compare five variants of the Fisher Scoring algorithm with one another, as well as against a baseline established by the R package lme4, and find evidence of correctness and strong computational efficiency for four of the five proposed approaches. Additionally, we provide a new method for LMM Satterthwaite degrees of freedom estimation based on analytical results, which does not require iterative gradient estimation. Via simulation, we find that this approach produces estimates with both lower bias and lower variance than the existing methods.
Quantitative trait loci (QTLs) may affect not only the mean of a trait but also its variability. A special case is the variability among multiple measured traits of genotyped animals, such as the within-litter variance of piglet birth weights. The sample variance of the repeated measurements is taken as the observation for every genotyped individual. It is shown that the conditional distribution of this non-normally distributed trait can be approximated by a gamma distribution. To detect QTL effects in the daughter design, a generalized linear model with the identity link function is applied. Suitable test statistics are constructed for the null hypothesis H(0): no QTL with an effect on the within-litter variance is segregating, versus H(A): there is a QTL with an effect on the variability of birth weight within litter. Furthermore, estimators of the QTL effect and the QTL position are introduced and discussed. The efficiency of the presented tests is compared with that of a test based on weighted regression. The type I error probability as well as the power of QTL detection are discussed and compared for the different tests.
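The phenotype construction described above (one sample variance per genotyped individual) can be sketched as follows; the litters and birth weights are hypothetical example values.

```python
# For each genotyped sow, the sample variance of its piglets' birth
# weights becomes the observation used for QTL detection.
def sample_variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

litters = {
    "sow_1": [1.4, 1.6, 1.5, 1.3, 1.7],   # birth weights (kg), invented
    "sow_2": [1.1, 1.9, 1.0, 2.0, 1.5],   # a more variable litter
}
observations = {sow: sample_variance(w) for sow, w in litters.items()}
```

These variance observations are what the paper approximates with a gamma distribution before fitting the generalized linear model.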
Constraints arise naturally in many scientific experiments and studies, such as in epidemiology, biology, and toxicology, yet researchers often ignore such information when analyzing their data and use standard methods such as the analysis of variance (ANOVA). Such methods may not only result in a loss of power and of efficiency in the costs of experimentation, but may also result in poor interpretation of the data. In this paper we discuss constrained statistical inference in the context of linear mixed effects models, which arise naturally in many applications such as repeated measurements designs, familial studies and others. We introduce a novel methodology that is broadly applicable to a variety of constraints on the parameters. Since in many applications sample sizes are small, the data are not necessarily normally distributed, and the error variances need not be homoscedastic (i.e. there is heterogeneity in the data), we use an empirical best linear unbiased predictor (EBLUP)-type residual-based bootstrap methodology for deriving critical values of the proposed test. Our simulation studies suggest that the proposed procedure maintains the desired nominal Type I error while competing well with other tests in terms of power. We illustrate the proposed methodology by re-analyzing clinical trial data on blood mercury levels. The methodology introduced in this paper can easily be extended to other settings such as nonlinear and generalized regression models.
We hypothesized that generalized linear mixed models (GLMMs), which estimate the additive genetic variance underlying phenotype variability, would facilitate rapid characterization of clinical phenotypes from an electronic health record. We evaluated 1,288 phenotypes in 29,349 subjects of European ancestry with single-nucleotide polymorphism (SNP) genotyping on the Illumina Exome Beadchip. We show that genetic liability estimates are primarily driven by SNPs identified by prior genome-wide association studies and SNPs within the human leukocyte antigen (HLA) region. We identify 44 (false discovery rate q<0.05) phenotypes associated with HLA SNP variation and show that hypothyroidism is genetically correlated with Type I diabetes (rG=0.31, s.e. 0.12, P=0.003). We also report novel SNP associations for hypothyroidism near HLA-DQA1/HLA-DQB1 at rs6906021 (combined odds ratio (OR)=1.2 (95% confidence interval (CI): 1.1-1.2), P=9.8 × 10(-11)) and for polymyalgia rheumatica near C6orf10 at rs6910071 (OR=1.5 (95% CI: 1.3-1.6), P=1.3 × 10(-10)). Phenome-wide application of GLMMs identifies phenotypes with important genetic drivers, and focusing on these phenotypes can identify novel genetic associations.
Extracellular protein concentrations and gradients initiate a wide range of cellular responses, such as cell motility, growth, proliferation and death. Understanding inter-cellular communication requires spatio-temporal knowledge of these secreted factors and their causal relationship with cell phenotype. Techniques which can detect cellular secretions in real time are becoming more common but generalizable data analysis methodologies which can quantify concentration from these measurements are still lacking. Here we introduce a probabilistic approach in which local-linear models and the law of mass action are applied to obtain time-varying secreted concentrations from affinity-based biosensor data. We first highlight the general features of this approach using simulated data which contains both static and time-varying concentration profiles. Next we apply the technique to determine concentration of secreted antibodies from 9E10 hybridoma cells as detected using nanoplasmonic biosensors. A broad range of time-dependent concentrations was observed: from steady-state secretions of 230 pM near the cell surface to large transients which reached as high as 56 nM over several minutes and then dissipated.
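The law-of-mass-action relation underlying the concentration estimates can be integrated forward to show how bound-fraction traces respond to a given concentration. The rate constants and the constant concentration below are illustrative values only, not parameters from the study.

```python
# Mass-action binding kinetics for an affinity biosensor:
#   db/dt = k_on * c * (b_max - b) - k_off * b
# integrated with a forward-Euler step. Inverting this relation, as the
# paper does probabilistically, recovers c(t) from the measured b(t).
def simulate_binding(c, k_on, k_off, b_max, dt, steps):
    b, trace = 0.0, []
    for _ in range(steps):
        b += dt * (k_on * c * (b_max - b) - k_off * b)
        trace.append(b)
    return trace

# c = 1 nM with Kd = k_off / k_on = 1 nM, so equilibrium binding is
# b_max * c / (c + Kd) = 0.5.
trace = simulate_binding(c=1e-9, k_on=1e6, k_off=1e-3, b_max=1.0,
                         dt=1.0, steps=5000)
equilibrium = 1.0 * 1e-9 / (1e-9 + 1e-3 / 1e6)
```

Transient secretions like those reported above show up as departures of b(t) from this equilibrium curve, which is what makes time-resolved concentration estimation possible.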
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue microstructure. By fitting a model to the dMRI signal it is possible to derive various quantitative features. Several of the most popular dMRI signal models are expansions in an appropriately chosen basis, where the coefficients are determined using some variation of least-squares. However, such approaches lack any notion of uncertainty, which could be valuable in e.g. group analyses. In this work, we use a probabilistic interpretation of linear least-squares methods to recast popular dMRI models as Bayesian ones. This makes it possible to quantify the uncertainty of any derived quantity. In particular, for quantities that are affine functions of the coefficients, the posterior distribution can be expressed in closed-form. We simulated measurements from single- and double-tensor models where the correct values of several quantities are known, to validate that the theoretically derived quantiles agree with those observed empirically. We included results from residual bootstrap for comparison and found good agreement. The validation employed several different models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI) and Constrained Spherical Deconvolution (CSD). We also used in vivo data to visualize maps of quantitative features and corresponding uncertainties, and to show how our approach can be used in a group analysis to downweight subjects with high uncertainty. In summary, we convert successful linear models for dMRI signal estimation to probabilistic models, capable of accurate uncertainty quantification.
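The probabilistic reading of linear least squares used above has a simple closed form: with Gaussian noise of known variance and a Gaussian prior on a coefficient, the posterior is Gaussian. The scalar sketch below uses synthetic numbers, not dMRI data or any of the named models.

```python
# Conjugate Bayesian linear regression for a single coefficient w in
# y = w * x + noise, noise ~ N(0, noise_var), prior w ~ N(0, prior_var).
# The posterior is Gaussian with closed-form mean and variance.
def posterior(xs, ys, noise_var, prior_var):
    precision = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    mean = (sum(x * y for x, y in zip(xs, ys)) / noise_var) / precision
    return mean, 1.0 / precision   # posterior mean and variance

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.1, 1.9, 3.2]   # roughly y = x plus noise (invented)
mean, variance = posterior(xs, ys, noise_var=0.04, prior_var=10.0)
```

The posterior variance is the uncertainty measure that plain least squares lacks; for quantities that are affine in the coefficients, it propagates to them in closed form, as the paper exploits.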
Modelling biological associations or dependencies using linear regression is often complicated when the analyzed data-sets are high-dimensional, with fewer observations than variables (n ≪ p). For genomic data-sets, penalized regression methods have been applied to address this issue. Recently proposed regression models utilize prior knowledge of dependencies, e.g. in the form of graphs, arguing that this information leads to more reliable estimates of regression coefficients. However, none of the proposed models for multivariate genomic response variables has been implemented as a computationally efficient, freely available library. In this paper we propose netReg, a package for graph-penalized regression models that use large networks and thousands of variables. netReg incorporates a priori generated biological graph information into linear models, yielding sparse or smooth solutions for regression coefficients.
Relative transcript abundance has proven to be a valuable tool for understanding the function of genes in biological systems. For the differential analysis of transcript abundance using RNA sequencing data, the negative binomial model is by far the most frequently adopted. However, common methods that are based on a negative binomial model are not robust to extreme outliers, which we found to be abundant in public datasets. So far, no rigorous and probabilistic methods for detection of outliers have been developed for RNA sequencing data, leaving the identification mostly to visual inspection. Recent advances in Bayesian computation allow large-scale comparison of observed data against its theoretical distribution given in a statistical model. Here we propose ppcseq, a key quality-control tool for identifying transcripts that include outlier data points in differential expression analysis, which do not follow a negative binomial distribution. Applying ppcseq to analyse several publicly available datasets using popular tools, we show that from 3 to 10 percent of differentially abundant transcripts across algorithms and datasets had statistics inflated by the presence of outliers.
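A simplified version of the outlier check described above is to ask how improbable an observed count is under the fitted negative binomial. The size and probability parameters below are assumed for illustration; ppcseq itself performs full Bayesian posterior predictive checks rather than this point-estimate tail test.

```python
# Flag a count as an outlier if its upper-tail probability under a
# fitted negative binomial distribution is very small.
from math import comb

def nb_pmf(k, r, p):
    """Negative binomial pmf with integer dispersion r, success prob p."""
    return comb(k + r - 1, k) * (p ** r) * ((1 - p) ** k)

def upper_tail(k, r, p, support=2000):
    """P(X >= k), summing the pmf over a finite support."""
    return sum(nb_pmf(j, r, p) for j in range(k, support))

# Fitted NB with mean r*(1-p)/p = 5*0.9/0.1 = 45 and variance 450:
# is an observed count of 300 consistent with this distribution?
tail = upper_tail(300, r=5, p=0.1)
is_outlier = tail < 1e-3
```

Such inflated counts are exactly the points that distort the test statistics of standard negative-binomial differential expression methods, motivating their detection and exclusion.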
High-throughput metabolomics data provide a detailed molecular window into biological processes. We consider the problem of assessing how the association of metabolite levels with individual (sample) characteristics, such as sex or treatment, may depend on metabolite characteristics such as pathway. Typically this is done in a two-step process: in the first step, we assess the association of each metabolite with individual characteristics; in the second step, an enrichment analysis is performed by metabolite characteristics among the significant associations. We combine the two steps using a bilinear model based on the matrix linear model (MLM) framework we previously developed for high-throughput genetic screens. Our framework can estimate relationships in metabolites sharing known characteristics, whether categorical (such as type of lipid or pathway) or numerical (such as number of double bonds in triglycerides). We demonstrate how MLM offers flexibility and interpretability by applying our method to three metabolomic studies. We show that our approach can separate the contribution of overlapping triglyceride characteristics, such as the number of double bonds and the number of carbon atoms. The proposed method has been implemented in the open-source Julia package MatrixLM. Data analysis scripts with example analyses are also available.
Longitudinal studies are commonly used to examine possible causal factors associated with human health and disease. However, the statistical models, such as two-way ANOVA, often applied in these studies do not appropriately model the experimental design, resulting in biased and imprecise results. Here, we describe the linear mixed effects (LME) model and how to use it for longitudinal studies. We re-analyze a dataset published by Blanton et al. in 2016 that modeled growth trajectories in mice after microbiome implantation from nourished or malnourished children. We compare the fit and stability of different parameterizations of ANOVA and LME models; most models found that the nourished versus malnourished growth trajectories differed significantly. We show through simulation that the results from the two-way ANOVA and LME models are not always consistent. Incorrectly modeling correlated data can result in increased rates of false positives or false negatives, supporting the need to model correlated data correctly. We provide an interactive Shiny App to enable accessible and appropriate analysis of longitudinal data using LME models.
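The cost of ignoring within-cluster correlation, as the comparison above highlights, can be seen in a small simulation: repeated measures from the same animal are correlated, so treating them as independent understates the variance of the group mean. All values below are toys, not the Blanton et al. data.

```python
# Empirical standard error of a group mean under clustered data versus
# the naive standard error that treats all observations as independent.
import random
random.seed(2)

def group_mean_sd(n_clusters, per_cluster, cluster_sd, noise_sd, reps=2000):
    means = []
    for _ in range(reps):
        vals = []
        for _ in range(n_clusters):
            u = random.gauss(0, cluster_sd)            # shared cluster effect
            vals += [u + random.gauss(0, noise_sd)
                     for _ in range(per_cluster)]      # repeated measures
        means.append(sum(vals) / len(vals))
    m = sum(means) / reps
    return (sum((x - m) ** 2 for x in means) / (reps - 1)) ** 0.5

# 5 mice, 10 measurements each.
empirical = group_mean_sd(5, 10, cluster_sd=1.0, noise_sd=1.0)
naive = ((1.0 ** 2 + 1.0 ** 2) / 50) ** 0.5   # pretends 50 independent points
```

The empirical standard error is far larger than the naive one, so a two-way ANOVA that ignores the mouse-level grouping will be overconfident, producing the inflated false-positive rates described above; the LME model's random effect for mouse restores the correct variance.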
The time-scale hierarchies of a very general class of models in differential equations are analyzed. Classical methods for model reduction and time-scale analysis have been adapted to this formalism, and a complementary method is proposed. A unified theoretical treatment shows how the structure of the system can be much better understood by inspection of two sets of singular values: one related to the stoichiometric structure of the system and another to its kinetics. The methods are exemplified first through a toy model, then through a large synthetic network, and finally through numerical simulations of three classical benchmark models of real biological systems.
A structurally diverse dataset of 530 polo-like kinase-1 (PLK1) inhibitors is compiled from the ChEMBL database and studied by means of a conformation-independent quantitative structure-activity relationship (QSAR) approach. A large number (26,761) of molecular descriptors are explored with the main intention of capturing the most relevant structural characteristics affecting the bioactivity. The structural descriptors are derived with different freeware, such as PaDEL, Mold², and QuBiLs-MAS; such descriptor software complements each other and improves the QSAR results. The best multivariable linear regression models are found with the replacement method variable subset selection technique. The balanced subsets method partitions the dataset into training, validation, and test sets. It is found that the proposed linear QSAR model improves previously reported models by leading to a simpler alternative structure-activity relationship.
Linear mixed effect models are powerful tools used to account for population structure in genome-wide association studies (GWASs) and estimate the genetic architecture of complex traits. However, fully-specified models are computationally demanding and common simplifications often lead to reduced power or biased inference. We describe Grid-LMM (https://github.com/deruncie/GridLMM), an extendable algorithm for repeatedly fitting complex linear models that account for multiple sources of heterogeneity, such as additive and non-additive genetic variance, spatial heterogeneity, and genotype-environment interactions. Grid-LMM can compute approximate (yet highly accurate) frequentist test statistics or Bayesian posterior summaries at a genome-wide scale in a fraction of the time compared to existing general-purpose methods. We apply Grid-LMM to two types of quantitative genetic analyses. The first is focused on accounting for spatial variability and non-additive genetic variance while scanning for QTL; and the second aims to identify gene expression traits affected by non-additive genetic variation. In both cases, modeling multiple sources of heterogeneity leads to new discoveries.