This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

Page 1: showing papers 1 to 20 of 1,683

Understanding Cybersecurity Threat Trends Through Dynamic Topic Modeling.

  • Jennifer Sleeman et al.
  • Frontiers in big data
  • 2021

Cybersecurity threats continue to increase and are impacting almost all aspects of modern life. Being aware of how vulnerabilities and their exploits are changing gives helpful insights into combating new threats. Applying dynamic topic modeling to a time-stamped cybersecurity document collection shows how the significance and details of concepts found in them are evolving. We correlate two different temporal corpora, one with reports about specific exploits and the other with research-oriented papers on cybersecurity vulnerabilities and threats. We represent the documents, concepts, and dynamic topic modeling data in a semantic knowledge graph to support integration, inference, and discovery. A critical insight into discovering knowledge through topic modeling is seeding the knowledge graph with domain concepts to guide the modeling process. We use Wikipedia concepts to provide a basis for performing concept phrase extraction and show how using those phrases improves the quality of the topic models. Researchers can query the resulting knowledge graph to reveal important relations and trends. This work is novel because it uses topics as a bridge to relate documents across corpora over time.


Assessing systemic risk in financial markets using dynamic topic networks.

  • Mike K P So et al.
  • Scientific reports
  • 2022

Systemic risk in financial markets refers to the breakdown of a financial system due to global events, catastrophes, or extreme incidents, leading to huge financial instability and losses. This study proposes a dynamic topic network (DTN) approach that combines topic modelling and network analysis to assess systemic risk in financial markets. We make use of Latent Dirichlet Allocation (LDA) to semantically analyse news articles, and the extracted topics then serve as input to construct topic similarity networks over time. Our results indicate how connected the topics are so that we can correlate any abnormal behaviours with volatility in the financial markets. With the 2015-2016 stock market selloff and COVID-19 as use cases, our results also suggest that the proposed DTN approach can provide an indication of (a) abnormal movement in the Dow Jones Industrial Average and (b) when the market would gradually begin to recover from such an event. From a practical risk management point of view, this analysis can be carried out on a daily basis when new data come in so that we can make use of the calculated metrics to predict real-time systemic risk in financial markets.


A Bibliometric Analysis on Cancer Population Science with Topic Modeling.

  • Ding-Cheng Li et al.
  • AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science
  • 2015

Bibliometric analysis is a research method used in library and information science to evaluate research performance. It applies quantitative and statistical analyses to describe patterns observed in a set of publications and can help identify previous, current, and future research trends or focus. To better guide our institutional strategic plan in cancer population science, we conducted a bibliometric analysis of publications by investigators currently funded by either the Division of Cancer Prevention (DCP) or the Division of Cancer Control and Population Sciences (DCCPS) at the National Cancer Institute. We applied two topic modeling techniques: author topic modeling (AT) and dynamic topic modeling (DTM). Our initial results show that AT can reasonably address issues related to investigators' research interests, research topic distributions, and topic popularity. In complement, DTM can address the evolving trend of each topic by displaying changes in the proportions of keywords, which are consistent with the changes in MeSH headings.


Mathematics Anxiety and Statistics Anxiety. Shared but Also Unshared Components and Antagonistic Contributions to Performance in Statistics.

  • Manuela Paechter et al.
  • Frontiers in psychology
  • 2017

In many social science majors, e.g., psychology, students report high levels of statistics anxiety. However, these majors are often chosen by students who are less prone to mathematics and who might have experienced difficulties and unpleasant feelings in their mathematics courses at school. The present study investigates whether statistics anxiety is a genuine form of anxiety that impairs students' achievements or whether learners mainly transfer previous experiences in mathematics and their anxiety in mathematics to statistics. The relationship between mathematics anxiety and statistics anxiety, their relationship to learning behaviors and to performance in a statistics examination were investigated in a sample of 225 undergraduate psychology students (164 women, 61 men). Data were recorded at three points in time: At the beginning of term students' mathematics anxiety, general proneness to anxiety, school grades, and demographic data were assessed; 2 weeks before the end of term, they completed questionnaires on statistics anxiety and their learning behaviors. At the end of term, examination scores were recorded. Mathematics anxiety and statistics anxiety correlated highly but the comparison of different structural equation models showed that they had genuine and even antagonistic contributions to learning behaviors and performance in the examination. Surprisingly, mathematics anxiety was positively related to performance. It might be that students realized over the course of their first term that knowledge and skills in higher secondary education mathematics are not sufficient to be successful in statistics. Part of mathematics anxiety may then have strengthened positive extrinsic effort motivation by the intention to avoid failure and may have led to higher effort for the exam preparation. However, via statistics anxiety mathematics anxiety also had a negative contribution to performance. 
Statistics anxiety led to higher procrastination in the structural equation model and, therefore, contributed indirectly and negatively to performance. Furthermore, it had a direct negative impact on performance (probably via increased tension and worry in the exam). The results of the study speak for shared but also unique components of statistics anxiety and mathematics anxiety. They are also important for instruction and give recommendations to learners as well as to instructors.


Zika discourse in the Americas: A multilingual topic analysis of Twitter.

  • Dasha Pruss et al.
  • PloS one
  • 2019

This work examines Twitter discussion surrounding the 2015 outbreak of Zika, a virus that is most often mild but has been associated with serious birth defects and neurological syndromes. We introduce and analyze a collection of 3.9 million tweets mentioning Zika geolocated to North and South America, where the virus is most prevalent. Using a multilingual topic model, we automatically identify and extract the key topics of discussion across the dataset in English, Spanish, and Portuguese. We examine the variation in Twitter activity across time and location, finding that rises in activity tend to follow major events, and that geographic rates of Zika-related discussion are moderately correlated with Zika incidence (ρ = .398).


Estimating colocalization probability from limited summary statistics.

  • Emily A King et al.
  • BMC bioinformatics
  • 2021

Colocalization is a statistical method used in genetics to determine whether the same variant is causal for multiple phenotypes, for example, complex traits and gene expression. It provides stronger mechanistic evidence than shared significance, which can be produced through separate causal variants in linkage disequilibrium. Current colocalization methods require full summary statistics for both traits, limiting their use with the majority of reported GWAS associations (e.g. GWAS Catalog). We propose a new approximation to the popular coloc method that can be applied when limited summary statistics are available. Our method (POint EstiMation of Colocalization, POEMColoc) imputes missing summary statistics for one or both traits using LD structure in a reference panel, and performs colocalization using the imputed summary statistics.


Biomedical Text Categorization Based on Ensemble Pruning and Optimized Topic Modelling.

  • Aytuğ Onan
  • Computational and mathematical methods in medicine
  • 2018

Text mining is an important research direction, which involves several fields, such as information retrieval, information extraction, and text categorization. In this paper, we propose an efficient multiple classifier approach to text categorization based on swarm-optimized topic modelling. Latent Dirichlet allocation (LDA) can overcome the high dimensionality problem of the vector space model, but identifying appropriate parameter values is critical to the performance of LDA. The swarm-optimized approach estimates the parameters of LDA, including the number of topics and all the other parameters involved in LDA. The hybrid ensemble pruning approach, based on combined diversity measures and clustering, aims to obtain a multiple classifier system with high predictive performance and better diversity. In this scheme, four different diversity measures (namely, disagreement measure, Q-statistics, the correlation coefficient, and the double fault measure) among classifiers of the ensemble are combined. Based on the combined diversity matrix, a swarm intelligence based clustering algorithm is employed to partition the classifiers into a number of disjoint groups, and one classifier (with the highest predictive performance) from each cluster is selected to build the final multiple classifier system. Experiments were conducted on five biomedical text benchmarks. In the swarm-optimized LDA, different metaheuristic algorithms (such as genetic algorithms, particle swarm optimization, firefly algorithm, cuckoo search algorithm, and bat algorithm) are considered. In the ensemble pruning, five metaheuristic clustering algorithms are evaluated. The experimental results on biomedical text benchmarks indicate that swarm-optimized LDA yields better predictive performance compared to the conventional LDA. In addition, the proposed multiple classifier system outperforms the conventional classification algorithms, ensemble learning, and ensemble pruning methods.


Topic modeling identifies novel genetic loci associated with multimorbidities in UK Biobank.

  • Yidong Zhang et al.
  • Cell genomics
  • 2023

Many diseases show patterns of co-occurrence, possibly driven by systemic dysregulation of underlying processes affecting multiple traits. We have developed a method (treeLFA) for identifying such multimorbidities from routine health-care data, which combines topic modeling with an informative prior derived from medical ontology. We apply treeLFA to UK Biobank data and identify a variety of topics representing multimorbidity clusters, including a healthy topic. We find that loci identified using topic weights as traits in a genome-wide association study (GWAS) analysis, which we validated with a range of approaches, only partially overlap with loci from GWASs on constituent single diseases. We also show that treeLFA improves upon existing methods like latent Dirichlet allocation in various ways. Overall, our findings indicate that topic models can characterize multimorbidity patterns and that genetic analysis of these patterns can provide insight into the etiology of complex traits that cannot be determined from the analysis of constituent traits alone.


Vaginal microbiome topic modeling of laboring Ugandan women with and without fever.

  • Mercedeh Movassagh et al.
  • NPJ biofilms and microbiomes
  • 2021

The composition of the maternal vaginal microbiome influences the duration of pregnancy, onset of labor, and even neonatal outcomes. Maternal microbiome research in sub-Saharan Africa has focused on non-pregnant and postpartum composition of the vaginal microbiome. Here we aimed to illustrate the relationship between the vaginal microbiome of 99 laboring Ugandan women and intrapartum fever using routine microbiology and 16S ribosomal RNA gene sequencing from two hypervariable regions (V1-V2 and V3-V4). To describe the vaginal microbes associated with vaginal microbial communities, we pursued two approaches: hierarchical clustering methods and a novel Grades of Membership (GoM) modeling approach for vaginal microbiome characterization. Leveraging GoM models, we created a basis composed of a preassigned number of microbial topics whose linear combination optimally represents each patient yielding more comprehensive associations and characterization between maternal clinical features and the microbial communities. Using a random forest model, we showed that by including microbial topic models we improved upon clinical variables to predict maternal fever. Overall, we found a higher prevalence of Granulicatella, Streptococcus, Fusobacterium, Anaerococcus, Sneathia, Clostridium, Gemella, Mobiluncus, and Veillonella genera in febrile mothers, and higher prevalence of Lactobacillus genera (in particular L. crispatus and L. jensenii), Acinobacter, Aerococcus, and Prevotella species in afebrile mothers. By including clinical variables with microbial topics in this model, we observed young maternal age, fever reported earlier in the pregnancy, longer labor duration, and microbial communities with reduced Lactobacillus diversity were associated with intrapartum fever. 
These results better defined relationships between the presence or absence of intrapartum fever, demographics, peripartum course, and vaginal microbial topics, and expanded our understanding of the impact of the microbiome on maternal and potentially neonatal outcome risk.


Deep learning for COVID-19 topic modelling via Twitter: Alpha, Delta and Omicron.

  • Janhavi Lande et al.
  • PloS one
  • 2023

Topic modelling with innovative deep learning methods has gained interest for a wide range of applications, including COVID-19. It can provide psychological, social and cultural insights for understanding human behaviour in extreme events such as the COVID-19 pandemic. In this paper, we use prominent deep learning-based language models for COVID-19 topic modelling, taking into account data from the emergence (Alpha) to the Omicron variant in India. Our results show that the topics extracted for the subsequent waves had certain overlapping themes such as governance, vaccination, and pandemic management, while novel issues arose in political, social and economic situations during the COVID-19 pandemic. We also find a strong correlation between the major topics and the news media prevalent during the respective time period. Hence, our framework has the potential to capture major issues arising during different phases of the COVID-19 pandemic and can be extended to other countries and regions.


Topic model-based mass spectrometric data analysis in cancer biomarker discovery studies.

  • Minkun Wang et al.
  • BMC genomics
  • 2016

A fundamental challenge in the quantitation of biomolecules for cancer biomarker discovery stems from the heterogeneous nature of human biospecimens. Although this issue has been a subject of discussion in cancer genomic studies, it has not yet been rigorously investigated in mass spectrometry based proteomic and metabolomic studies. Purification of mass spectrometric data is highly desirable prior to subsequent analysis, e.g., quantitative comparison of the abundance of biomolecules in biological samples.


Sharing GWAS summary statistics results in more citations.

  • Guillermo Reales et al.
  • Communications biology
  • 2023

Rates of sharing of genome-wide association studies (GWAS) summary statistics are historically low, limiting potential for scientific discovery. Here we show, using GWAS Catalog data, that GWAS papers that share data get on average 81.8% more citations, an effect that is sustained over time.


Effects of individual health topic familiarity on activity patterns during health information searches.

  • Ira Puspitasari et al.
  • JMIR medical informatics
  • 2015

Non-medical professionals (consumers) are increasingly using the Internet to support their health information needs. However, the cognitive effort required to perform health information searches is affected by the consumer's familiarity with health topics. Consumers may have different levels of familiarity with individual health topics. This variation in familiarity may cause misunderstandings because the information presented by search engines may not be understood correctly by the consumers.


Topical Imiquimod Treatment of High-grade Cervical Intraepithelial Neoplasia (TOPIC-3): A Nonrandomized Multicenter Study.

  • Natasja Hendriks et al.
  • Journal of immunotherapy (Hagerstown, Md. : 1997)
  • 2022

Topical imiquimod could be an alternative, noninvasive, treatment modality for high-grade cervical intraepithelial neoplasia (CIN). However, evidence is limited, and there are no studies that compared treatment effectiveness and side effects of topical imiquimod cream to standard large loop excision of the transformation zone (LLETZ) treatment. A multi-center, nonrandomized controlled trial was performed among women with a histologic diagnosis of CIN 2/3. Women were treated with either vaginal imiquimod (6.25 mg 3 times weekly for 8 to 16 wk) or LLETZ according to their own preference. Successful treatment was defined as the absence of high-grade dysplasia at the first follow-up interval after treatment (at 20 wk for the imiquimod group and at 26 wk for the LLETZ group). Secondary outcome measures were high-risk human papillomavirus (hrHPV) clearance, side effects, and predictive factors for successful imiquimod treatment. Imiquimod treatment was successful in 60% of women who completed imiquimod treatment and 95% of women treated with LLETZ. hrHPV clearance occurred in 69% and 67% in the imiquimod group and LLETZ group, respectively. This study provides further evidence on topical imiquimod cream as a feasible and safe treatment modality for high-grade CIN. Although the effectiveness is considerably lower than LLETZ treatment, imiquimod treatment could prevent initial surgical treatment in over 40% of women and should be offered to a selected population of women who wish to avoid (repeated) surgical treatment of high-grade CIN.


Age-dependent topic modeling of comorbidities in UK Biobank identifies disease subtypes with differential genetic risk.

  • Xilin Jiang et al.
  • Nature genetics
  • 2023

The analysis of longitudinal data from electronic health records (EHRs) has the potential to improve clinical diagnoses and enable personalized medicine, motivating efforts to identify disease subtypes from patient comorbidity information. Here we introduce an age-dependent topic modeling (ATM) method that provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large EHR datasets. We applied ATM to 282,957 UK Biobank samples, identifying 52 diseases with heterogeneous comorbidity profiles; analyses of 211,908 All of Us samples produced concordant results. We defined subtypes of the 52 heterogeneous diseases based on their comorbidity profiles and compared genetic risk across disease subtypes using polygenic risk scores (PRSs), identifying 18 disease subtypes whose PRS differed significantly from other subtypes of the same disease. We further identified specific genetic variants with subtype-dependent effects on disease risk. In conclusion, ATM identifies disease subtypes with differential genome-wide and locus-specific genetic risk profiles.


Models of archaic admixture and recent history from two-locus statistics.

  • Aaron P Ragsdale et al.
  • PLoS genetics
  • 2019

We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4 - 8% genetic ancestry to individuals in world-wide populations.


Looking for Image Statistics: Active Vision With Avatars in a Naturalistic Virtual Environment.

  • Dominik Straub et al.
  • Frontiers in psychology
  • 2021

The efficient coding hypothesis posits that sensory systems are tuned to the regularities of their natural input. The statistics of natural image databases have been the topic of many studies, which have revealed biases in the distribution of orientations that are related to neural representations as well as behavior in psychophysical tasks. However, commonly used natural image databases contain images taken with a camera with a planar image sensor and limited field of view. Thus, these images do not incorporate the physical properties of the visual system and its active use reflecting body and eye movements. Here, we investigate quantitatively, whether the active use of the visual system influences image statistics across the visual field by simulating visual behaviors in an avatar in a naturalistic virtual environment. Images with a field of view of 120° were generated during exploration of a virtual forest environment both for a human and cat avatar. The physical properties of the visual system were taken into account by projecting the images onto idealized retinas according to models of the eyes' geometrical optics. Crucially, different active gaze behaviors were simulated to obtain image ensembles that allow investigating the consequences of active visual behaviors on the statistics of the input to the visual system. In the central visual field, the statistics of the virtual images matched photographic images regarding their power spectra and a bias in edge orientations toward cardinal directions. At larger eccentricities, the cardinal bias was superimposed with a gradually increasing radial bias. The strength of this effect depends on the active visual behavior and the physical properties of the eye. There were also significant differences between the upper and lower visual field, which became stronger depending on how the environment was actively sampled. 
Taken together, the results show that quantitatively relating natural image statistics to neural representations and psychophysical behavior requires taking into account not only the structure of the environment but also the physical properties of the visual system and its active use in behavior.


In their own words: Topic analysis of the motivations and strategies of over 6,000 long-term weight-loss maintainers.

  • Suzanne Phelan et al.
  • Obesity (Silver Spring, Md.)
  • 2022

This study aimed to identify major themes of a large cohort experiencing long-term weight-loss maintenance who answered open-ended questions about weight-loss triggers, current motivations, strategies, and experiences.


Is useful research data usually shared? An investigation of genome-wide association study summary statistics.

  • Mike Thelwall et al.
  • PloS one
  • 2020

Primary data collected during a research study is often shared and may be reused for new studies. To assess the extent of data sharing in favourable circumstances and whether data sharing checks can be automated, this article investigates summary statistics from primary human genome-wide association studies (GWAS). This type of data is highly suitable for sharing because it is a standard research output, is straightforward to use in future studies (e.g., for secondary analysis), and may be already stored in a standard format for internal sharing within multi-site research projects. Manual checks of 1799 articles from 2010 and 2017 matching a simple PubMed query for molecular epidemiology GWAS were used to identify 314 primary human GWAS papers. Of these, only 13% reported the location of a complete set of GWAS summary data, increasing from 3% in 2010 to 23% in 2017. Whilst information about whether data was shared was typically located clearly within a data availability statement, the exact nature of the shared data was usually unspecified. Thus, data sharing is the exception even in suitable research fields with relatively strong data sharing norms. Moreover, the lack of clear data descriptions within data sharing statements greatly complicates the task of automatically characterising shared data sets.


Integrative PheWAS analysis in risk categorization of major depressive disorder and identifying their associations with genetic variants using a latent topic model approach.

  • Xiangfei Meng et al.
  • Translational psychiatry
  • 2022

Major depressive disorder (MDD) is the most prevalent mental disorder and constitutes a major public health problem. A tool for predicting the risk of MDD could assist with the early identification of MDD patients and targeted interventions to reduce the risk. We aimed to derive a risk prediction tool that can categorize the risk of MDD as well as discover biologically meaningful genetic variants. Data analyzed were from the fourth and fifth data collections of a longitudinal community-based cohort from Southwest Montreal, Canada, between 2015 and 2018. To account for high dimensional features, we adopted a latent topic model approach to infer a set of topical distributions over those studied predictors that characterize the underlying meta-phenotypes of the MDD cohort. MDD probability derived from 30 MDD meta-phenotypes demonstrated superior prediction accuracy to differentiate MDD cases and controls. Six latent MDD meta-phenotypes we inferred via a latent topic model were highly interpretable. We then explored potential genetic variants that were statistically associated with these MDD meta-phenotypes. The genetic heritability of MDD meta-phenotypes was 0.126 (SE = 0.316), compared to 0.000001 (SE = 0.297) for MDD diagnosis defined by the structured interviews. We discovered a list of significant MDD-related genes and pathways that were missed by MDD diagnosis. Our risk prediction model confers not only accurate MDD risk categorization but also meaningful associations with genetic predispositions that are linked to MDD subtypes. Our findings shed light on future research focusing on these identified genes and pathways for MDD subtypes.


