Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs.

Nature Communications | 2021

The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods that assess variants' effects on gene expression in the native chromatin context via direct perturbation are low-throughput. As a result, existing high-throughput computational predictors have lacked large gold-standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6,121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with that of other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
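
The idea of using a functional score as a prior for statistical fine-mapping can be illustrated with a minimal sketch. It assumes a single causal variant per locus, so that each variant's posterior inclusion probability is proportional to its prior weight times its Bayes factor. This is a simplification for illustration only and is not taken from the publication: the function name, the input values, and the single-causal-variant assumption are all hypothetical, and the abstract does not specify which fine-mapping method was used.

```python
import math


def posterior_inclusion_probabilities(log_bayes_factors, prior_weights):
    """Posterior inclusion probabilities under a single-causal-variant model.

    log_bayes_factors: natural-log Bayes factor per variant (hypothetical inputs).
    prior_weights: non-negative prior weight per variant, e.g. a functional
        score; the weights are normalized to sum to 1 inside the function.
    """
    if len(log_bayes_factors) != len(prior_weights):
        raise ValueError("inputs must have the same length")
    total = sum(prior_weights)
    if total <= 0 or any(w < 0 for w in prior_weights):
        raise ValueError("prior weights must be non-negative and not all zero")

    # Log of the unnormalized posterior: log BF + log prior.
    log_post = [
        lbf + math.log(w / total) if w > 0 else -math.inf
        for lbf, w in zip(log_bayes_factors, prior_weights)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_post)
    unnormalized = [math.exp(lp - top) for lp in log_post]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]


# Made-up example: two variants with identical association evidence
# (for instance, in tight linkage disequilibrium) and one weaker variant.
log_bfs = [2.0, 2.0, 0.5]
uniform = posterior_inclusion_probabilities(log_bfs, [1, 1, 1])
informed = posterior_inclusion_probabilities(log_bfs, [0.1, 0.8, 0.1])
```

With a uniform prior, the first two variants are indistinguishable. With the informed prior, the variant carrying the larger prior weight receives most of the posterior mass, which is how a functional prior can help resolve variants that association statistics alone cannot separate.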

PubMed ID: 34099641

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

  • Agency: NIH HHS, United States
    Id: DP5 OD024582
  • Agency: Medical Research Council, United Kingdom
    Id: MC_PC_17228
  • Agency: Medical Research Council, United Kingdom
    Id: MC_QA137853

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


seaborn (tool)

RRID:SCR_018132

Python data visualization library based on matplotlib. Provides an interface for drawing attractive and informative statistical graphics.


Pandas (tool)

RRID:SCR_018214

Python package for data analysis that provides labeled data structures similar to R data frames. Its data structures are designed to make working with relational or labeled data straightforward. It is intended as a building block for practical, real-world, open source data analysis and manipulation.
