This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.

On page 1 showing 1 ~ 19 papers out of 19 papers

Assessment of 2q23.1 microdeletion syndrome implicates MBD5 as a single causal locus of intellectual disability, epilepsy, and autism spectrum disorder.

  • Michael E Talkowski et al.
  • American journal of human genetics
  • 2011

Persons with neurodevelopmental disorders or autism spectrum disorder (ASD) often harbor chromosomal microdeletions, yet the individual genetic contributors within these regions have not been systematically evaluated. We established a consortium of clinical diagnostic and research laboratories to accumulate a large cohort with genetic alterations of chromosomal region 2q23.1 and acquired 65 subjects with microdeletion or translocation. We sequenced translocation breakpoints; aligned microdeletions to determine the critical region; assessed effects on mRNA expression; and examined medical records, photos, and clinical evaluations. We identified a single gene, methyl-CpG-binding domain 5 (MBD5), as the only locus that defined the critical region. Partial or complete deletion of MBD5 was associated with haploinsufficiency of mRNA expression, intellectual disability, epilepsy, and autistic features. Fourteen alterations, including partial deletions of noncoding regions not typically captured or considered pathogenic by current diagnostic screening, disrupted MBD5 alone. Expression profiles and clinical characteristics were largely indistinguishable between MBD5-specific alteration and deletion of the entire 2q23.1 interval. No copy-number alterations of MBD5 were observed in 7878 controls, suggesting MBD5 alterations are highly penetrant. We surveyed MBD5 coding variations among 747 ASD subjects compared to 2043 non-ASD subjects analyzed by whole-exome sequencing and detected an association with a highly conserved methyl-CpG-binding domain missense variant, p.79Gly>Glu (c.236G>A) (p = 0.012). These results suggest that genetic alterations of MBD5 cause features of 2q23.1 microdeletion syndrome and that this epigenetic regulator significantly contributes to ASD risk, warranting further consideration in research and clinical diagnostic screening and highlighting the importance of chromatin remodeling in the etiology of these complex disorders.


SpeedSeq: ultra-fast personal genome analysis and interpretation.

  • Colby Chiang et al.
  • Nature methods
  • 2015

SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 h on a low-cost server and alleviates a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers performance competitive with or superior to current methods for detecting germline and somatic single-nucleotide variants, structural variants, insertions and deletions, and it includes novel functionality for streamlined interpretation.


The impact of structural variation on human gene expression.

  • Colby Chiang et al.
  • Nature genetics
  • 2017

Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5-6.8% of eQTLs (a substantially higher fraction than prior estimates) and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.


svtools: population-scale analysis of structural variation.

  • David E Larson et al.
  • Bioinformatics (Oxford, England)
  • 2019

Large-scale human genetics studies are now employing whole genome sequencing with the goal of conducting comprehensive trait mapping analyses of all forms of genome variation. However, methods for structural variation (SV) analysis have lagged far behind those for smaller scale variants, and there is an urgent need to develop more efficient tools that scale to the size of human populations. Here, we present a fast and highly scalable software toolkit (svtools) and cloud-based pipeline for assembling high quality SV maps (including deletions, duplications, mobile element insertions, inversions and other rearrangements) in many thousands of human genomes. We show that this pipeline achieves similar variant detection performance to established per-sample methods (e.g. LUMPY), while providing fast and affordable joint analysis at the scale of ≥100 000 genomes. These tools will help enable the next generation of human genetics studies.


The genome of the vervet (Chlorocebus aethiops sabaeus).

  • Wesley C Warren et al.
  • Genome research
  • 2015

We describe a genome reference of the African green monkey or vervet (Chlorocebus aethiops). This member of the Old World monkey (OWM) superfamily is uniquely valuable for genetic investigations of simian immunodeficiency virus (SIV), for which it is the most abundant natural host species, and of a wide range of health-related phenotypes assessed in Caribbean vervets (C. a. sabaeus), whose numbers have expanded dramatically since Europeans introduced small numbers of their ancestors from West Africa during the colonial era. We use the reference to characterize the genomic relationship between vervets and other primates, the intra-generic phylogeny of vervet subspecies, and genome-wide structural variations of a pedigreed C. a. sabaeus population. Through comparative analyses with human and rhesus macaque, we characterize at high resolution the unique chromosomal fission events that differentiate the vervets and their close relatives from most other catarrhine primates, in whom karyotype is highly conserved. We also provide a summary of transposable elements and contrast these with the rhesus macaque and human. Analysis of sequenced genomes representing each of the main vervet subspecies supports previously hypothesized relationships between these populations, which range across most of sub-Saharan Africa, while uncovering high levels of genetic diversity within each. Sequence-based analyses of major histocompatibility complex (MHC) polymorphisms reveal extremely low diversity in Caribbean C. a. sabaeus vervets, compared to vervets from putatively ancestral West African regions. In the C. a. sabaeus research population, we discover the first structural variations that are, in some cases, predicted to have a deleterious effect; future studies will determine the phenotypic impact of these variations.


The impact of rare variation on gene expression across tissues.

  • Xin Li et al.
  • Nature
  • 2017

Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.


Association of structural variation with cardiometabolic traits in Finns.

  • Lei Chen et al.
  • American journal of human genetics
  • 2021

The contribution of genome structural variation (SV) to quantitative traits associated with cardiometabolic diseases remains largely unknown. Here, we present the results of a study examining genetic association between SVs and cardiometabolic traits in the Finnish population. We used sensitive methods to identify and genotype 129,166 high-confidence SVs from deep whole-genome sequencing (WGS) data of 4,848 individuals. We tested the 64,572 common and low-frequency SVs for association with 116 quantitative traits and tested candidate associations using exome sequencing and array genotype data from an additional 15,205 individuals. We discovered 31 genome-wide significant associations at 15 loci, including 2 loci at which SVs have strong phenotypic effects: (1) a deletion of the ALB promoter that is greatly enriched in the Finnish population and causes decreased serum albumin level in carriers (p = 1.47 × 10⁻⁵⁴) and is also associated with increased levels of total cholesterol (p = 1.22 × 10⁻²⁸) and 14 additional cholesterol-related traits, and (2) a multi-allelic copy number variant (CNV) at PDPR that is strongly associated with pyruvate (p = 4.81 × 10⁻²¹) and alanine (p = 6.14 × 10⁻¹²) levels and resides within a structurally complex genomic region that has accumulated many rearrangements over evolutionary time. We also confirmed six previously reported associations, including five led by stronger signals in single nucleotide variants (SNVs) and one linking recurrent HP gene deletion and cholesterol levels (p = 6.24 × 10⁻¹⁰), which was also found to be strongly associated with increased glycoprotein level (p = 3.53 × 10⁻³⁵). Our study confirms that integrating SVs in trait-mapping studies will expand our knowledge of genetic factors underlying disease risk.


Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection.

  • Martin Jinye Zhang et al.
  • medRxiv : the preprint server for health sciences
  • 2023

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.


Potential molecular consequences of transgene integration: The R6/2 mouse example.

  • Jessie C Jacobsen et al.
  • Scientific reports
  • 2017

Integration of exogenous DNA into a host genome represents an important route to generate animal and cellular models for exploration into human disease and therapeutic development. In most models, little is known concerning structural integrity of the transgene, precise site of integration, or its impact on the host genome. We previously used whole-genome and targeted sequencing approaches to reconstruct transgene structure and integration sites in models of Huntington's disease, revealing complex structural rearrangements that can result from transgenesis. Here, we demonstrate in the R6/2 mouse, a widely used Huntington's disease model, that integration of a rearranged transgene with coincident deletion of 5,444 bp of host genome within the gene Gm12695 has striking molecular consequences. Gm12695, the function of which is unknown, is normally expressed at negligible levels in mouse brain, but transgene integration has resulted in cortical expression of a partial fragment (exons 8-11) 3' to the transgene integration site in R6/2. This transcript shows significant expression among the extensive network of differentially expressed genes associated with this model, including synaptic transmission, cell signalling and transcription. These data illustrate the value of sequence-level resolution of transgene insertions and transcription analysis to inform phenotypic characterization of transgenic models utilized in therapeutic research.


Mutations in DCHS1 cause mitral valve prolapse.

  • Ronen Durst et al.
  • Nature
  • 2015

Mitral valve prolapse (MVP) is a common cardiac valve disease that affects nearly 1 in 40 individuals. It can manifest as mitral regurgitation and is the leading indication for mitral valve surgery. Despite a clear heritable component, the genetic aetiology leading to non-syndromic MVP has remained elusive. Four affected individuals from a large multigenerational family segregating non-syndromic MVP underwent capture sequencing of the linked interval on chromosome 11. We report a missense mutation in the DCHS1 gene, the human homologue of the Drosophila cell polarity gene dachsous (ds), that segregates with MVP in the family. Morpholino knockdown of the zebrafish homologue dachsous1b resulted in a cardiac atrioventricular canal defect that could be rescued by wild-type human DCHS1, but not by DCHS1 messenger RNA with the familial mutation. Further genetic studies identified two additional families in which a second deleterious DCHS1 mutation segregates with MVP. Both DCHS1 mutations reduce protein stability as demonstrated in zebrafish, cultured cells and, notably, in mitral valve interstitial cells (MVICs) obtained during mitral valve repair surgery of a proband. Dchs1(+/-) mice had prolapse of thickened mitral leaflets, which could be traced back to developmental errors in valve morphogenesis. DCHS1 deficiency in MVP patient MVICs, as well as in Dchs1(+/-) mouse MVICs, result in altered migration and cellular patterning, supporting these processes as aetiological underpinnings for the disease. Understanding the role of DCHS1 in mitral valve development and MVP pathogenesis holds potential for therapeutic insights for this very common disease.


LUMPY: a probabilistic framework for structural variant discovery.

  • Ryan M Layer et al.
  • Genome biology
  • 2014

Comprehensive discovery of structural variation (SV) from whole genome sequencing data requires multiple detection signals including read-pair, split-read, read-depth and prior knowledge. Owing to technical challenges, extant SV discovery algorithms either use one signal in isolation, or at best use two sequentially. We present LUMPY, a novel SV discovery framework that naturally integrates multiple SV signals jointly across multiple samples. We show that LUMPY yields improved sensitivity, especially when SV signal is reduced owing to either low coverage data or low intra-sample variant allele frequency. We also report a set of 4,564 validated breakpoints from the NA12878 human genome. https://github.com/arq5x/lumpy-sv.


Disruption of a large intergenic noncoding RNA in subjects with neurodevelopmental disabilities.

  • Michael E Talkowski et al.
  • American journal of human genetics
  • 2012

Large intergenic noncoding (linc) RNAs represent a newly described class of ribonucleic acid whose importance in human disease remains undefined. We identified a severely developmentally delayed 16-year-old female with karyotype 46,XX,t(2;11)(p25.1;p15.1)dn in the absence of clinically significant copy number variants (CNVs). DNA capture followed by next-generation sequencing of the translocation breakpoints revealed disruption of a single noncoding gene on chromosome 2, LINC00299, whose RNA product is expressed in all tissues measured, but most abundantly in brain. Among a series of additional, unrelated subjects referred for clinical diagnostic testing who showed CNV affecting this locus, we identified four with exon-crossing deletions in association with neurodevelopmental abnormalities. No disruption of the LINC00299 coding sequence was seen in almost 14,000 control subjects. Together, these subjects with disruption of LINC00299 implicate this particular noncoding RNA in brain development and raise the possibility that, as a class, abnormalities of lincRNAs may play a significant role in human developmental disorders.


Molecular analysis of a deletion hotspot in the NRXN1 region reveals the involvement of short inverted repeats in deletion CNVs.

  • Xiaoli Chen et al.
  • American journal of human genetics
  • 2013

NRXN1 microdeletions occur at a relatively high frequency and confer increased risk for neurodevelopmental and neurobehavioral abnormalities. The mechanism that makes NRXN1 a deletion hotspot is unknown. Here, we identified deletions of the NRXN1 region in affected cohorts, confirming a strong association with the autism spectrum and other neurodevelopmental disorders. Interestingly, deletions in both affected and control individuals were clustered in the 5' portion of NRXN1 and its immediate upstream region. To explore the mechanism of deletion, we mapped and analyzed the breakpoints of 32 deletions. At the deletion breakpoints, frequent microhomology (68.8%, 2-19 bp) suggested predominant mechanisms of DNA replication error and/or microhomology-mediated end-joining. Long terminal repeat (LTR) elements, unique non-B-DNA structures, and MEME-defined sequence motifs were significantly enriched, but Alu and LINE sequences were not. Importantly, small-size inverted repeats (minus self chains, minus sequence motifs, and partial complementary sequences) were significantly overrepresented in the vicinity of NRXN1 region deletion breakpoints, suggesting that, although they are not interrupted by the deletion process, such inverted repeats can predispose a region to genomic instability by mediating single-strand DNA looping via the annealing of partially reverse complementary strands and the promoting of DNA replication fork stalling and DNA replication error. Our observations highlight the potential importance of inverted repeats of variable sizes in generating a rearrangement hotspot in which individual breakpoints are not recurrent. Mechanisms that involve short inverted repeats in initiating deletion may also apply to other deletion hotspots in the human genome.


Next-generation sequencing strategies enable routine detection of balanced chromosome rearrangements for clinical diagnostics and genetic research.

  • Michael E Talkowski et al.
  • American journal of human genetics
  • 2011

The contribution of balanced chromosomal rearrangements to complex disorders remains unclear because they are not detected routinely by genome-wide microarrays and clinical localization is imprecise. Failure to consider these events bypasses a potentially powerful complement to single nucleotide polymorphism and copy-number association approaches to complex disorders, where much of the heritability remains unexplained. To capitalize on this genetic resource, we have applied optimized sequencing and analysis strategies to test whether these potentially high-impact variants can be mapped at reasonable cost and throughput. By using a whole-genome multiplexing strategy, rearrangement breakpoints could be delineated at a fraction of the cost of standard sequencing. For rearrangements already mapped regionally by karyotyping and fluorescence in situ hybridization, a targeted approach enabled capture and sequencing of multiple breakpoints simultaneously. Importantly, this strategy permitted capture and unique alignment of up to 97% of repeat-masked sequences in the targeted regions. Genome-wide analyses estimate that only 3.7% of bases should be routinely omitted from genomic DNA capture experiments. Illustrating the power of these approaches, the rearrangement breakpoints were rapidly defined to base pair resolution and revealed unexpected sequence complexity, such as co-occurrence of inversion and translocation as an underlying feature of karyotypically balanced alterations. These findings have implications ranging from genome annotation to de novo assemblies and could enable sequencing screens for structural variations at a cost comparable to that of microarrays in standard clinical practice.


Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection.

  • Martin Jinye Zhang et al.
  • Research square
  • 2023

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.


Identification of Drivers of Aneuploidy in Breast Tumors.

  • Katherine Pfister et al.
  • Cell reports
  • 2018

Although aneuploidy is found in the majority of tumors, the degree of aneuploidy varies widely. It is unclear how cancer cells become aneuploid or how highly aneuploid tumors are different from those of more normal ploidy. We developed a simple computational method that measures the degree of aneuploidy or structural rearrangements of large chromosome regions of 522 human breast tumors from The Cancer Genome Atlas (TCGA). Highly aneuploid tumors overexpress activators of mitotic transcription and the genes encoding proteins that segregate chromosomes. Overexpression of three mitotic transcriptional regulators, E2F1, MYBL2, and FOXM1, is sufficient to increase the rate of lagging anaphase chromosomes in a non-transformed vertebrate tissue, demonstrating that this event can initiate aneuploidy. Highly aneuploid human breast tumors are also enriched in TP53 mutations. TP53 mutations co-associate with the overexpression of mitotic transcriptional activators, suggesting that these events work together to provide fitness to breast tumors.


Exonic deletions in AUTS2 cause a syndromic form of intellectual disability and suggest a critical role for the C terminus.

  • Gea Beunders et al.
  • American journal of human genetics
  • 2013

Genomic rearrangements involving AUTS2 (7q11.22) are associated with autism and intellectual disability (ID), although evidence for causality is limited. By combining the results of diagnostic testing of 49,684 individuals, we identified 24 microdeletions that affect at least one exon of AUTS2, as well as one translocation and one inversion each with a breakpoint within the AUTS2 locus. Comparison of 17 well-characterized individuals enabled identification of a variable syndromic phenotype including ID, autism, short stature, microcephaly, cerebral palsy, and facial dysmorphisms. The dysmorphic features were more pronounced in persons with 3'AUTS2 deletions. This part of the gene is shown to encode a C-terminal isoform (with an alternative transcription start site) expressed in the human brain. Consistent with our genetic data, suppression of auts2 in zebrafish embryos caused microcephaly that could be rescued by either the full-length or the C-terminal isoform of AUTS2. Our observations demonstrate a causal role of AUTS2 in neurocognitive disorders, establish a hitherto unappreciated syndromic phenotype at this locus, and show how transcriptional complexity can underpin human pathology. The zebrafish model provides a valuable tool for investigating the etiology of AUTS2 syndrome and facilitating gene-function analysis in the future.


Mapping and characterization of structural variation in 17,795 human genomes.

  • Haley J Abel et al.
  • Nature
  • 2020

A key goal of whole-genome sequencing for studies of human genetics is to interrogate all forms of variation, including single-nucleotide variants, small insertion or deletion (indel) variants and structural variants. However, tools and resources for the study of structural variants have lagged behind those for smaller variants. Here we used a scalable pipeline to map and characterize structural variants in 17,795 deeply sequenced human genomes. We publicly release site-frequency data to create the largest, to our knowledge, whole-genome-sequencing-based structural variant resource so far. On average, individuals carry 2.9 rare structural variants that alter coding regions; these variants affect the dosage or structure of 4.2 genes and account for 4.0-11.2% of rare high-impact coding alleles. Using a computational model, we estimate that structural variants account for 17.2% of rare alleles genome-wide, with predicted deleterious effects that are equivalent to loss-of-function coding alleles; approximately 90% of such structural variants are noncoding deletions (mean 19.1 per genome). We report 158,991 ultra-rare structural variants and show that 2% of individuals carry ultra-rare megabase-scale structural variants, nearly half of which are balanced or complex rearrangements. Finally, we infer the dosage sensitivity of genes and noncoding elements, and reveal trends that relate to element class and conservation. This work will help to guide the analysis and interpretation of structural variants in the era of whole-genome sequencing.


Structural variants are a major source of gene expression differences in humans and often affect multiple nearby genes.

  • Alexandra J Scott et al.
  • Genome research
  • 2021

Structural variants (SVs) are an important source of human genome diversity, but their functional effects are poorly understood. We mapped 61,668 SVs in 613 individuals from the GTEx project and measured their effects on gene expression. We estimate that common SVs are causal at 2.66% of eQTLs, a 10.5-fold enrichment relative to their abundance in the genome. Duplications and deletions were the most impactful variant types, whereas the contribution of mobile element insertions was small (0.12% of eQTLs, 1.9-fold enriched). Multitissue analysis of eQTLs revealed that gene-altering SVs show more constitutive effects than other variant types, with 62.09% of coding SV-eQTLs active in all tissues with eQTL activity compared with 23.08% of coding SNV- and indel-eQTLs. Noncoding SVs, SNVs and indels show broadly similar patterns. We also identified 539 rare SVs associated with nearby gene expression outliers. Of these, 62.34% are noncoding SVs that affect gene expression but have modest enrichment at regulatory elements, showing that rare noncoding SVs are a major source of gene expression differences but remain difficult to predict from current annotations. Both common and rare SVs often affect the expression of multiple genes: SV-eQTLs affect an average of 1.82 nearby genes, whereas SNV- and indel-eQTLs affect an average of 1.09 genes, and 21.34% of rare expression-altering SVs show effects on two to nine different genes. We also observe significant effects on rare gene expression changes extending 1 Mb from the SV. This provides a mechanism by which individual SVs may have strong or pleiotropic effects on phenotypic variation.

