FDI Lab - SciCrunch.org | Searching in Literature

The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity.

Dominik G Grimm et al.
Human mutation
2015

Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen-2, SIFT, FatHMM, MutationTaster-2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. We here demonstrate in a study of 10 tools on five datasets that such a comparative evaluation of these tools is hindered by two types of circularity: they arise due to (1) the same variants or (2) different variants from the same protein occurring both in the datasets used for training and for evaluation of these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity-confounded tools are the most accurate among all tools, and may even outperform optimized combinations of tools.
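The two overlap conditions described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the `Variant` record, the function name `split_by_circularity`, and the idea of keying variants on protein, position, and amino acids are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical missense-variant record (field names are assumptions)."""
    protein: str   # protein identifier, e.g. an accession
    position: int  # residue position within the protein
    ref: str       # reference amino acid
    alt: str       # substituted amino acid


def split_by_circularity(
    training: List[Variant], evaluation: List[Variant]
) -> Tuple[List[Variant], List[Variant], List[Variant]]:
    """Partition evaluation variants by their overlap with the training set.

    Returns (same_variant, same_protein, independent):
      same_variant - the identical variant also appears in training (type 1)
      same_protein - a different variant from a training protein (type 2)
      independent  - neither the variant nor its protein appears in training
    """
    train_variants = set(training)
    train_proteins = {v.protein for v in training}

    same_variant, same_protein, independent = [], [], []
    for v in evaluation:
        if v in train_variants:
            same_variant.append(v)
        elif v.protein in train_proteins:
            same_protein.append(v)
        else:
            independent.append(v)
    return same_variant, same_protein, independent
```

Under these assumptions, a comparison that scores tools only on the `independent` subset would avoid both kinds of overlap, at the cost of a smaller evaluation set.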

CDG: An Online Server for Detecting Biologically Closest Disease-Causing Genes and its Application to Primary Immunodeficiency.

David Requena et al.
Frontiers in immunology
2018

High-throughput genomic technologies yield about 20,000 variants in the protein-coding exome of each individual. A commonly used approach to select candidate disease-causing variants is to test whether the associated gene has been previously reported to be disease-causing. In the absence of known disease-causing genes, it can be challenging to associate candidate genes with specific genetic diseases. To facilitate the discovery of novel gene-disease associations, we determined the putative biologically closest known genes and their associated diseases for 13,005 human genes not currently reported to be disease-associated. We used these data to construct the closest disease-causing genes (CDG) server, which can be used to infer the closest genes with an associated disease for a user-defined list of genes or diseases. We demonstrate the utility of the CDG server in five immunodeficiency patient exomes across different diseases and modes of inheritance, where CDG dramatically reduced the number of candidate genes to be evaluated. This resource will be a considerable asset for ascertaining the potential relevance of genetic variants found in patient exomes to specific diseases of interest. The CDG database and online server are freely available to non-commercial users at: http://lab.rockefeller.edu/casanova/CDG.

Insights into hominid evolution from the gorilla genome sequence.

Aylwyn Scally et al.
Nature
2012

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.

Distinct sequence features underlie microdeletions and gross deletions in the human genome.

Mengling Qi et al.
Human mutation
2022

Microdeletions and gross deletions are important causes (~20%) of human inherited disease and their genomic locations are strongly influenced by the local DNA sequence environment. This notwithstanding, no study has systematically examined their underlying generative mechanisms. Here, we obtained 42,098 pathogenic microdeletions and gross deletions from the Human Gene Mutation Database (HGMD) that together form a continuum of germline deletions ranging in size from 1 to 28,394,429 bp. We analyzed the DNA sequence within 1 kb of the breakpoint junctions and found that the frequencies of non-B DNA-forming repeats, GC-content, and the presence of seven of 78 specific sequence motifs in the vicinity of pathogenic deletions correlated with deletion length for deletions of length ≤30 bp. Further, we found that the presence of DR, GQ, and STR repeats is important for the formation of longer deletions (>30 bp) but not for the formation of shorter deletions (≤30 bp), while significantly (χ², p < 2E-16) more microhomologies were identified flanking short deletions than long deletions (length >30 bp). We provide evidence to support a functional distinction between microdeletions and gross deletions. Finally, we propose that a deletion length cut-off of 25-30 bp may serve as an objective means to functionally distinguish microdeletions from gross deletions.
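As a rough illustration of the length cut-off and the flanking microhomology mentioned in this abstract, the sketch below classifies a deletion by length and counts one common form of microhomology. It is not the authors' pipeline: the function names, the default cut-off of 30 bp (the abstract proposes 25-30 bp), and the microhomology definition (bases at the start of the deleted segment that are repeated immediately after it) are assumptions.

```python
def classify_deletion(length_bp: int, cutoff_bp: int = 30) -> str:
    """Label a deletion by length; the 30 bp default is an assumed cut-off."""
    if length_bp < 1:
        raise ValueError("deletion length must be at least 1 bp")
    return "microdeletion" if length_bp <= cutoff_bp else "gross deletion"


def microhomology_length(seq: str, start: int, length: int) -> int:
    """Count bases shared between the start of the deleted segment and the
    sequence immediately after it.

    seq    - reference sequence containing the deletion
    start  - 0-based index of the first deleted base
    length - number of deleted bases
    """
    end = start + length  # index of the first retained base after the deletion
    k = 0
    while k < length and end + k < len(seq) and seq[start + k] == seq[end + k]:
        k += 1
    return k
```

For example, with the made-up sequence `"TTACGGATACGCC"`, deleting six bases from index 2 removes `ACGGAT`; the retained sequence that follows begins `ACG`, so `microhomology_length("TTACGGATACGCC", 2, 6)` counts three matching bases.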

AVADA: toward automated pathogenic variant evidence retrieval directly from the full-text literature.

Johannes Birgmeier et al.
Genetics in medicine : official journal of the American College of Medical Genetics
2020

Both monogenic pathogenic variant cataloging and clinical patient diagnosis start with variant-level evidence retrieval followed by expert evidence integration in search of diagnostic variants and genes. Here, we try to accelerate pathogenic variant evidence retrieval by an automatic approach.

Prospects for the automated extraction of mutation data from the scientific literature.

Peter D Stenson et al.
Human genomics
2010

No abstract available

Deleterious- and disease-allele prevalence in healthy individuals: insights from current predictions, mutation databases, and population-scale resequencing.

Yali Xue et al.
American journal of human genetics
2012

We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions, 40-85 of which were homozygous, predicted to be highly damaging. They also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 variants in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ~400 damaging variants and ~2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.

Methylation-mediated deamination of 5-methylcytosine appears to give rise to mutations causing human inherited disease in CpNpG trinucleotides, as well as in CpG dinucleotides.

David N Cooper et al.
Human genomics
2010

The cytosine-guanine (CpG) dinucleotide has long been known to be a hotspot for pathological mutation in the human genome. This hypermutability is related to its role as the major site of cytosine methylation with the attendant risk of spontaneous deamination of 5-methylcytosine (5mC) to yield thymine. Cytosine methylation, however, also occurs in the context of CpNpG sites in the human genome, an unsurprising finding since the intrinsic symmetry of CpNpG renders it capable of supporting a semi-conservative model of replication of the methylation pattern. Recently, it has become clear that significant DNA methylation occurs in a CpHpG context (where H = A, C or T) in a variety of human somatic tissues. If we assume that CpHpG methylation also occurs in the germline, and that 5mC deamination can occur within a CpHpG context, then we might surmise that methylated CpHpG sites could also constitute mutation hotspots causing human genetic disease. To test this postulate, 54,625 missense and nonsense mutations from 2,113 genes causing inherited disease were retrieved from the Human Gene Mutation Database (http://www.hgmd.org). Some 18.2 per cent of these pathological lesions were found to be C → T and G → A transitions located in CpG dinucleotides (compatible with a model of methylation-mediated deamination of 5mC), an approximately ten-fold higher proportion than would have been expected by chance alone. The corresponding proportion for the CpHpG trinucleotide was 9.9 per cent, an approximately two-fold higher proportion than would have been expected by chance. We therefore estimate that ∼5 per cent of missense/nonsense mutations causing human inherited disease may be attributable to methylation-mediated deamination of 5mC within a CpHpG context.
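The sequence contexts discussed in this abstract can be made concrete with a small sketch that labels a C → T or G → A transition as falling in a CpG dinucleotide, a CpHpG trinucleotide, or neither. The function name and interface are assumptions for illustration, not the authors' method. A G → A change on the given strand is treated as a C → T change on the complementary strand, so its context is read from the preceding bases.

```python
def methylation_context(seq: str, pos: int, alt: str) -> str:
    """Classify a transition at 0-based index pos of seq.

    Returns "CpG", "CpHpG" (H = A, C or T on the strand carrying the
    cytosine), or "other". Only C>T and G>A transitions are classified.
    """
    seq = seq.upper()
    ref = seq[pos]

    if (ref, alt) == ("C", "T"):
        # Cytosine is on the given strand: inspect the next two bases.
        nxt1 = seq[pos + 1:pos + 2]
        nxt2 = seq[pos + 2:pos + 3]
        if nxt1 == "G":
            return "CpG"
        if nxt1 in ("A", "C", "T") and nxt2 == "G":
            return "CpHpG"
    elif (ref, alt) == ("G", "A"):
        # Cytosine is on the complementary strand: inspect the previous two
        # bases. H on that strand corresponds to A, G or T on this one.
        prev1 = seq[pos - 1:pos] if pos >= 1 else ""
        prev2 = seq[pos - 2:pos - 1] if pos >= 2 else ""
        if prev1 == "C":
            return "CpG"
        if prev1 in ("A", "G", "T") and prev2 == "C":
            return "CpHpG"
    return "other"
```

Counting how many mutations in a dataset fall into each label, and comparing those counts with the frequency of the same contexts in the underlying sequence, is the kind of tally that would stand behind the proportions quoted above.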

The Human Gene Mutation Database (HGMD®): optimizing its use in a clinical diagnostic or research setting.

Peter D Stenson et al.
Human genetics
2020

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that are thought to underlie, or are closely associated with, human inherited disease. At the time of writing (June 2020), the database contains in excess of 289,000 different gene lesions identified in over 11,100 genes manually curated from 72,987 articles published in over 3100 peer-reviewed journals. There are two main groups of users who utilise HGMD on a regular basis: research scientists and clinical diagnosticians. This review aims to highlight how to make the most out of HGMD data in each setting.

A massively parallel pipeline to clone DNA variants and examine molecular phenotypes of human disease mutations.

Xiaomu Wei et al.
PLoS genetics
2014

Understanding the functional relevance of DNA variants is essential for all exome and genome sequencing projects. However, current mutagenesis cloning protocols require Sanger sequencing, and thus are prohibitively costly and labor-intensive. We describe a massively-parallel site-directed mutagenesis approach, "Clone-seq", leveraging next-generation sequencing to rapidly and cost-effectively generate a large number of mutant alleles. Using Clone-seq, we further develop a comparative interactome-scanning pipeline integrating high-throughput GFP, yeast two-hybrid (Y2H), and mass spectrometry assays to systematically evaluate the functional impact of mutations on protein stability and interactions. We use this pipeline to show that disease mutations on protein-protein interaction interfaces are significantly more likely than those away from interfaces to disrupt corresponding interactions. We also find that mutation pairs with similar molecular phenotypes in terms of both protein stability and interactions are significantly more likely to cause the same disease than those with different molecular phenotypes, validating the in vivo biological relevance of our high-throughput GFP and Y2H assays, and indicating that both assays can be used to determine candidate disease mutations in the future. The general scheme of our experimental pipeline can be readily expanded to other types of interactome-mapping methods to comprehensively evaluate the functional relevance of all DNA variants, including those in non-coding regions.

Human genomics. Effect of predicted protein-truncating genetic variants on the human transcriptome.

Manuel A Rivas et al.
Science (New York, N.Y.)
2015

Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.

iRegNet3D: three-dimensional integrated regulatory network for the genomic analysis of coding and non-coding disease mutations.

Siqi Liang et al.
Genome biology
2017

The mechanistic details of most disease-causing mutations remain poorly explored within the context of regulatory networks. We present a high-resolution three-dimensional integrated regulatory network (iRegNet3D) in the form of a web tool, where we resolve the interfaces of all known transcription factor (TF)-TF, TF-DNA and chromatin-chromatin interactions for the analysis of both coding and non-coding disease-associated mutations to obtain mechanistic insights into their functional impact. Using iRegNet3D, we find that disease-associated mutations may perturb the regulatory network through diverse mechanisms including chromatin looping. iRegNet3D promises to be an indispensable tool in large-scale sequencing and disease association studies.

Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models.

Hashem A Shihab et al.
Human mutation
2013

The rate at which nonsynonymous single nucleotide polymorphisms (nsSNPs) are being identified in the human genome is increasing dramatically owing to advances in whole-genome/whole-exome sequencing technologies. Automated methods capable of accurately and reliably distinguishing between pathogenic and functionally neutral nsSNPs are therefore assuming ever-increasing importance. Here, we describe the Functional Analysis Through Hidden Markov Models (FATHMM) software and server: a species-independent method with optional species-specific weightings for the prediction of the functional effects of protein missense variants. Using a model weighted for human mutations, we obtained performance accuracies that outperformed traditional prediction methods (i.e., SIFT, PolyPhen, and PANTHER) on two separate benchmarks. Furthermore, in one benchmark, we achieve performance accuracies that outperform current state-of-the-art prediction methods (i.e., SNPs&GO and MutPred). We demonstrate that FATHMM can be efficiently applied to high-throughput/large-scale human and nonhuman genome sequencing projects with the added benefit of phenotypic outcome associations. To illustrate this, we evaluated nsSNPs in wheat (Triticum spp.) to identify some of the important genetic variants responsible for the phenotypic differences introduced by intense selection during domestication. A Web-based implementation of FATHMM, including a high-throughput batch facility and a downloadable standalone package, is available at http://fathmm.biocompute.org.uk.

CRAVAT: cancer-related analysis of variants toolkit.

Christopher Douville et al.
Bioinformatics (Oxford, England)
2013

Advances in sequencing technology have greatly reduced the costs incurred in collecting raw sequencing data. Academic laboratories and researchers therefore now have access to very large datasets of genomic alterations but limited time and computational resources to analyse their potential biological importance. Here, we provide a web-based application, Cancer-Related Analysis of Variants Toolkit, designed with an easy-to-use interface to facilitate the high-throughput assessment and prioritization of genes and missense alterations important for cancer tumorigenesis. Cancer-Related Analysis of Variants Toolkit provides predictive scores for germline variants, somatic mutations and relative gene importance, as well as annotations from published literature and databases. Results are emailed to users as MS Excel spreadsheets and/or tab-separated text files.

Identifying Mendelian disease genes with the variant effect scoring tool.

Hannah Carter et al.
BMC genomics
2013

Whole exome sequencing studies identify hundreds to thousands of rare protein coding variants of ambiguous significance for human health. Computational tools are needed to accelerate the identification of specific variants and genes that contribute to human disease.

Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes.

Hui Huang et al.
Genome biology
2004

Model organisms have contributed substantially to our understanding of the etiology of human disease as well as having assisted with the development of new treatment modalities. The availability of the human, mouse and, most recently, the rat genome sequences now permits the comprehensive investigation of the rodent orthologs of genes associated with human disease. Here, we investigate whether human disease genes differ significantly from their rodent orthologs with respect to their overall levels of conservation and their rates of evolutionary change.

The Human Gene Mutation Database: 2008 update.

Peter D Stenson et al.
Genome medicine
2009

The Human Gene Mutation Database (HGMD®) is a comprehensive core collection of germline mutations in nuclear genes that underlie or are associated with human inherited disease. Here, we summarize the history of the database and its current resources. By December 2008, the database contained over 85,000 different lesions detected in 3,253 different genes, with new entries currently accumulating at a rate exceeding 9,000 per annum. Although originally established for the scientific study of mutational mechanisms in human genes, HGMD has since acquired a much broader utility for researchers, physicians, clinicians and genetic counselors as well as for companies specializing in biopharmaceuticals, bioinformatics and personalized genomics. HGMD was first made publicly available in April 1996, and a collaboration was initiated in 2006 between HGMD and BIOBASE GmbH. This cooperative agreement covers the exclusive worldwide marketing of the most up-to-date (subscription) version of HGMD, HGMD Professional, to academic, clinical and commercial users.

Extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations.

Robert Fragoza et al.
Nature communications
2019

Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at >1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual's genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.

Genome-wide detection of human intronic AG-gain variants located between splicing branchpoints and canonical splice acceptor sites.

Peng Zhang et al.
Proceedings of the National Academy of Sciences of the United States of America
2023

Human genetic variants that introduce an AG into the intronic region between the branchpoint (BP) and the canonical splice acceptor site (ACC) of protein-coding genes can disrupt pre-mRNA splicing. Using our genome-wide BP database, we delineated the BP-ACC segments of all human introns and found extreme depletion of AG/YAG in the [BP+8, ACC-4] high-risk region. We developed AGAIN as a genome-wide computational approach to systematically and precisely pinpoint intronic AG-gain variants within the BP-ACC regions. AGAIN identified 350 AG-gain variants from the Human Gene Mutation Database, all of which alter splicing and cause disease. Among them, 74% created new acceptor sites, whereas 31% resulted in complete exon skipping. AGAIN also predicts the protein-level products resulting from these two consequences. We performed AGAIN on our exome/genomes database of patients with severe infectious diseases but without known genetic etiology and identified a private homozygous intronic AG-gain variant in the antimycobacterial gene SPPL2A in a patient with mycobacterial disease. AGAIN also predicts a retention of six intronic nucleotides that encode an in-frame stop codon, turning AG-gain into stop-gain. This allele was then confirmed experimentally to lead to loss of function by disrupting splicing. We further showed that AG-gain variants inside the high-risk region led to misspliced products, while those outside the region did not, by two case studies in genes STAT1 and IRF7. We finally evaluated AGAIN on our 14 paired exome-RNAseq samples and found that 82% of AG-gain variants in high-risk regions showed evidence of missplicing. AGAIN is publicly available from https://hgidsoft.rockefeller.edu/AGAIN and https://github.com/casanova-lab/AGAIN.

The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies.

Peter D Stenson et al.
Human genetics
2017

The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with, human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.
