With the advent of high throughput genotyping technology and the information available via projects such as the human genome sequencing and the HapMap project, more and more data relevant to the study of genetics and disease risk will be produced. Systematic reviews and meta-analyses of human genome epidemiology studies rely on the ability to identify relevant studies and to obtain suitable data from these studies. A first port of call for most such reviews is a search of MEDLINE. We examined whether this could be usefully supplemented by identifying databases on the World Wide Web that contain genetic epidemiological information.
Direct-to-consumer (DTC) genetics services are increasingly popular, with tens of millions of customers. Several DTC genealogy services allow users to upload genetic data to search for relatives, identified as people with genomes that share identical by state (IBS) regions. Here, we describe methods by which an adversary can learn database genotypes by uploading multiple datasets. For example, an adversary who uploads approximately 900 genomes could recover at least one allele at SNP sites across up to 82% of the genome of a median person of European ancestries. In databases that detect IBS segments using unphased genotypes, approximately 100 falsified uploads can reveal enough genetic information to allow genome-wide genetic imputation. We provide a proof-of-concept demonstration in the GEDmatch database, and we suggest countermeasures that will prevent the exploits we describe.
Correct species identifications are of tremendous importance for invasion ecology, as mistakes could lead to misdirecting limited resources against harmless species or inaction against problematic ones. DNA barcoding is becoming a promising and reliable tool for species identifications; however, the efficacy of such molecular taxonomy depends on gene region(s) that provide a unique sequence to differentiate among species and on availability of reference sequences in existing genetic databases. Here, we assembled a list of aquatic and terrestrial non-indigenous species (NIS) and checked two leading genetic databases for corresponding sequences of six genome regions used for DNA barcoding. The genetic databases were checked in 2010, 2012, and 2016. All four aquatic kingdoms (Animalia, Chromista, Plantae and Protozoa) were initially equally represented in the genetic databases, with 64, 65, 69, and 61% of NIS included, respectively. Sequences for terrestrial NIS were present at rates of 58 and 78% for Animalia and Plantae, respectively. Six years later, the number of sequences for aquatic NIS increased to 75, 75, 74, and 63%, respectively, while those for terrestrial NIS increased to 74 and 88%, respectively. Genetic databases are marginally better populated with sequences of terrestrial NIS of plants compared to aquatic NIS and terrestrial NIS of animals. The rate at which sequences are added to databases is not equal among taxa. Though some groups of NIS (mostly aquatic ones) are not detectable at all based on available data, encouragingly, current availability of sequences of taxa with environmental and/or economic impact is relatively good and continues to increase with time.
Some organizations such as 23andMe and the UK Biobank have large genomic databases that they re-use for multiple different genome-wide association studies. Even research studies that compile smaller genomic databases often utilize these databases to investigate many related traits. It is common for the study to report a genetic risk score (GRS) model for each trait within the publication. Here, we show that under some circumstances, these GRS models can be used to recover the genetic variants of individuals in these genomic databases (a reconstruction attack). In particular, if two GRS models are trained by using a largely overlapping set of participants, it is often possible to determine the genotype for each of the individuals who were used to train one GRS model, but not the other. We demonstrate this theoretically and experimentally by analyzing the Cornell Dog Genome database. The accuracy of our reconstruction attack depends on how accurately we can estimate the rate of co-occurrence of pairs of single nucleotide polymorphisms within the private database, so if this aggregate information is ever released, it would drastically reduce the security of a private genomic database. Caution should be applied when using the same database for multiple analyses, especially when a small number of individuals are included or excluded from one part of the study.
Mitochondrial disorders are a group of rare diseases caused by nuclear or mitochondrial DNA mutations. Their marked clinical and genetic heterogeneity, as well as referral and ascertainment biases, render phenotype-based prevalence estimations difficult. Here we calculated the lifetime risk of all known autosomal recessive mitochondrial disorders on the basis of genetic data.
Arginase 1 Deficiency (ARG1-D) is a rare inherited metabolic disease with progressive, devastating neurological manifestations with early mortality and high unmet need. Information on prevalence is scarce and highly variable due to limited newborn screening (NBS) availability, variability of arginine levels in the first days of life, and high rates of misdiagnosis. US birth prevalence was recently estimated via indirect methods at 1.1 cases per million live births. Due to the autosomal recessive nature of ARG1-D we hypothesize that the global prevalence may be more accurately estimated using genetic population databases.
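The idea of estimating the prevalence of an autosomal recessive disease from a population database can be illustrated with a minimal sketch. It assumes Hardy-Weinberg equilibrium, which the abstract does not state, and the function name and the allele frequencies in the usage example are hypothetical placeholders, not values from any study.

```python
def recessive_prevalence(pathogenic_allele_freqs):
    """Expected birth prevalence of an autosomal recessive disease.

    Assumption (not from the abstract): Hardy-Weinberg equilibrium, so an
    affected individual carries two pathogenic alleles with probability q**2,
    where q is the summed frequency of all pathogenic alleles in the gene.
    """
    q = sum(pathogenic_allele_freqs)
    return q * q


def per_million(prevalence):
    """Convert a probability into cases per million live births."""
    return prevalence * 1_000_000


# Hypothetical allele frequencies, for illustration only.
example_freqs = [0.001, 0.0005]
estimate = recessive_prevalence(example_freqs)
print(f"{per_million(estimate):.2f} cases per million")
```

Summing the frequencies before squaring counts compound heterozygotes as well as homozygotes, which is why the sketch does not simply add the squares of the individual frequencies.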
Abiotic stresses extensively reduce agricultural crop production globally. Traditional breeding technology has been the fundamental approach used to cope with abiotic stresses. The development of gene editing technology for modifying genes responsible for the stresses and the related genetic networks has established the foundation for sustainable agriculture against environmental stress. Integrated approaches based on functional genomics and transcriptomics are now expanding the opportunities to elucidate the molecular mechanisms underlying abiotic stress responses. This review summarizes some of the features and weblinks of plant genome databases related to abiotic stress genes utilized for improving crops. The gene-editing tool based on clustered, regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9) has revolutionized stress tolerance research due to its simplicity, versatility, adaptability, flexibility, and broader applications. However, off-target and low cleavage efficiency hinder the successful application of CRISPR/Cas systems. Computational tools have been developed for designing highly competent gRNA with better cleavage efficiency. This powerful genome editing tool offers tremendous crop improvement opportunities, overcoming conventional breeding techniques' shortcomings. Furthermore, we also discuss the mechanistic insights of the CRISPR/Cas9-based genome editing technology. This review focused on the current advances in understanding plant species' abiotic stress response mechanism and applying the CRISPR/Cas system genome editing technology to develop crop resilience against drought, salinity, temperature, heavy metals, and herbicides.
Comprehensive knowledge of immunoglobulin genetics is required to advance our understanding of B cell biology. Validated immunoglobulin variable (V) gene databases are close to completion only for human and mouse. We present a novel computational approach, IgDiscover, that identifies germline V genes from expressed repertoires to a specificity of 100%. IgDiscover uses a cluster identification process to produce candidate sequences that, once filtered, result in individualized germline V gene databases. IgDiscover was tested in multiple species, validated by genomic cloning and cross library comparisons and produces comprehensive gene databases even where limited genomic sequence is available. IgDiscover analysis of the allelic content of the Indian and Chinese-origin rhesus macaques reveals high levels of immunoglobulin gene diversity in this species. Further, we describe a novel human IGHV3-21 allele and confirm significant gene differences between Balb/c and C57BL6 mouse strains, demonstrating the power of IgDiscover as a germline V gene discovery tool.
National and ethnic mutation databases (NEMDBs) are emerging online repositories, recording extensive information about the described genetic heterogeneity of an ethnic group or population. These resources facilitate the provision of genetic services and provide a comprehensive list of genomic variations among different populations. As such, they enhance awareness of the various genetic disorders. Here, we describe the features of the ETHNOS software, a simple but versatile tool based on a flat-file database that is specifically designed for the development and curation of NEMDBs. ETHNOS is a freely available software which runs more than half of the NEMDBs currently available. Given the emerging need for NEMDB in genetic testing services and the fact that ETHNOS is the only off-the-shelf software available for NEMDB development and curation, its adoption in subsequent NEMDB development would contribute towards data content uniformity, unlike the diverse contents and quality of the available gene (locus)-specific databases. Finally, we allude to the potential applications of NEMDBs, not only as worldwide central allele frequency repositories, but also, and most importantly, as data warehouses of individual-level genomic data, hence allowing for a comprehensive ethnicity-specific documentation of genomic variation.
Understanding genotype/phenotype relationships has become more complicated as increasing amounts of inter- and intra-tissue genetic heterogeneity have been revealed through next-generation sequencing, and as evidence has shown that factors such as epigenetic modifications, non-coding RNAs and RNA editing can play an important role in determining phenotype. Such findings have challenged a number of classic genetic assumptions, including (i) that analysis of genomic sequence obtained from blood is an accurate reflection of the genotype responsible for phenotype expression in an individual; (ii) that significant genetic alterations will be found only in diseased individuals, in germline tissues in inherited diseases, or in specific diseased tissues in somatic diseases such as cancer; and (iii) that mutation rates in putative disease-associated genes solely determine disease phenotypes. With the breakdown of our traditional understanding of genotype to phenotype relationships, it is becoming increasingly apparent that new analytical tools will be required to determine the relationship between genotype and phenotypic expression. To this end, we are proposing that next-generation genetic database (NGDB) platforms be created that include new bioinformatics tools based on algorithms that can evaluate genetic heterogeneity, as well as powerful systems biology analysis tools to actively process and evaluate the vast amounts of both genomic and genomic-modifying information required to reveal the true relationships between genotype and phenotype.
The identification of genes underlying human genetic disorders requires the combination of data related to cytogenetic localization, phenotypes and expression patterns, to generate a list of candidate genes. In the field of human genetics, it is normal to perform this combination analysis by hand. We report on GeneSeeker (http://www.cmbi.ru.nl/GeneSeeker/), a web server that gathers and combines data from a series of databases. All database searches are performed via the web interfaces provided with the original databases, guaranteeing that the most recent data are queried, and obviating data warehousing. GeneSeeker makes the same selection of candidate genes as human geneticists would have made, thus reducing the time-consuming process to a few minutes. GeneSeeker is particularly well suited for syndromes in which the disease gene displays altered expression patterns in the affected tissue(s).
Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including 'Feature Map', 'Genetic', 'Publication', 'Project', 'Contact' and the 'Natural Diversity' modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. DATABASE URL: http://tripal.info/.
Biallelic pathogenic variants in CBS gene cause the most common form of homocystinuria, the classical homocystinuria (HCU). The worldwide prevalence of HCU is estimated to be 0.82:100,000 [95% CI, 0.39-1.73:100,000] according to clinical records and 1.09:100,000 [95% CI, 0.34-3.55:100,000] by neonatal screening. In this study, we aimed to estimate the minimal worldwide incidence of HCU.
Von Willebrand disease (VWD) is a common bleeding disorder caused by mutations in the von Willebrand factor gene (VWF). The true global prevalence of VWD has not been accurately established. We estimated the worldwide and within-population prevalence of inherited VWD by analyzing exome and genome data of 141,456 individuals gathered by the genome Aggregation Database (gnomAD). We also extended our analysis by mining the main databases containing VWF variants, i.e., the Leiden Open Variation Database (LOVD) and the Human Gene Mutation Database (HGMD), with the goal of exploring the global mutational spectrum of VWD. A total of 4,313 VWF variants were identified in the gnomAD population, of which 505 were predicted to be pathogenic or already reported to be associated with VWD. Among the 282,912 alleles analyzed, 31,785 were affected by the aforementioned variants. The global prevalence of dominant VWD in 1000 individuals was established to be 74 for type 1, 3 for 2A, 3 for 2B and 6 for 2M. The global prevalences for recessive VWD forms (type 2N and type 3) were 0.31 and 0.7 in 1000 individuals, respectively. This comprehensive analysis provided a global mutational landscape of VWF by means of 927 already reported variants in the HGMD and LOVD datasets and 287 novel pathogenic variants identified in the gnomAD. Our results reveal that there is a considerably higher than expected prevalence of putative disease alleles and variants associated with VWD and suggest that a large number of VWD patients are undiagnosed.
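The allele counts quoted in this abstract can be checked with a short piece of arithmetic. The sketch below uses only the figures given above; the assumption that every individual contributes two alleles at the VWF locus (an autosomal gene) is ours, and the variable names are illustrative.

```python
# Figures taken from the abstract above.
individuals = 141_456
affected_alleles = 31_785

# Assumption: two alleles per individual at an autosomal locus.
total_alleles = 2 * individuals

# Fraction of analyzed alleles carrying one of the 505 flagged variants.
affected_fraction = affected_alleles / total_alleles

print(f"total alleles: {total_alleles:,}")
print(f"affected fraction: {affected_fraction:.4f}")
```

The doubled individual count reproduces the 282,912 alleles reported in the abstract, and roughly 11% of those alleles carry a flagged variant. This pooled fraction is not itself a prevalence figure; the per-type prevalences reported above depend on which variants are assigned to each VWD type.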
Antimicrobial resistance (AMR) is a rising health threat with 10 million annual casualties estimated by 2050. Appropriate treatment of infectious diseases with the right antibiotics reduces the spread of antibiotic resistance. Today, clinical practice relies on molecular and PCR techniques for pathogen identification and culture-based antibiotic susceptibility testing (AST). Recently, WGS has started to transform clinical microbiology, enabling prediction of resistance phenotypes from genotypes and allowing for more informed treatment decisions. WGS-based AST (WGS-AST) depends on the detection of AMR markers in sequenced isolates and therefore requires AMR reference databases. The completeness and quality of these databases are material to increase WGS-AST performance.
The use of Cannabis is gaining greater social acceptance for its beneficial medicinal and recreational uses. With this acceptance has come new opportunities for crop management, selective breeding, and the potential for targeted genetic manipulation. However, as an agricultural product Cannabis lags far behind other domesticated plants in knowledge of the genes and genetic variation that influence plant traits of interest such as growth form and chemical composition. Despite this lack of information, there are substantial publicly available resources that document phenotypic traits believed to be associated with particular Cannabis varieties. Such databases could be a valuable resource for developing a greater understanding of genes underlying phenotypic variation if combined with appropriate genetic information. To test this potential, we collated phenotypic data from information available through multiple online databases. We then produced a Cannabis SNP database from 845 strains to examine genome wide associations in conjunction with our assembled phenotypic traits. Our goal was not to locate Cannabis-specific genetic variation that correlates with phenotypic variation as such, but rather to examine the potential utility of these databases more broadly for future, explicit genome wide association studies (GWAS), either in stand-alone analyses or to complement other types of data. For this reason, we examined a very broad array of phenotypic traits. In total, we performed 201 distinct association tests using web-derived phenotype data appended to 290 uniquely named Cannabis strains. Our results indicated that chemical phenotypes, such as tetrahydrocannabinol (THC) and cannabidiol (CBD) content, may have sufficiently high-quality information available through web-based sources to allow for genetic association inferences. 
In many cases, variation in chemical traits correlated with genetic variation in or near biologically reasonable candidate genes, including several not previously implicated in Cannabis chemical variation. As with chemical phenotypes, we found that publicly available data on growth traits such as height, area of growth, and floral yield may be precise enough for use in future association studies. In contrast, phenotypic information for subjective traits such as taste, physiological effect, neurological effect, and medicinal use appeared less reliable. These results are consistent with the high degree of subjectivity for such trait data found on internet databases, and suggest that future work on these important but less easily quantifiable characteristics of Cannabis may require dedicated, controlled phenotyping.
Lynch syndrome is an autosomal dominant disease caused by germ line heterozygous mutations mainly involving the MSH2, MLH1 and MSH6 genes that belong to the DNA MisMatch Repair (MMR) genes family. The French network counting the 16 licensed laboratories involved in Lynch syndrome genetic testing developed three locus-specific databases with the UMD software (www.umd.be/MLH1/, www.umd.be/MSH2/ and www.umd.be/MSH6/) that presently contain a total of 7047 sequence variations including 707 distinct variations of a priori unknown functional significance (VUS) that were identified through complete mutation screening or targeted predictive testing. Mutation carriers are at high risk for developing early-onset colorectal and endometrial adenocarcinomas. Consensus clinical guidelines have been proposed, allowing the efficient detection of curable lesions. The major challenge of genetic testing is to reliably classify the genomic variations in those patients who seek genetic counseling. Combining the interactive tools of the software, the relevant published data and mainly original information produced by the French MisMatch Repair network, the UMD-MLH1/MSH2/MSH6 databases provide interpretation data for the 707 VUS that were classified according to the IARC 5-Class system. These public databases are regularly updated to improve the classification of all registered VUS, exploring their role in cancer pre-disposition based on structural and functional approaches.
Neurodegeneration with brain iron accumulation (NBIA) disorders are a group of clinically and genetically heterogeneous diseases characterized by iron overload in the basal ganglia and progressive neurodegeneration. Little is known about the epidemiology of NBIA disorders. In the absence of large-scale population-based studies, obtaining reliable epidemiological data requires innovative approaches.
More than 100,000 human genetic variations have been described in various genes that are associated with a wide variety of diseases. Such data provides invaluable information for both clinical medicine and basic science. A number of locus-specific databases have been developed to exploit this huge amount of data. However, the scope, format and content of these databases differ strongly and as no standard for variation databases has yet been adopted, the way data is presented varies enormously. This review aims to give an overview of current resources for human variation data in public and commercial resources.