Publication

Diverse monogenic subforms of human spermatogenic failure.

Nature communications | 2022

Non-obstructive azoospermia (NOA) is the most severe form of male infertility and typically incurable. Defining the genetic basis of NOA has proven challenging, and the most advanced classification of NOA subforms is not based on genetics, but simple description of testis histology. In this study, we exome-sequenced over 1000 clinically diagnosed NOA cases and identified a plausible recessive Mendelian cause in 20%. We find further support for 21 genes in a 2-stage burden test with 2072 cases and 11,587 fertile controls. The disrupted genes are primarily on the autosomes, enriched for undescribed human "knockouts", and, for the most part, have yet to be linked to a Mendelian trait. Integration with single-cell RNA sequencing data shows that azoospermia genes can be grouped into molecular subforms with synchronized expression patterns, and analogs of these subforms exist in mice. This analysis framework identifies groups of genes with known roles in spermatogenesis but also reveals unrecognized subforms, such as a set of genes expressed across mitotic divisions of differentiating spermatogonia. Our findings highlight NOA as an understudied Mendelian disorder and provide a conceptual structure for organizing the complex genetics of male infertility, which may provide a rational basis for disease classification.
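
The abstract does not say which statistic the 2-stage burden test uses. As a rough illustration of the general idea only, the sketch below runs a one-sided Fisher exact test on a single gene, comparing how often a qualifying genotype appears in cases versus controls. The cohort sizes (2072 cases, 11,587 controls) come from the abstract; the function name and the carrier counts in the usage line are hypothetical and are not results from the study.

```python
from math import comb

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value for enrichment of carriers among cases.

    Computes P(X >= case_carriers) under the hypergeometric null, where X is
    the number of carriers that fall among the cases when the total number of
    carriers is held fixed.
    """
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    denom = comb(total, carriers)
    upper = min(carriers, n_cases)
    numer = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numer / denom

# Hypothetical counts for one gene: 4 carriers among cases, 1 among controls.
# Cohort sizes are taken from the abstract; the carrier counts are made up.
p_value = fisher_one_sided(4, 2072, 1, 11587)
print(f"one-sided p = {p_value:.3g}")
```

A per-gene test like this would normally be followed by multiple-testing correction across genes; that step is omitted here.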

Pubmed ID: 36572685 RIS Download

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

Agency: NICHD NIH HHS, United States
Id: R01 HD078641
Agency: NICHD NIH HHS, United States
Id: P50 HD096723
Agency: Wellcome Trust, United Kingdom
Id: 209451/Z/17/Z

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

Human Gene Mutation Database (tool)

RRID:SCR_001621

Curated database of known (published) gene lesions responsible for human inherited disease.

GATK (tool)

RRID:SCR_001876

A software package for analyzing next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping and a strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. GATK is, first, a software library that makes it easy to write efficient analysis tools for next-generation sequencing data and, second, a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner. (entry from Genetic Analysis Software)

Cytoscape (tool)

RRID:SCR_003032

Software platform for complex network analysis and visualization. Used for visualization of molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data.

STRING (tool)

RRID:SCR_005223

Database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations and are derived from four sources: Genomic Context, High-throughput experiments, (Conserved) Coexpression, and previous knowledge. STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable. The database currently covers 5,214,234 proteins from 1133 organisms. (2013)

Bowtie (tool)

RRID:SCR_005476

Ultrafast, memory-efficient software tool for aligning sequencing reads. Bowtie is a short-read aligner.

International Mouse Phenotyping Consortium (IMPC) (tool)

RRID:SCR_006158

Center that produces knockout mice and carries out high-throughput phenotyping of each line in order to determine function of every gene in mouse genome. These mice will be preserved in repositories and made available to scientific community representing valuable resource for basic scientific research as well as generating new models for human diseases.

ClinVar (tool)

RRID:SCR_006169

Archive of aggregated information about sequence variation and its relationship to human health. Provides reports of relationships among human variations and phenotypes along with supporting evidence. Submissions from clinical testing labs, research labs, locus-specific databases, expert panels and professional societies are welcome. Collects reports of variants found in patient samples, assertions made regarding their clinical significance, information about submitter, and other supporting data. Alleles described in submissions are mapped to reference sequences, and reported according to HGVS standard.

Mouse Genome Informatics (MGI) (tool)

RRID:SCR_006460

International database for laboratory mouse. Data offered by The Jackson Laboratory includes information on integrated genetic, genomic, and biological data. MGI creates and maintains integrated representation of mouse genetic, genomic, expression, and phenotype data and develops reference data set and consensus data views, synthesizes comparative genomic data between mouse and other mammals, maintains set of links and collaborations with other bioinformatics resources, develops and supports analysis and data submission tools, and provides technical support for database users. Projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, and MouseCyc Project at MGI.

Picard (tool)

RRID:SCR_006525

Java toolset for working with next generation sequencing data in the BAM format.

Variant Effect Predictor (tool)

RRID:SCR_007931

Data analysis service to predict the functional consequences of known and unknown variants.

QIAGEN (tool)

RRID:SCR_008539

A commercial organization which provides assay technologies to isolate DNA, RNA, and proteins from any biological sample. Assay technologies are then used to make specific target biomolecules, such as the DNA of a specific virus, visible for subsequent analysis.

1000 Genomes Project and AWS (tool)

RRID:SCR_008801

A dataset containing the full genomic sequences of more than 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort, coordinated by a consortium of 75 companies and organizations, to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, all of which is available in a public Amazon S3 bucket at http://s3.amazonaws.com/1000genomes. The project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the collection this year. Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). AWS stores the public data sets at no charge to the community; researchers pay only for the additional AWS resources they need for further processing or analysis. The data can be accessed via simple HTTP requests, through the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP, or from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide scalable compute resources without the usual capital investment required to work with data at this scale.
AWS also provides orchestration and automation services to help teams make their research available to others to remix and reuse. Because the data sits in Amazon S3, users can also process it with Hadoop via Amazon Elastic MapReduce and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
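
As a minimal sketch of the "simple HTTP requests" route, the snippet below builds a listing request against the bucket URL given above and extracts object keys from the XML that S3 returns for a bucket listing. It assumes the bucket still allows anonymous listing at that address; the helper names and the `release/` prefix are hypothetical, chosen only for illustration. Only the standard library is used, and no request is sent until `list_keys` is called.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Bucket URL as given in the entry above.
BUCKET_URL = "http://s3.amazonaws.com/1000genomes"
# XML namespace used by S3 bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

def listing_url(prefix="", max_keys=10):
    """Build a bucket-listing URL using S3's 'prefix' and 'max-keys' parameters."""
    return BUCKET_URL + "?" + urlencode({"prefix": prefix, "max-keys": max_keys})

def parse_keys(xml_text):
    """Extract object keys from an S3 ListBucketResult XML document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(S3_NS + "Key")]

def list_keys(prefix="", max_keys=10):
    """Fetch one page of object keys over plain HTTP (needs network access)."""
    with urlopen(listing_url(prefix, max_keys)) as resp:
        return parse_keys(resp.read())

# The prefix here is a hypothetical example, not a documented path.
print(listing_url("release/", 5))
```

For anything beyond browsing a few keys, one of the AWS SDKs mentioned above is likely the more practical choice, since it handles pagination and retries.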

New England Biolabs (tool)

RRID:SCR_013517

An antibody supplier.

Phaser (tool)

RRID:SCR_014219

Crystallographic software which solves structures using algorithms and automated rapid search calculations to perform molecular replacement and experimental phasing methods.

GEMINI (tool)

RRID:SCR_014819

Framework for exploring genetic variation in the context of the genome annotations available for the human genome. Users can load a VCF file into a database, and each variant is automatically annotated by comparing it to several genome annotations from sources such as ENCODE tracks, UCSC tracks, OMIM, dbSNP, KEGG, and HPRD.

Genome Aggregation Database (tool)

RRID:SCR_014964

Database that aggregates exome and genome sequencing data from large-scale sequencing projects. The gnomAD data set contains individuals sequenced using multiple exome capture methods and sequencing chemistries. Raw data from the projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.

clusterProfiler (tool)

RRID:SCR_016884

Software R package for statistical analysis and visualization of functional profiles for genes and gene clusters.

About

The SciCrunch Infrastructure was developed as a cooperative data platform to be used by diverse communities in making data more FAIR.

Contact Us

FAIR Data Informatics Lab

University of California, San Diego

9500 Gilman Drive, Mail Code 0608

La Jolla, CA 92093-0608

United States

info@scicrunch.org
