Publication

The genome trilogy of Anopheles stephensi, an urban malaria vector, reveals structure of a locus associated with adaptation to environmental heterogeneity.

Scientific reports | 2022

Anopheles stephensi is the most menacing malaria vector to watch for in newly urbanising parts of the world. Its fitness is reported to be a direct consequence of the vector adapting to laying eggs in over-head water tanks with street-side water puddles polluted by oil and sewage. Large frequent inversions in the genome of malaria vectors are implicated in adaptation. We report the genome assembly of a strain of An. stephensi of the type-form, collected from a construction site in Chennai (IndCh) in 2016. The genome reported here, with an L50 of 4, completes the trilogy of high-resolution genomes of strains with respect to a 16.5 Mbp 2Rb genotype in An. stephensi known to be associated with adaptation to environmental heterogeneity. Unlike the reported genomes of two other strains, STE2 (2R+b/2Rb) and UCI (2Rb/2Rb), IndCh is found to be homozygous for the standard form (2R+b/2R+b). Comparative genome analysis revealed base-level details of the breakpoints and allowed extraction of 22,650 segregating SNPs for typing this inversion in populations. Whole genome sequencing of 82 individual mosquitoes from diverse geographical locations reveals that one-third of both wild and laboratory populations maintain the heterozygous genotype of 2Rb. The large number of SNPs can be tailored to 1740 exonic SNPs, enabling genotyping directly from transcriptome sequencing. The genome trilogy approach accelerated the study of fine structure and typing of an important inversion in An. stephensi, putting the genome resources for this understudied species on par with the extensively studied malaria vector, Anopheles gambiae. We argue that the IndCh genome is relevant for field translation work compared to those reported earlier by showing that individuals from diverse geographical locations cluster with IndCh, pointing to significant convergence resulting from travel and commerce between cities, perhaps contributing to the survival of the fittest strain.
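The abstract quotes an L50 of 4 for the assembly. As a reminder of what that statistic measures, here is a minimal sketch in Python. L50 is the smallest number of contigs (or scaffolds), taken from longest to shortest, whose combined length reaches at least half of the total assembly length. The function name and the toy lengths are hypothetical and are not taken from the publication.

```python
def l50(contig_lengths):
    """Return the smallest number of contigs, taken longest first,
    whose combined length is at least half of the total length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return count
    return 0  # empty input: no contigs at all


# Hypothetical toy assembly (arbitrary units), not data from the paper.
toy_lengths = [90, 70, 50, 40, 20, 10, 5]
print(l50(toy_lengths))
```

A low L50 means that a handful of long sequences cover half the genome, so an L50 of 4 indicates a highly contiguous assembly.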

PubMed ID: 35246568

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

SAMTOOLS (tool)

RRID:SCR_002105

The original SAMTOOLS package has been split into three separate repositories: Samtools, BCFtools and HTSlib. Samtools manipulates next-generation sequencing data and is used for reading, writing, editing, indexing and viewing nucleotide alignments in SAM, BAM and CRAM formats. BCFtools is used for reading and writing BCF2, VCF and gVCF files and for calling, filtering and summarising SNP and short indel sequence variants. HTSlib is used for reading and writing high-throughput sequencing data.

SnpEff (tool)

RRID:SCR_005191

Genetic variant annotation and effect prediction software toolbox that annotates and predicts effects of variants on genes (such as amino acid changes). By using standards, such as VCF, SnpEff makes it easy to integrate with other programs.

Bioconductor (tool)

RRID:SCR_006442

Software project based on the R programming language that provides a repository of R packages for the analysis and comprehension of high-throughput genomic data. Packages are installed with a separate set of commands.

Augustus (tool)

RRID:SCR_008417

Software for gene prediction in eukaryotic genomic sequences. Serves as a basis for further steps in the analysis of sequenced and assembled eukaryotic genomes.

RepeatMasker (tool)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).
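To make "masked (default: replaced by Ns)" concrete, the following Python sketch replaces annotated intervals of a sequence with N characters. It only illustrates the idea of hard-masking. It is not RepeatMasker's implementation and does not call RepeatMasker. The function name, the 0-based half-open interval convention and the example sequence are all assumptions made for illustration.

```python
def hard_mask(sequence, intervals):
    """Replace every base inside the given intervals with 'N'.

    Intervals are (start, end) pairs, assumed here to be 0-based and
    half-open; parts that fall outside the sequence are ignored.
    """
    bases = list(sequence)
    for start, end in intervals:
        for i in range(max(start, 0), min(end, len(bases))):
            bases[i] = "N"
    return "".join(bases)


# Hypothetical query sequence with one annotated repeat at positions 2..4.
masked = hard_mask("ACGTACGTAC", [(2, 5)])
print(masked)
```

Because the masked sequence keeps its original length, downstream coordinates such as gene predictions still map onto the unmasked genome.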

New England Biolabs (tool)

RRID:SCR_013517

An antibody supplier

Pilon (tool)

RRID:SCR_014731

Software tool to automatically improve draft assemblies and find variation among strains, including large event detection. It takes as input a FASTA file of the genome along with one or more BAM files of aligned reads. Read alignment analysis is used to identify inconsistencies between the input genome and the evidence in the reads, after which the tool attempts to make improvements to the genome.

RepeatModeler (tool)

RRID:SCR_015027

Sequence analysis software that performs repeat family identification and creates models for sequence data. RepeatModeler utilizes RepeatScout and RECON to identify repeat element boundaries and family relationships.

Canu (tool)

RRID:SCR_015880

Software for scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Canu is a fork of the Celera Assembler and is designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

FALCON (tool)

RRID:SCR_018804

Web tool that provides a high-throughput protein structure prediction service.
