Putting hornets on the genomic map.

Scientific Reports | 2023

Hornets are the largest of the social wasps, and are important regulators of insect populations in their native ranges. Hornets are also very successful as invasive species, with often devastating economic, ecological and societal effects. Understanding why these wasps are such successful invaders is critical to managing future introductions and minimising impact on native biodiversity. Critical to the management toolkit is a comprehensive genomic resource for these insects. Here we provide the annotated genomes for two hornets, Vespa crabro and Vespa velutina. We compare their genomes with those of other social Hymenoptera, including the northern giant hornet Vespa mandarinia. The three hornet genomes show evidence of selection pressure on genes associated with reproduction, which might facilitate the transition into invasive ranges. Vespa crabro has experienced positive selection on the highest number of genes, including those putatively associated with molecular binding and olfactory systems. Caste-specific brain transcriptomic analysis also revealed 133 differentially expressed genes, some of which are associated with olfactory functions. This report provides a spring-board for advancing our understanding of the evolution and ecology of hornets, and opens up opportunities for using molecular methods in the future management of both native and invasive populations of these over-looked insects.

PubMed ID: 37085574

Associated grants

  • Agency: Wellcome Trust, United Kingdom
    Id: 206194
  • Agency: Wellcome Trust, United Kingdom
    Id: 218328

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


QUAST (tool)

RRID:SCR_001228

Quality assessment software tool for evaluating and comparing genome assemblies. It works both with and without a given reference genome. It produces many reports, summary tables and plots.

STAR (tool)

RRID:SCR_004463

Software performing alignment of high-throughput RNA-seq data. Aligns RNA-seq reads to reference genome using uncompressed suffix arrays.

GBIF - Global Biodiversity Information Facility (tool)

RRID:SCR_005904

The Global Biodiversity Information Facility (GBIF) was established by governments in 2001 to encourage free and open access to biodiversity data via the Internet. Through a global network of countries and organizations, GBIF promotes and facilitates the mobilization, access, discovery and use of information about the occurrence of organisms over time and across the planet. GBIF provides three core services and products:

  • An information infrastructure: an Internet-based index of a globally distributed network of interoperable databases that contain primary biodiversity data (information on museum specimens, field observations of plants and animals in nature, and results from experiments), so that data holders across the world can access and share them.
  • Community-developed tools, standards and protocols: the tools data providers need to format and share their data.
  • Capacity-building: the training, access to international experts and mentoring programs that national and regional institutions need to become part of a decentralized network of biodiversity information facilities.

GBIF and its many partners work to mobilize the data, and to improve search mechanisms, data and metadata standards, web services, and the other components of an Internet-based information infrastructure for biodiversity. GBIF makes available data that are shared by hundreds of data publishers from around the world. These data are shared according to the GBIF Data Use Agreement, which includes the provision that users of any data accessed through or retrieved via the GBIF Portal will always give credit to the original data publishers.

  • Explore Species: Find data for a species or other group of organisms. Information on species and other groups of plants, animals, fungi and micro-organisms, including species occurrence records, as well as classifications and scientific and common names.
  • Explore Countries: Find data on the species recorded in a particular country, territory or island. Information on the species recorded in each country, including records shared by publishers from throughout the GBIF network.
  • Explore Datasets: Find data from a data publisher, dataset or data network. Information on the data publishers, datasets and data networks that share data through GBIF, including summary information on 10028 datasets from 419 data publishers.

CD-HIT (tool)

RRID:SCR_007105

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software program for clustering biological sequences, with applications such as making non-redundant databases, finding duplicates, identifying protein families, filtering sequence errors and improving sequence assembly. It is very fast and can handle extremely large databases. CD-HIT significantly reduces the computational and manual effort in many sequence analysis tasks, and aids in understanding the data structure and correcting bias within a dataset. The CD-HIT package has CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, CD-HIT-OTU and over a dozen scripts.

  • CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold.
  • CD-HIT-2D (CD-HIT-EST-2D) compares 2 datasets and identifies the sequences in db2 that are similar to db1 above a threshold.
  • CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads.
  • CD-HIT-OTU clusters rRNA tags into OTUs.

The usage of other programs and scripts can be found in the CD-HIT user's guide. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute).
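
Clustering sequences against a user-defined similarity threshold, as the CD-HIT entry describes, can be pictured with a small greedy sketch in Python. This is an illustration only, not CD-HIT's code: the function names `identity` and `greedy_cluster` are hypothetical, and the identity measure is a crude ungapped, position-wise comparison standing in for a real alignment-based identity.

```python
def identity(a, b):
    """Ungapped, position-wise identity over the shorter sequence.

    A crude stand-in (assumption of this sketch) for alignment-based identity.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs, threshold=0.9):
    """Greedy incremental clustering, longest sequences first.

    Each sequence joins the first cluster whose representative (the first,
    longest member) meets the threshold; otherwise it founds a new cluster.
    """
    clusters = []  # each cluster is a list; element 0 is the representative
    for seq in sorted(seqs, key=len, reverse=True):
        for cluster in clusters:
            if identity(cluster[0], seq) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


if __name__ == "__main__":
    example = ["MKTAYIAKQR", "MKTAYIAKQ", "MKTAYLAKQR", "GGGGGGGG"]
    for members in greedy_cluster(example, threshold=0.8):
        print(members[0], len(members))
```

Raising the threshold splits near-identical sequences into separate clusters; lowering it merges them, which is the trade-off a user controls when building a non-redundant database.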

featureCounts (tool)

RRID:SCR_012919

A read summarization program, which counts mapped reads for the genomic features such as genes and exons.
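
Read summarization of this kind can be sketched in a few lines of Python. The sketch below is not featureCounts itself: the function name `count_reads`, the tuple-based input formats and the 0-based, half-open coordinates are all assumptions made for illustration, and strand, paired ends and mapping quality are ignored.

```python
def count_reads(features, reads):
    """Count mapped reads per genomic feature.

    `features` maps a feature name to (chrom, start, end); `reads` is a list
    of (chrom, start, end). Both formats are hypothetical.
    """
    counts = {name: 0 for name in features}
    for r_chrom, r_start, r_end in reads:
        # Collect every feature on the same chromosome that the read overlaps.
        hits = [
            name
            for name, (f_chrom, f_start, f_end) in features.items()
            if f_chrom == r_chrom and r_start < f_end and f_start < r_end
        ]
        # Simplifying choice of this sketch: only reads that overlap exactly
        # one feature are counted; ambiguous reads are skipped.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


if __name__ == "__main__":
    genes = {"geneA": ("chr1", 0, 100), "geneB": ("chr1", 90, 200)}
    mapped = [("chr1", 10, 60), ("chr1", 95, 145), ("chr1", 150, 200), ("chr2", 10, 60)]
    print(count_reads(genes, mapped))
```

The resulting per-feature counts are the kind of table that downstream differential expression tools take as input.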

RepeatMasker (tool)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).
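
The default masking behaviour mentioned in the RepeatMasker entry (annotated repeats replaced by Ns) can be illustrated with a minimal Python sketch. It is not RepeatMasker's implementation: the function name `mask_repeats` and the 0-based, half-open interval format are assumptions, and in practice the coordinates would come from the program's annotation output.

```python
def mask_repeats(sequence, repeats, mask_char="N"):
    """Return a copy of `sequence` with each repeat interval masked.

    `repeats` is an iterable of (start, end) pairs, 0-based and half-open
    (a hypothetical format chosen for this sketch).
    """
    masked = list(sequence)
    for start, end in repeats:
        # Clamp to the sequence bounds so out-of-range intervals are harmless.
        start = max(0, start)
        end = min(len(masked), end)
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    print(mask_repeats("ACGTACGTACGT", [(2, 5), (8, 10)]))  # ACNNNCGTNNGT
```

Masking with a fixed character keeps the sequence length and coordinates unchanged, so annotations made on the masked sequence still refer to the same positions in the original.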

topGO (tool)

RRID:SCR_014798

Software package which provides tools for testing GO terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.

PAML (tool)

RRID:SCR_014932

Package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. PAML estimates parameters and tests hypotheses to study the evolutionary process from a phylogenetic tree.

BUSCO (tool)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.

DESeq2 (tool)

RRID:SCR_015687

Software package for differential gene expression analysis based on the negative binomial distribution. Used for analyzing RNA-seq data for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates.
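
For orientation, the negative binomial model that this description refers to is commonly written as below. The symbols are notation introduced here rather than taken from this page: K_ij is the read count for gene i in sample j, s_j a sample-specific size factor, q_ij a quantity proportional to the gene's expression level in that sample, and alpha_i the gene-wise dispersion that the shrinkage estimation stabilizes.

```latex
K_{ij} \sim \mathrm{NB}(\mu_{ij}, \alpha_i), \qquad
\mu_{ij} = s_j \, q_{ij}, \qquad
\mathrm{Var}(K_{ij}) = \mu_{ij} + \alpha_i \, \mu_{ij}^{2}
```

The variance term shows why dispersion matters: when alpha_i is zero the model reduces to Poisson noise, and larger values allow for the extra variability typically seen between biological replicates.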

BBmap (tool)

RRID:SCR_016965

Short-read aligner for DNA and RNA-seq data. Used for large genomes with millions of scaffolds. Can align reads from Illumina, PacBio, 454, Sanger, Ion Torrent and Nanopore. Fast and accurate, particularly with highly mutated genomes or reads with long indels, even whole gene deletions over 100 kbp long. It has no upper limit on genome size or number of contigs. Written in Java, so it can run on any platform.

OrthoFinder (tool)

RRID:SCR_017118

Python application for comparative genomics analysis. Finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all gene duplication events in those gene trees, infers a rooted species tree for the species being analysed and maps gene duplication events from the gene trees to branches in the species tree, and improves orthogroup inference accuracy. Runs on a set of protein sequence files, one per species, in FASTA format.

prank (tool)

RRID:SCR_017228

Software package for multiple sequence alignment. Probabilistic multiple alignment program for DNA, codon and amino acid sequences. Uses phylogenetic information to distinguish alignment gaps caused by insertions from those caused by deletions and, thereafter, handles the two types of events differently.
