The Capparis spinosa var. herbacea genome provides the first genomic instrument for a diversity and evolution study of the Capparaceae family.

GigaScience | 2022

The caper bush Capparis spinosa L., one of the most economically important species of Capparaceae, is a xerophytic shrub that is well adapted to drought and harsh environments. However, genetic studies on this species are limited because of the lack of its reference genome.

PubMed ID: 36310248

Research resources used in this publication

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Pfam (tool)

RRID:SCR_004726

A database of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). Users can analyze protein sequences for Pfam matches, view Pfam family annotation and alignments, see groups of related families, look at the domain organization of a protein sequence, find the domains on a PDB structure, and query Pfam by keywords. There are two components to Pfam: Pfam-A and Pfam-B. Pfam-A entries are high quality, manually curated families that may automatically generate a supplement using the ADDA database. These automatically generated entries are called Pfam-B. Although of lower quality, Pfam-B families can be useful for identifying functionally conserved regions when no Pfam-A entries are found. Pfam also generates higher-level groupings of related families, known as clans (collections of Pfam-A entries which are related by similarity of sequence, structure or profile-HMM).

SNAP (tool)

RRID:SCR_007936

A sequence analysis tool providing a simple but detailed analysis of human genes and their variations. For each gene, a gene-gene relationship network can be generated based on protein-protein interaction data, metabolic pathway connections and extended through phylogenetic relations. Snap provides tools for designing sequence primers and evaluating RNA splicing effects of single SNPs - known from the databases or defined by you. Primers can be designed for the amplification or sequencing of cDNA, genomic DNA, introns only or exons only.

Thermo Fisher Scientific (tool)

RRID:SCR_008452

Commercial vendor and service provider of laboratory reagents and antibodies. Supplier of scientific instrumentation, reagents and consumables, and software services.

DIAMOND (tool)

RRID:SCR_009457

Software to view DICOM files and assemble them into 3D volumes; view and convert between Analyze, NIfTI, and Interfile; classify and organize DICOM files and 3D volumes using metadata; and search and report on a collection of scans.

Infernal (tool)

RRID:SCR_011809

Software for searching DNA sequence databases for RNA structure and sequence similarities.

KEGG (tool)

RRID:SCR_012773

Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.

OrthoFinder (tool)

RRID:SCR_017118

Python software application for comparative genomics analysis. Finds orthogroups and orthologs, infers rooted gene trees for all orthogroups and identifies all gene duplication events in those gene trees, infers a rooted species tree for the species being analysed and maps gene duplication events from the gene trees to branches in the species tree, and improves orthogroup inference accuracy. Runs on a set of protein sequence files, one per species, in FASTA format.

tRNAscan-SE (tool)

RRID:SCR_008637

Web server to search for tRNA genes in genomic sequence. If you would like to run tRNAscan-SE locally, you can get the UNIX source code (gzip'd tar file).

BUSCO (software resource)

RRID:SCR_015008

Software tool to quantitatively measure genome assembly and annotation completeness based on evolutionarily informed expectations of gene content.

Circos (software resource)

RRID:SCR_011798

A software package for visualizing data and information. It visualizes data in a circular layout - this makes Circos ideal for exploring relationships between objects or positions.

BEDTools (software resource)

RRID:SCR_006646

A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
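As a rough illustration of the "intersect two interval files" task mentioned above, the following minimal Python sketch computes overlaps between two small in-memory interval lists. It is not bedtools code and does not use its file formats or options; the function name `intersect`, the example coordinates, and the 0-based half-open convention are assumptions made for this example.

```python
def intersect(a, b):
    """Return the overlapping portions of intervals in a and b.

    Each interval is a (chrom, start, end) tuple; coordinates are assumed
    to be 0-based and half-open for this illustration.
    """
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue  # intervals on different chromosomes never overlap
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # keep only non-empty overlaps
                overlaps.append((chrom_a, start, end))
    return overlaps


# Hypothetical example data
genes = [("chr1", 100, 200), ("chr1", 300, 400)]
repeats = [("chr1", 150, 350), ("chr2", 100, 200)]
shared = intersect(genes, repeats)
print(shared)
```

The nested loop is quadratic and only suitable for small inputs; it is meant to show the operation, not how a production tool implements it.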

ggplot2 (data processing software)

RRID:SCR_014601

Open source software package for statistical programming language R to create plots based on grammar of graphics. Used for data visualization to break up graphs into semantic components such as scales and layers.

jcvi (software resource)

RRID:SCR_021641

Software tool as collection of Python libraries to parse bioinformatics files, or perform computation related to assembly, annotation, and comparative genomics.

DIAMOND (software resource)

RRID:SCR_016071

Software that performs sequence alignment for protein and translated DNA searches and functions. Used for high performance analysis of big sequence data, protein-protein search, and DNA-protein search.

CAFE (software resource)

RRID:SCR_005983

R software package for the detection of gross chromosomal abnormalities from gene expression microarray data.

PAML (software resource)

RRID:SCR_014932

Package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood. PAML estimates parameters and tests hypotheses to study the evolutionary process from a phylogenetic tree.

TimeTree (database)

RRID:SCR_021162

Public knowledge base for information on evolutionary timescale of life. Data from thousands of published studies are assembled into searchable tree of life scaled to time.

IQ-TREE (software resource)

RRID:SCR_017254

Software tool as stochastic algorithm for estimating maximum likelihood phylogenies. Used for phylogenomic inference.

Gblocks (web application)

RRID:SCR_015945

Software that eliminates poorly aligned positions and divergent regions of a DNA or protein alignment so that it becomes more suitable for phylogenetic analysis.

MAFFT (software resource)

RRID:SCR_011811

Software package as multiple alignment program for amino acid or nucleotide sequences. Can align up to 500 sequences or maximum file size of 1 MB. First version of MAFFT used algorithm based on progressive alignment, in which sequences were clustered with help of Fast Fourier Transform. Subsequent versions have added other algorithms and modes of operation, including options for faster alignment of large numbers of sequences, higher accuracy alignments, alignment of non-coding RNA sequences, and addition of new sequences to existing alignments.

PANTHER (data analysis service)

RRID:SCR_004869

System that classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function even in absence of direct experimental evidence. Orthologs view is curated orthology relationships between genes for human, mouse, rat, fish, worm, and fly.

clusterProfiler (software resource)

RRID:SCR_016884

Software R package for statistical analysis and visualization of functional profiles for genes and gene clusters.

miRBase (data repository)

RRID:SCR_003152

Central online repository for microRNA nomenclature, sequence data, annotation and target prediction. Collection of published miRNA sequences and annotation.

Barrnap (software resource)

RRID:SCR_015995

THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software to predict the location of ribosomal RNA genes in genomes. It supports bacteria, archaea, mitochondria, and eukaryotes. It takes FASTA DNA sequence as input, writes GFF3 as output, and supports multithreading.

Rfam (data analysis service)

RRID:SCR_007891

The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs). The families in Rfam break down into three broad functional classes: non-coding RNA genes, structured cis-regulatory elements and self-splicing RNAs. These functional RNAs often have a conserved secondary structure which may be better preserved than the RNA sequence. The CMs used to describe each family are a slightly more complicated relative of the profile hidden Markov models (HMMs) used by Pfam. CMs can simultaneously model RNA sequence and structure in an elegant and accurate fashion. Rfam is also available via FTP. You can find data in Rfam in various ways:
* Analyze your RNA sequence for Rfam matches
* View Rfam family annotation and alignments
* View Rfam clan details
* Query Rfam by keywords
* Fetch families or sequences by NCBI taxonomy
* Enter any type of accession or ID to jump to the page for an Rfam family, sequence or genome

tRNAscan-SE (software resource)

RRID:SCR_010835

Web server to search for tRNA genes in genomic sequence. If you would like to run tRNAscan-SE locally, you can get the UNIX source code (gzip'd tar file).

EVidenceModeler (software resource)

RRID:SCR_014659

Software tool for automated eukaryotic gene structure annotation that reports eukaryotic gene structures as a weighted consensus of all available evidence. Used to combine ab initio gene predictions and protein and transcript alignments into weighted consensus gene structures. Inputs include genome sequence, gene predictions, and alignment data (in GFF3 format).

Trinity (software resource)

RRID:SCR_013048

Software for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data.

PASA (software resource)

RRID:SCR_014656

Gene structure annotation and analysis tool that uses spliced alignments of expressed transcript sequences to automatically model gene structures. It also incorporates gene structures based on transcript alignments into existing gene structure annotations. It is one component of a larger eukaryotic annotation pipeline implemented at the Broad Institute.

GeneMarkS-T (software resource)

RRID:SCR_017648

Software package for ab initio identification of protein coding regions in RNA transcripts. Algorithm parameters are estimated by unsupervised training, which makes manually curated preparation of training sets unnecessary. Sets of assembled eukaryotic transcripts can be analyzed by the modified GeneMarkS-T algorithm, which is part of the GeneMark gene prediction programs.

StringTie (software resource)

RRID:SCR_016323

Software application for assembling RNA-Seq alignments into potential transcripts. It enables improved reconstruction of a transcriptome from RNA-seq reads. This transcript assembling and quantification program is implemented in C++.

HISAT2 (software resource)

RRID:SCR_015530

Graph-based alignment of next generation sequencing reads to a population of genomes.

GeMoMa (software resource)

RRID:SCR_017646

Software tool as homology based gene prediction program that predicts gene models in target species based on gene models in evolutionary related reference species. Utilizes amino acid sequence conservation, intron position conservation, and RNA-seq data to accurately predict protein-coding transcripts. Supports combination of predictions based on several reference species allowing to transfer high quality annotation of different reference species to target species.

SNAP 3 (software resource)

RRID:SCR_009400

THIS RESOURCE IS NO LONGER IN SERVICE, documented September 29, 2016. Software program that can be used to generate SNP haplotype sequence data of unrelated individuals and nuclear families with a fixed or random number of children.

Augustus (software resource)

RRID:SCR_008417

Software for gene prediction in eukaryotic genomic sequences. Serves as a basis for further steps in the analysis of sequenced and assembled eukaryotic genomes.

MISA (software resource)

RRID:SCR_010765

Software tool that allows the identification and localization of perfect microsatellites as well as compound microsatellites which are interrupted by a certain number of bases.

RepeatMasker (software resource)

RRID:SCR_012954

Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).
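The masking step described above (annotated repeats replaced by Ns) can be pictured with a minimal Python sketch. This is only an illustration of hard masking given already-known repeat coordinates; it does not detect repeats and is not RepeatMasker's implementation. The function name `mask_repeats` and the 0-based half-open coordinates are assumptions made for this example.

```python
def mask_repeats(sequence, repeats):
    """Replace every base inside the annotated repeat intervals with 'N'.

    repeats is a list of (start, end) pairs, assumed 0-based and half-open.
    """
    masked = list(sequence)
    for start, end in repeats:
        # Clamp the interval to the sequence bounds before masking
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = "N"
    return "".join(masked)


# Hypothetical example: one repeat annotated at positions 2..4
masked_seq = mask_repeats("ACGTACGTAC", [(2, 5)])
print(masked_seq)
```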

SeqKit (software resource)

RRID:SCR_018926

Software tool as cross platform and ultrafast toolkit for FASTA/Q file manipulation.

LTRharvest (software resource)

RRID:SCR_018970

Software tool for de novo detection of full length LTR retrotransposons in large sequence sets. Delivers high quality annotations based on known LTR transposon features like length, distance, and sequence motifs.

Dfam (database)

RRID:SCR_021168

Open collection of Transposable Element DNA sequence alignments, hidden Markov Models, consensus sequences, and genome annotations. Dfam 3.2 provides early access to uncurated, de novo generated families.

Repbase (database)

RRID:SCR_021169

Database of prototypic sequences representing repetitive DNA from different eukaryotic species. Used in genome sequencing projects worldwide as a reference collection for masking and annotation of repetitive DNA.

LTR_retriever (data processing software)

RRID:SCR_017623

Software package for identification of long terminal repeat retrotransposons (LTR-RTs). Removes false positives from initial software predictions. Achieves very high specificity, accuracy, and precision without significantly sacrificing sensitivity, hence significantly outperforming existing methods. Can construct LTR libraries directly from self-corrected PacBio reads prior to genome assembly.

LACHESIS (data processing software)

RRID:SCR_017644

Software tool for chromosome scale scaffolding of de novo genome assemblies based on chromatin interactions. Method exploits signal of genomic proximity in Hi-C datasets for ultra long range scaffolding of de novo genome assemblies.

LTR_Finder (software resource)

RRID:SCR_015247

Web software capable of scanning large-scale sequences for full-length LTR retrotransposons.

CEGMA (data or information resource)

RRID:SCR_015055

THIS RESOURCE IS NO LONGER IN SERVICE, documented on January 19, 2022. Tool to annotate core genes in eukaryotic genomes (that was replaced by BUSCO). Its resulting core gene dataset can be used to train a gene finder or to assess the completeness of the genome or annotations.

purge dups (software resource)

RRID:SCR_021173

Software tool for identifying haplotypic duplication. Designed to remove haplotigs and contig overlaps in a de novo assembly based on read depth.

Hifiasm (data analysis software)

RRID:SCR_021069

Software tool as haplotype resolved de novo assembler for PacBio HiFi reads. Can assemble human genome in several hours. Introduces new graph binning algorithm and achieves haplotype resolved assembly given trio data. Takes advantage of long high fidelity sequence reads to represent haplotype information in phased assembly graph. Preserves contiguity of all haplotypes.

fastp (software resource)

RRID:SCR_016962

Software tool to provide fast all in one preprocessing for FastQ files. Developed in C++ with multithreading supported to afford high performance. Performs quality control, adapter trimming, quality filtering, per read quality pruning and many other operations with a single scan of the FASTQ data.

Illumina NovaSeq 6000 Sequencing System (instrument resource)

RRID:SCR_016387

System unleashes groundbreaking innovations that leverage our proven technology. Now you can get scalable throughput and flexibility for virtually any sequencing method, genome, and scale of project.

PacBio Sequel II System (instrument resource)

RRID:SCR_017990

Sequencer by Pacific Biosciences. System provides advantages of SMRT Sequencing and now makes it more affordable for all scientists to drive discovery with comprehensive views of genomes and transcriptomes. Generates 8 times more data than original Sequel System. Provides access to highly accurate long reads (HiFi reads). Reduces project time for faster results. Makes sequencing more affordable. Supports range of SMRT Sequencing applications.

FloMax (software resource)

RRID:SCR_014437

Software for acquisition and analysis of flow cytometry data. FloMax data analysis works with data from flow cytometers that support the flow cytometry file standard (FCS). FloMax operates on computers with Windows 95, 98, and 2000.

GenomeScope (software resource)

RRID:SCR_017014

Open source software package for fast genome analysis from unassembled short reads. Used to estimate genome heterozygosity, repeat content, and size from sequencing reads using a kmer-based statistical approach.

KMC (software resource)

RRID:SCR_001245

Software utility for counting k-mers (sequences of consecutive k symbols) in a set of reads from genome sequencing projects. It scans the raw reads and produces a compact representation of all non-unique reads accompanied by the number of their occurrences. The implemented algorithm relies mostly on disk space rather than RAM, which allows KMC to be used even on typical personal computers.
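To make the notion of a k-mer count concrete, here is a minimal in-memory Python sketch. It is not KMC's algorithm (which, per the description above, relies mainly on disk rather than RAM); the function name `count_kmers` and the example reads are assumptions made for this illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of k consecutive symbols) across all reads."""
    counts = Counter()
    for read in reads:
        # Slide a window of length k along the read
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Hypothetical example reads
reads = ["ACGTAC", "CGTACG"]
kmer_counts = count_kmers(reads, 3)
print(kmer_counts)
```

Holding all counts in a single in-memory dictionary is exactly what becomes impractical for real sequencing data sets, which is the motivation for disk-based approaches.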
