Identification of structural variation in mouse genomes.

Frontiers in genetics | 2014

Structural variation is variation in structure of DNA regions affecting DNA sequence length and/or orientation. It generally includes deletions, insertions, copy-number gains, inversions, and transposable elements. Traditionally, the identification of structural variation in genomes has been challenging. However, with the recent advances in high-throughput DNA sequencing and paired-end mapping (PEM) methods, the ability to identify structural variation and their respective association to human diseases has improved considerably. In this review, we describe our current knowledge of structural variation in the mouse, one of the prime model systems for studying human diseases and mammalian biology. We further present the evolutionary implications of structural variation on transposable elements. We conclude with future directions on the study of structural variation in mouse genomes that will increase our understanding of molecular architecture and functional consequences of structural variation.

PubMed ID: 25071822

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


BREAKDANCER (tool)

RRID:SCR_001799

A Perl/C++ software package that provides genome-wide detection of structural variants from next generation paired-end sequencing reads. BreakDancerMax predicts five types of structural variants: insertions, deletions, inversions, inter- and intra-chromosomal translocations from next-generation short paired-end sequencing reads using read pairs that are mapped with unexpected separation distances or orientation. (entry from Genetic Analysis Software)
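The idea of using read pairs with unexpected separation or orientation can be illustrated with a minimal Python sketch. It is not BreakDancer's actual algorithm: the `ReadPair` type, the `classify_pair` function, the three-standard-deviation cutoff, and the assumption of a forward/reverse (FR) paired-end library are all hypothetical choices made for illustration, and the span is simplified to the distance between the two mapped start positions.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """One mapped read pair (hypothetical minimal representation)."""
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, mean_insert, sd_insert, n_sd=3.0):
    """Label a pair as concordant or as evidence for a type of variant.

    Assumes an FR library: in a concordant pair the leftmost read maps to
    the '+' strand and the rightmost read maps to the '-' strand.
    """
    if pair.chrom1 != pair.chrom2:
        return "inter-chromosomal translocation"

    # Order the two ends by mapped position.
    ends = sorted([(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)])
    (left_pos, left_strand), (right_pos, right_strand) = ends

    if left_strand == right_strand:
        return "inversion"
    if (left_strand, right_strand) == ("-", "+"):
        # Everted pair: orientation is unexpected for an FR library.
        return "unexpected orientation"

    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"   # ends map further apart than expected
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"  # ends map closer together than expected
    return "concordant"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, "+", "chr1", 5000, "-")
    print(classify_pair(pair, mean_insert=300, sd_insert=30))
```

A real caller would also cluster many discordant pairs supporting the same breakpoint before reporting a variant; a single pair is weak evidence on its own.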

DINDEL (tool)

RRID:SCR_001827

Software program for calling small indels from short-read sequence data ("next generation sequence data"). It is currently designed to handle only Illumina data. Dindel takes BAM files with mapped Illumina read data and enables researchers to detect small indels and produce a VCF file of all the variant calls. It has been written in C++ and can be used on Linux-based and Mac computers (it has not been tested on Windows operating systems).

European Bioinformatics Institute (tool)

RRID:SCR_004727

Non-profit academic organization for research and services in bioinformatics. Provides freely available data from life science experiments, performs basic research in computational biology, offers user training programmes, and manages databases of biological data including nucleic acid sequences, protein sequences, and macromolecular structures. Part of EMBL.

SVMerge (tool)

RRID:SCR_004777

Software pipeline to detect structural variants (SVs) by integrating calls from several existing SV callers, which are then validated and the breakpoints refined using local de novo assembly. The output is in BED format allowing for easy downstream analysis or viewing in a genome browser. It is modular and extensible allowing new callers to be incorporated as they become available.

RetroSeq (tool)

RRID:SCR_005133

A tool for discovery and genotyping of transposable element variants (TEVs) (also known as mobile element insertions) from next-gen sequencing reads aligned to a reference genome in BAM format. The goal is to call TEVs that are not present in the reference genome but present in the sample that has been sequenced. It should be noted that RetroSeq can be used to locate any class of viral insertion in any species where whole-genome sequencing data with a suitable reference genome is available. RetroSeq is a two-phase process. The first is the read pair discovery phase, in which discordant mate pairs are detected and assigned to a TE class (Alu, SINE, LINE, etc.) using the annotated TEs in the reference and/or alignment with Exonerate to the supplied library of viral sequences.

inGAP (tool)

RRID:SCR_005261

Software mining pipeline guided by a Bayesian principle to detect single nucleotide polymorphisms, insertions, and deletions by comparing high-throughput pyrosequencing reads with a reference genome of related organisms. The pipeline has been extended to identify and visualize large structural variations, including insertions, deletions, inversions, and translocations.

PEMer (tool)

RRID:SCR_005263

Software package providing a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. The package is composed of three modules: PEMer workflow, SV-Simulation, and BreakDB. PEMer workflow is a sensitive tool for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the resulting novel genome.

SPLITREAD (tool)

RRID:SCR_005264

Software for detecting INDELs (small insertions and deletions less than 50 bp in size) as well as large deletions within coding regions from exome sequencing data. It can also be applied to whole-genome sequencing data.

mrFAST (tool)

RRID:SCR_005487

Software designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner.

Currently supported features:
* Output in SAM format
* Indels up to 8 bp (4 bp deletions and 4 bp insertions)
* Paired-end mapping
** Discordant option to generate a mapping file ready for VariationHunter to detect structural variants
* One end anchored (OEA) map locations for novel sequence insertion detection with NovelSeq
* Matepair library mapping (long inserts with RF orientation)

Planned features:
* Multithreading

MoDIL (tool)

RRID:SCR_010764

Software for a novel method for finding medium sized indels from high throughput sequencing datasets.

CNVer (tool)

RRID:SCR_010820

A method for CNV detection that supplements the depth-of-coverage with paired-end mapping information, where matepairs mapping discordantly to the reference serve to indicate the presence of variation.

CNVnator (tool)

RRID:SCR_010821

An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing.

RDXplorer (tool)

RRID:SCR_013290

A computational tool for copy number variant (CNV) detection in whole human genome sequence data using read depth (RD) coverage.
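Read-depth CNV detection in general can be sketched as counting reads in fixed windows and flagging windows whose depth departs from a baseline. The Python below is a simplified illustration of that general idea, not RDXplorer's method: the function names, the median baseline, and the 1.5x and 0.5x thresholds are assumptions chosen for the example, and real tools additionally correct for GC content and mappability and merge adjacent windows into segments.

```python
from statistics import median


def window_depths(read_starts, genome_length, window):
    """Count reads per fixed-size window, using each read's start position."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for start in read_starts:
        if 0 <= start < genome_length:
            counts[start // window] += 1
    return counts


def call_cnv_windows(counts, gain=1.5, loss=0.5):
    """Flag windows whose depth ratio to the median suggests a gain or loss.

    Returns a list of (window_index, ratio, label) tuples. The thresholds
    are illustrative defaults, not values taken from any published tool.
    """
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot compute ratios")
    calls = []
    for index, count in enumerate(counts):
        ratio = count / baseline
        if ratio >= gain:
            calls.append((index, ratio, "gain"))
        elif ratio <= loss:
            calls.append((index, ratio, "loss"))
    return calls


if __name__ == "__main__":
    depths = [10, 10, 10, 20, 10, 4, 10]
    for index, ratio, label in call_cnv_windows(depths):
        print(f"window {index}: ratio {ratio:.2f} ({label})")
```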

CNV-seq (tool)

RRID:SCR_013357

A method for detecting DNA copy number variation (CNV) using high-throughput sequencing.

1000 Genomes Project and AWS (tool)

RRID:SCR_008801

A dataset containing the full genomic sequences of more than 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, which researchers can access on AWS free of charge for use in disease research. The data is available to all via Amazon S3 at: http://s3.amazonaws.com/1000genomes The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year.
Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the highly scalable compute resources needed to take advantage of these large data collections. AWS stores the public data sets at no charge to the community. Researchers pay only for the additional AWS resources they need for further processing or analysis of the data. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to work with this data without the usual capital investment required at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can process the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
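As an illustration of access via simple HTTP requests, the Python sketch below builds a bucket listing URL and parses the XML that S3 returns for a list request, using only the standard library. It assumes the bucket at the URL given above still permits anonymous listing, which may no longer be the case. The function names and the `example/` prefix are hypothetical, and `fetch_listing` is defined but not called so that nothing here touches the network unless you invoke it yourself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Bucket URL as given in the entry above.
BUCKET_URL = "http://s3.amazonaws.com/1000genomes"
# XML namespace used by S3 bucket listing responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def listing_url(prefix="", max_keys=10):
    """Build a list request URL restricted to keys under `prefix`."""
    query = urlencode({"prefix": prefix, "max-keys": max_keys})
    return f"{BUCKET_URL}?{query}"


def parse_listing(xml_text):
    """Extract (key, size_in_bytes) pairs from a bucket listing document."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.findall("s3:Contents", S3_NS):
        key = item.findtext("s3:Key", namespaces=S3_NS)
        size = int(item.findtext("s3:Size", namespaces=S3_NS))
        entries.append((key, size))
    return entries


def fetch_listing(prefix="", max_keys=10):
    """Perform the HTTP request (assumes anonymous listing is allowed)."""
    with urlopen(listing_url(prefix, max_keys)) as response:
        return parse_listing(response.read().decode("utf-8"))
```

For anything beyond a quick look, the AWS SDKs mentioned above handle pagination, retries, and authentication, which this sketch does not.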

Illumina (tool)

RRID:SCR_010233

American company that develops, manufactures and markets integrated systems for the analysis of genetic variation and biological function. Provides a line of products and services that serve the sequencing, genotyping, gene expression and proteomics markets. Its headquarters are located in San Diego, California.

C3H/HeJ (tool)

RRID:IMSR_JAX:000659

Mus musculus with name C3H/HeJ from IMSR.

DBA/2J (tool)

RRID:IMSR_JAX:000671

Mus musculus with name DBA/2J from IMSR.

AKR/J (tool)

RRID:IMSR_JAX:000648

Mus musculus with name AKR/J from IMSR.

BALB/cJ (tool)

RRID:IMSR_JAX:000651

Mus musculus with name BALB/cJ from IMSR.

CBA/J (tool)

RRID:IMSR_JAX:000656

Mus musculus with name CBA/J from IMSR.

C57BL/6J (tool)

RRID:IMSR_JAX:000664

Mus musculus with name C57BL/6J from IMSR.
