Publication

Rare variant phasing and haplotypic expression from RNA sequencing with phASER.

Stephane E Castel | Pejman Mohammadi | Wendy K Chung | Yufeng Shen | Tuuli Lappalainen

Nature communications | 2016

Haplotype phasing of genetic variants is important for clinical interpretation of the genome, population genetic analysis and functional genomic analysis of allelic activity. Here we present phASER, an accurate approach for phasing variants that are overlapped by sequencing reads, including those from RNA sequencing (RNA-seq), which often span multiple exons due to splicing. Using diverse RNA-seq data we demonstrate that this provides more accurate phasing of rare variants compared with population-based phasing and allows phasing of variants in the same gene up to hundreds of kilobases away that cannot be obtained from DNA sequencing (DNA-seq) reads. We show that in the context of medical genetic studies this improves the resolution of compound heterozygotes. Additionally, phASER provides measures of haplotypic expression that increase power and accuracy in studies of allelic expression. In summary, phasing using RNA-seq and phASER is accurate and improves studies where rare variant haplotypes or allelic expression is needed.
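
The idea of phasing variants that are overlapped by the same sequencing read can be illustrated with a small sketch. This is not phASER's implementation; it is a simplified majority-vote illustration under stated assumptions: each read is reduced to the pair of alleles it carries at two heterozygous sites, and the function name `phase_pair` and the example alleles are hypothetical. A "trans" result corresponds to the compound heterozygote case mentioned above, where the two alternative alleles sit on opposite haplotypes.

```python
from collections import Counter


def phase_pair(reads, ref_a, alt_a, ref_b, alt_b):
    """Decide whether two heterozygous variants are in cis or trans.

    reads: iterable of (allele_at_a, allele_at_b) tuples, one per read
    (or read pair) that overlaps both variant sites.
    Returns (configuration, supporting_reads, conflicting_reads).
    """
    counts = Counter(reads)
    # cis: the two alternative alleles share a haplotype, so reads
    # carry either ref/ref or alt/alt.
    cis = counts[(ref_a, ref_b)] + counts[(alt_a, alt_b)]
    # trans: the alternative alleles lie on opposite haplotypes, so
    # reads carry ref/alt or alt/ref.
    trans = counts[(ref_a, alt_b)] + counts[(alt_a, ref_b)]
    if cis == trans:
        return ("unresolved", cis, trans)
    if cis > trans:
        return ("cis", cis, trans)
    return ("trans", trans, cis)


# Hypothetical reads spanning both sites (for example across a splice junction).
reads = [("A", "C"), ("A", "C"), ("G", "T"), ("G", "T"), ("G", "T"), ("A", "T")]
result = phase_pair(reads, ref_a="A", alt_a="G", ref_b="C", alt_b="T")
print(result)
```

A real phasing tool would also need to weigh base quality, mapping quality and sequencing error rather than rely on a simple majority of reads.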

PubMed ID: 27605262

Research resources used in this publication

None found

Additional research tools detected in this publication

Antibodies used in this publication

None found

Associated grants

Agency: NIDA NIH HHS, United States | Id: R01 DA006227
Agency: NICHD NIH HHS, United States | Id: R01 HD057036
Agency: NIMH NIH HHS, United States | Id: R01 MH101782
Agency: NIMH NIH HHS, United States | Id: R01 MH101810
Agency: NIMH NIH HHS, United States | Id: R01 MH101819
Agency: NIMH NIH HHS, United States | Id: R01 MH090936
Agency: NIMH NIH HHS, United States | Id: R01 MH090951
Agency: NIMH NIH HHS, United States | Id: R01 MH101820
Agency: NIDDK NIH HHS, United States | Id: P30 DK026687
Agency: NIMH NIH HHS, United States | Id: R01 MH101822
Agency: NCRR NIH HHS, United States | Id: UL1 RR024156
Agency: NIDA NIH HHS, United States | Id: R01 DA033684
Agency: NIMH NIH HHS, United States | Id: R01 MH106842
Agency: NIMH NIH HHS, United States | Id: R01 MH101825
Agency: NIMH NIH HHS, United States | Id: R01 MH090948
Agency: NIMH NIH HHS, United States | Id: R01 MH090941
Agency: NIGMS NIH HHS, United States | Id: R01 GM122924
Agency: CCR NIH HHS, United States | Id: HHSN261200800001C
Agency: NIMH NIH HHS, United States | Id: R01 MH090937
Agency: NHLBI NIH HHS, United States | Id: HHSN268201000029C
Agency: NCI NIH HHS, United States | Id: HHSN261200800001E
Agency: NIMH NIH HHS, United States | Id: R01 MH101814

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.

1000 Genomes: A Deep Catalog of Human Genetic Variation (tool)

RRID:SCR_006828

International collaboration producing an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts, in an effort to provide a foundation for investigating the relationship between genotype and phenotype. The genomes of about 2500 unidentified people from about 25 populations around the world were sequenced using next-generation sequencing technologies. Redundant sequencing on various platforms and by different groups of scientists of the same samples can be compared. The results of the study are freely and publicly accessible to researchers worldwide. The consortium identified the following populations whose DNA will be sequenced: Yoruba in Ibadan, Nigeria; Japanese in Tokyo; Chinese in Beijing; Utah residents with ancestry from northern and western Europe; Luhya in Webuye, Kenya; Maasai in Kinyawa, Kenya; Toscani in Italy; Gujarati Indians in Houston; Chinese in metropolitan Denver; people of Mexican ancestry in Los Angeles; and people of African ancestry in the southwestern United States. The goal of the Project is to find most genetic variants that have frequencies of at least 1% in the populations studied. Sequencing is still too expensive to deeply sequence the many samples being studied for this project. However, any particular region of the genome generally contains a limited number of haplotypes. Data can be combined across many samples to allow efficient detection of most of the variants in a region. The Project currently plans to sequence each sample to about 4X coverage; at this depth sequencing cannot provide the complete genotype of each sample, but should allow the detection of most variants with frequencies as low as 1%. Combining the data from 2500 samples should allow highly accurate estimation (imputation) of the variants and genotypes for each sample that were not seen directly by the light sequencing.
All samples from the 1000 Genomes Project are available as lymphoblastoid cell lines (LCLs) and LCL-derived DNA from the Coriell Cell Repository as part of the NHGRI Catalog. The sequence and alignment data generated by the 1000 Genomes Project is made available as quickly as possible via their mirrored FTP sites: ftp://ftp.1000genomes.ebi.ac.uk ftp://ftp-trace.ncbi.nlm.nih.gov/1000genomes

NumPy (tool)

RRID:SCR_008633

NumPy is the fundamental package needed for scientific computing with Python. It contains, among other things: a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; and useful linear algebra, Fourier transform, and random number capabilities. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases. Sponsored by ENTHOUGHT

GATK (tool)

RRID:SCR_001876

A software package to analyze next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. First, it is a software library that makes writing efficient analysis tools using next-generation sequencing data very easy; second, it is a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include a depth-of-coverage analyzer, a quality score recalibrator, a SNP/indel caller and a local realigner. (entry from Genetic Analysis Software)

GitHub (tool)

RRID:SCR_002630

A web-based hosting service for software development projects that use the Git revision control system, offering powerful collaboration, code review, and code management. It offers both paid plans for private repositories, and free accounts for open source projects. Large or small, every repository comes with the same powerful tools. These tools are open to the community for public projects and secure for private projects. Features include: integrated issue tracking; collaborative code review; easy management of teams within organizations; text entry with understated power; a growing list of programming languages and data formats; and, on the desktop and in your pocket, an Android app and mobile web views that let you keep track of your projects on the go.

Systems Transcriptional Activity Reconstruction (tool)

RRID:SCR_005622

A next-generation web-based application that aims to provide an integrated solution for both visualization and analysis of deep-sequencing data, along with simple access to public datasets.

BEDTools (tool)

RRID:SCR_006646

A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
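
As a rough illustration of what intersecting two interval files computes, the following pure-Python sketch finds the overlapping segments between two small interval lists. It is not BEDTools code and does not use its API; the function name `intersect` and the example intervals are hypothetical, coordinates are assumed to be 0-based and half-open as in BED files, and the nested loop is chosen for clarity rather than speed.

```python
def intersect(a, b):
    """Return the overlapping segments between two interval lists.

    Each interval is a (chrom, start, end) tuple with 0-based,
    half-open coordinates. Every pair is compared, which is simple
    but quadratic in the number of intervals.
    """
    overlaps = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            # Half-open intervals that merely touch do not overlap.
            if start < end:
                overlaps.append((chrom_a, start, end))
    return overlaps


# Hypothetical example intervals.
genes = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 50, 150)]
peaks = [("chr1", 150, 350), ("chr2", 150, 250)]
shared = intersect(genes, peaks)
print(shared)
```

The chr2 intervals in this example are adjacent rather than overlapping, so they contribute nothing to the result.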

SciPy (tool)

RRID:SCR_008058

A Python-based environment of open-source software for mathematics, science, and engineering. The core packages of SciPy include: NumPy, a base N-dimensional array package; SciPy Library, a fundamental library for scientific computing; and IPython, an enhanced interactive console.

HapCUT (tool)

RRID:SCR_010791

A max-cut based algorithm for haplotype assembly using sequence reads from the two chromosomes of an individual.

Phaser (tool)

RRID:SCR_014219

Crystallographic software which solves structures using algorithms and automated rapid search calculations to perform molecular replacement and experimental phasing methods.
