High-throughput annotation of full-length long noncoding RNAs with capture long-read sequencing.

Nature genetics | 2017

Accurate annotation of genes and their transcripts is a foundation of genomics, but currently no annotation technique combines throughput and accuracy. As a result, reference gene collections remain incomplete: many gene models are fragmentary, and thousands more remain uncataloged, particularly for long noncoding RNAs (lncRNAs). To accelerate lncRNA annotation, the GENCODE consortium has developed RNA Capture Long Seq (CLS), which combines targeted RNA capture with third-generation long-read sequencing. Here we present an experimental reannotation of the GENCODE intergenic lncRNA populations in matched human and mouse tissues that resulted in novel transcript models for 3,574 and 561 gene loci, respectively. CLS approximately doubled the annotated complexity of targeted loci, outperforming existing short-read techniques. Full-length transcript models produced by CLS enabled us to definitively characterize the genomic features of lncRNAs, including promoter and gene structure, and protein-coding potential. Thus, CLS removes a long-standing bottleneck in transcriptome annotation and generates manual-quality full-length transcript models at high-throughput scales.

PubMed ID: 29106417

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

  • Agency: NIMH NIH HHS, United States
    Id: R01 MH101814
  • Agency: NHGRI NIH HHS, United States
    Id: U41 HG007000
  • Agency: NHGRI NIH HHS, United States
    Id: U41 HG007234
  • Agency: NHGRI NIH HHS, United States
    Id: U54 HG007004

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


Illumina (tool)

RRID:SCR_010233

American company that develops, manufactures, and markets integrated systems for the analysis of genetic variation and biological function. It provides products and services for the sequencing, genotyping, gene expression, and proteomics markets. Its headquarters are in San Diego, California.

GENCODE (tool)

RRID:SCR_014966

Human and mouse genome annotation project that aims to identify all gene features in the human and mouse genomes using computational analysis, manual annotation, and experimental validation.

HeLa (tool)

RRID:CVCL_0030

HeLa is a human (Homo sapiens) cancer cell line.

K-562 (tool)

RRID:CVCL_0004

K-562 is a human (Homo sapiens) cancer cell line.