These tracks represent the results of targeted long-read RNA sequencing
aimed at identifying lowly expressed long non-coding RNAs (lncRNAs) in adult
and embryonic tissues. The tracks consist of capture target regions, mappings
of pre- and post-capture reads, and transcript models built from the data.
Portions of this dataset were used to develop the lncRNA annotations
introduced in GENCODE v47. The data are a superset of the data incorporated
into GENCODE. The transcript models for a given RNA do not necessarily match
those in GENCODE and are provided as a guide to exploring the sequencing data.
Detailed descriptions of the data are available at the
GENCODE CLS Project site.
Display Conventions and Configuration
This is a multi-view composite track containing multiple data types (views). Each view includes subtracks that are displayed individually in the browser. Instructions for configuring multi-view tracks are available in the browser documentation.
Views:
Targets: Capture target regions
Models: Transcript models built from reads and then merged
Sample models: Transcript models by sample in which they were observed
Per-experiment reads: Read mappings per experiment
Per-experiment models: Transcript models generated from each experiment
Methods
This project, led by the GENCODE consortium, employed the Capture Long-read Sequencing (CLS) protocol to enrich transcripts from targeted genomic regions. It used a large capture array with orthologous probes in human and mouse genomes, targeting non-GENCODE lncRNA annotations and regions suspected of unannotated transcription. CapTrap-Seq, a cDNA library preparation protocol, was used to enrich for full-length RNA molecules (5′ to 3′).
Matched adult and embryonic tissues from human and mouse were selected to maximize transcriptome complexity. Libraries were sequenced pre- and post-capture using PacBio and Oxford Nanopore Technologies (ONT) long-read platforms, as well as short-read technologies.
Transcript isoform models were built from reads using the LyRic analysis software. These were merged using intron chains, with transcription start and end sites anchored using CAGE and poly(A) data.
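To illustrate what merging on intron chains can look like, the following is a minimal Python sketch. It is not the LyRic implementation. It assumes that "merged using intron chains" means that spliced models on the same chromosome and strand with identical introns are collapsed into one model, and that the merged model takes the outermost start and end among its members. The function names, the dictionary layout, and the half-open coordinate convention are hypothetical, and the anchoring of ends with CAGE and poly(A) data is left out.

```python
from collections import defaultdict


def intron_chain(exons):
    """Return the introns of a transcript as a tuple of (start, end) pairs.

    exons: list of (start, end) half-open intervals, in any order.
    """
    exons = sorted(exons)
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def merge_by_intron_chain(transcripts):
    """Collapse spliced transcripts sharing chrom, strand and intron chain.

    transcripts: iterable of dicts with keys 'chrom', 'strand', 'exons'
    (a hypothetical layout chosen for this sketch).
    """
    groups = defaultdict(list)
    for tx in transcripts:
        chain = intron_chain(tx["exons"])
        if not chain:
            # Single-exon transcripts have no intron chain; this sketch skips them.
            continue
        groups[(tx["chrom"], tx["strand"], chain)].append(tx)

    merged = []
    for (chrom, strand, chain), members in groups.items():
        # Assumption: keep the outermost start and end seen among the members.
        start = min(min(s for s, _ in tx["exons"]) for tx in members)
        end = max(max(e for _, e in tx["exons"]) for tx in members)
        # Rebuild exons from the shared intron boundaries and the chosen ends.
        bounds = [start] + [pos for intron in chain for pos in intron] + [end]
        exons = [(bounds[i], bounds[i + 1]) for i in range(0, len(bounds), 2)]
        merged.append({"chrom": chrom, "strand": strand,
                       "exons": exons, "support": len(members)})
    return merged


reads = [
    {"chrom": "chr1", "strand": "+", "exons": [(100, 200), (300, 400)]},
    {"chrom": "chr1", "strand": "+", "exons": [(120, 200), (300, 450)]},
    {"chrom": "chr1", "strand": "+", "exons": [(100, 200), (350, 400)]},
    {"chrom": "chr1", "strand": "+", "exons": [(100, 400)]},
]
models = merge_by_intron_chain(reads)
```

In this toy input the first two reads share the intron (200, 300) and collapse into one model, the third has a different intron and stays separate, and the single-exon read is dropped. A real pipeline would also need a policy for single-exon reads and for choosing supported ends rather than simply the outermost ones.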