RNA sequencing, or RNA-seq, is a method for mapping and quantifying the
complete set of RNA transcripts present in a cell at a given time, known as
the transcriptome, for any organism that has a genomic DNA sequence
assembly. Unlike microarrays, which detect and quantify transcripts by
hybridization against known sequences, RNA-seq sequences transcripts
directly and is especially well suited for de novo discovery of RNA
splicing patterns and for determining unequivocally the presence or
absence of low-abundance classes of RNA.
RNA-seq is performed by reverse-transcribing an RNA sample into
cDNA, followed by high-throughput DNA sequencing. Most data are produced
as either single reads or paired-end reads.
With single reads, each sequence read comes from one end
of a randomly primed cDNA molecule (and represents one end of one cDNA
segment), while paired-end reads are obtained as pairs
from both ends of a randomly primed cDNA (and represent the two opposite
ends of one cDNA segment). The resulting sequence reads are then
informatically mapped onto the genome sequence (Alignments).
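
The difference between single reads and paired-end reads can be
illustrated with a minimal sketch. The function names, the example
fragment, and the read length are hypothetical and chosen only for
illustration. The sketch also assumes that the second mate is read
from the opposite strand, as is common on many sequencing platforms;
the text above states only that the two reads come from opposite ends.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def single_read(fragment, read_len):
    """Single read: the first read_len bases from one end of the cDNA."""
    return fragment[:read_len]


def paired_end_reads(fragment, read_len):
    """Paired-end reads: one read from each end of the same cDNA.

    Assumption: the second mate is sequenced from the opposite strand,
    so it is taken from the reverse complement of the fragment.
    """
    mate1 = fragment[:read_len]
    mate2 = reverse_complement(fragment)[:read_len]
    return mate1, mate2


# Hypothetical 16-base cDNA fragment, sampled with 5-base reads.
fragment = "ACGTTGCAAGGCTTAC"
read = single_read(fragment, 5)
mate1, mate2 = paired_end_reads(fragment, 5)
```

Because both mates come from the same cDNA segment, a mapper can use
the expected distance between them as additional evidence when
placing the pair on the genome.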
The current mappers (TopHat and STAR) can map reads to both
annotated and unannotated genomic regions.
Reads mapped to annotated or novel RNA splice junctions are
reported separately (Splice Sites). Earlier versions of this
software did not map reads to unannotated genomic regions.
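
As a rough illustration of how splice junctions can be recovered from
mapped reads, the sketch below assumes the alignments are available in
SAM format, where an N operation in the CIGAR string marks a region of
the reference skipped by the read (typically an intron). The function
names and the example coordinates are hypothetical, and this is not a
description of how TopHat or STAR work internally.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def splice_junctions(start, cigar):
    """Return (intron_start, intron_end) for each N operation.

    Coordinates are 0-based, half-open, on the reference. `start` is
    the 0-based reference position of the first aligned base.
    """
    junctions = []
    pos = start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((pos, pos + length))
        # Only these operations advance the position on the reference.
        if op in "MDN=X":
            pos += length
    return junctions


def classify(junctions, annotated):
    """Label each junction as 'annotated' or 'novel'."""
    return [(j, "annotated" if j in annotated else "novel")
            for j in junctions]


# Hypothetical read spanning two introns; only the first is annotated.
found = splice_junctions(100, "10M200N15M2I5M300N8M")
labels = classify(found, annotated={(110, 310)})
```

A read whose CIGAR string has no N operation yields an empty list and
would contribute only to the ordinary alignments.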
Some RNA-seq protocols do not specify the coding strand. As a result,
there can be ambiguity at loci where both strands are transcribed.
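
The strand ambiguity can be made concrete with a small sketch. The
gene names, coordinates, and the compatible_genes function are all
hypothetical; the example assumes two overlapping genes transcribed
from opposite strands and a read whose strand is either known or
unknown (None).

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def compatible_genes(read_start, read_end, read_strand, genes):
    """Return names of genes a read could have come from.

    read_strand is "+" or "-" for a strand-specific protocol, or None
    when the protocol does not record the strand.
    """
    hits = []
    for gene in genes:
        overlaps = read_start < gene.end and gene.start < read_end
        strand_ok = read_strand is None or read_strand == gene.strand
        if overlaps and strand_ok:
            hits.append(gene.name)
    return hits


# Hypothetical locus where both strands are transcribed.
genes = [Gene("geneA", 1000, 2000, "+"), Gene("geneB", 1500, 2500, "-")]

unstranded = compatible_genes(1600, 1650, None, genes)
stranded = compatible_genes(1600, 1650, "+", genes)
```

Without strand information the read is compatible with both genes,
whereas a strand-specific read is assigned to only one of them.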