Low Variance BigWig Track Settings
 
Exon Usage with low variance across cell lines

  • 3' Exon Splice Site Usage: usage of 3' splice sites with variance <0.01 across cell lines
  • 5' Exon Splice Site Usage: usage of 5' splice sites with variance <0.01 across cell lines

Description

Exon usage data aggregated across 19 ENCODE cell lines. Green marks represent 5' splice sites and blue marks represent 3' splice sites. The data are presented in bigWig format, where the height of each mark represents the usage value.

Methods

The usage of a 5' or 3' splice site estimates the proportion of transcripts from a gene that undergo a splicing event using that particular site. To calculate it, we analyze exon-exon junction reads obtained by mapping RNA-seq data. For a given splice site, three categories of junction reads enter the calculation:

  (A) reads that have the splice site as one side of the junction;
  (B) reads that span the splice site (i.e. the junction joins a site upstream of the site in question to a site downstream of it);
  (C) for a 5' (3') splice site, reads with a junction whose 5' (3') end lies in the downstream (upstream) intron.

From the counts of these three classes of reads, splice site usage is defined as A/(A+B+C). To limit distortion of this metric by the false-positive splice junctions frequently output by RNA-seq aligners, we considered only junction reads whose splice sites are present in the GENCODE v29 annotation. If there were multiple RNA-seq replicates for a particular cell line or condition, we combined the junction read counts from all replicates before calculating splice site usage.
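
The A/(A+B+C) calculation can be illustrated with a minimal Python sketch. This is not the laboratory's script. It assumes plus-strand coordinates, junctions stored as hypothetical (donor, acceptor) position pairs that have already been filtered against the annotation, replicates combined by summing counts, and the far boundary of the relevant intron (intron_end or intron_start) supplied by the caller from the annotation. All function and parameter names are illustrative.

```python
from collections import Counter


def collapse_replicates(replicates):
    """Sum junction read counts over replicates.

    Each replicate is a dict {(donor, acceptor): read_count}.
    Summation is an assumption about how replicates are combined.
    """
    total = Counter()
    for rep in replicates:
        total.update(rep)  # Counter.update adds counts key by key
    return total


def five_prime_usage(site, intron_end, junctions):
    """Usage of a 5' splice site (donor) on the plus strand."""
    a = b = c = 0
    for (donor, acceptor), n in junctions.items():
        if donor == site:                 # (A) junction uses the site
            a += n
        elif donor < site < acceptor:     # (B) junction spans the site
            b += n
        elif site < donor < intron_end:   # (C) 5' end in the downstream intron
            c += n
    total = a + b + c
    return a / total if total else None   # None: no informative reads


def three_prime_usage(site, intron_start, junctions):
    """Usage of a 3' splice site (acceptor) on the plus strand."""
    a = b = c = 0
    for (donor, acceptor), n in junctions.items():
        if acceptor == site:                   # (A) junction uses the site
            a += n
        elif donor < site < acceptor:          # (B) junction spans the site
            b += n
        elif intron_start < acceptor < site:   # (C) 3' end in the upstream intron
            c += n
    total = a + b + c
    return a / total if total else None


# Two made-up replicates for one cell line.
rep1 = {(100, 200): 6, (50, 300): 2}
rep2 = {(100, 200): 2, (150, 200): 2}
junctions = collapse_replicates([rep1, rep2])
usage_5p = five_prime_usage(100, 200, junctions)   # A=8, B=2, C=2
usage_3p = three_prime_usage(200, 100, junctions)  # A=10, B=2, C=0
```

A minus-strand site would need the comparisons mirrored, which this sketch leaves out.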

In other words, splice site usage measures how often a particular splice site is "used" during splicing relative to other splice sites in the same gene, with values ranging from 0 (never used) to 1 (every transcript seen in the RNA-seq data uses this splice site). Scripts used to generate usage data can be found here.

Because splice site usage can vary greatly across cell types, the values in this track are averages across the cell lines listed below, limited to splice sites with low (less than 0.001) inter-cell-type variance.
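
The averaging and filtering step might look like the following sketch. The page does not say whether population or sample variance was used, nor how cell lines without informative reads were handled. Population variance and skipping missing values are assumptions here, and the threshold is left as a parameter. The function name and the site keys are hypothetical.

```python
from statistics import mean, pvariance


def low_variance_mean_usage(usage_by_site, max_variance=0.001):
    """Average per-cell-line usage values, keeping only low-variance sites.

    usage_by_site maps a site identifier to a list of usage values,
    one per cell line (None where no usage could be computed).
    """
    result = {}
    for site, values in usage_by_site.items():
        observed = [v for v in values if v is not None]
        if len(observed) < 2:
            continue  # variance is not meaningful for fewer than two values
        # Population variance is an assumption; the source does not specify.
        if pvariance(observed) < max_variance:
            result[site] = mean(observed)
    return result


# Made-up usage values for two sites across three cell lines.
example = {
    "chr1:100": [0.90, 0.92, 0.91],  # stable across cell lines
    "chr1:500": [0.10, 0.90, 0.50],  # varies strongly, filtered out
}
track_values = low_variance_mean_usage(example)
```

Only the stable site survives the filter, and its track value is the mean of its three per-cell-line usages.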

All ENCODE cell lines used in the analysis (individual cell line tracks are also provided):

  • A549
  • AG04450
  • BJ
  • CD14
  • CD20
  • GM12878
  • H1-hESC
  • HeLa-S3
  • HepG2
  • HMEC
  • HSMM
  • HUVEC
  • IMR90
  • K562
  • MCF-7
  • NHEK
  • NHLF
  • SK-N-SH
  • SK-N-SH-RA

Credits

This track was created at the Fairbrother Laboratory at Brown University by Luke Buerer, Camillo Saueressig, and David Glidden.

References

ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature, 489(7414), 57-74.

Contact

william_fairbrother@brown.edu