The aim of the GENCODE Genes project (Harrow et al., 2006) is to produce a set of
highly accurate annotations of evidence-based gene features on the human reference genome.
This includes the identification of all protein-coding loci with associated
alternative splice variants, non-coding loci with transcript evidence in the public
databases (NCBI/EMBL/DDBJ), and pseudogenes. A high-quality set of gene
structures is necessary for many research studies such as comparative or
evolutionary analyses, or for experimental design and interpretation of the
results.
The GENCODE Genes tracks display the high-quality manual annotations merged
with evidence-based automated annotations across the entire
human genome. The GENCODE gene set presents a full merge
between HAVANA manual annotation and Ensembl automatic annotation.
Priority is given to the manually curated HAVANA annotation; predicted
Ensembl annotations are used when there are no corresponding manual annotations. With
each release, there is an increase in the number of annotations that have undergone
manual curation.
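The priority rule described above can be illustrated with a minimal sketch. It is not the actual GENCODE merge pipeline, which is considerably more involved. It assumes each annotation set is a plain dictionary keyed by a hypothetical locus identifier, and the function name merge_annotations is invented for illustration.

```python
def merge_annotations(havana, ensembl):
    """Merge two annotation sets, preferring manual (HAVANA) entries.

    Both arguments are assumed to be dicts mapping a hypothetical locus
    identifier to an annotation object. The result maps each locus to a
    (source, annotation) tuple.
    """
    merged = {}
    # Start with the automated Ensembl predictions for every locus.
    for locus, annotation in ensembl.items():
        merged[locus] = ("ENSEMBL", annotation)
    # Manual HAVANA annotations take priority and overwrite predictions.
    for locus, annotation in havana.items():
        merged[locus] = ("HAVANA", annotation)
    return merged
```

Under these assumptions, a locus present in both sets is reported with its HAVANA annotation, and a locus present only in the Ensembl set keeps its automated annotation.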
This annotation was carried out on the GRCh37 (hg19) genome assembly.
Experimental verification details are given in the description for each
track. Transcript Support Levels were determined for version 10 onwards based
on evidence provided by GenBank mRNA and EST sequences. Versions 7 and 10 are
being used in data analysis by the ENCODE consortium.
NOTE: Because the UCSC Genome Browser uses the NC_001807 mitochondrial
genome sequence (chrM) and GENCODE annotates the NC_012920 mitochondrial
sequence, the GENCODE mitochondrial annotations are not available in the
UCSC Genome Browser. These annotations are available for download in the
GENCODE GTF files.
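As a rough sketch of how the mitochondrial records could be pulled out of a downloaded GTF file, the following Python code reads the standard nine tab-separated GTF columns and keeps records on one sequence. The function names are invented for illustration, and the sequence name "chrM" is an assumption that should be checked against the file actually downloaded.

```python
def parse_gtf_line(line):
    """Split one GTF data line into its nine standard columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF columns")
    seqname, source, feature, start, end, score, strand, frame, attr_text = fields
    attributes = {}
    # Attributes look like: key "value"; key "value";
    # Simplification: a repeated key keeps only its last value.
    for item in attr_text.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def records_on_sequence(lines, seqname="chrM"):
    """Yield parsed records whose sequence name matches seqname.

    The default "chrM" is an assumption about how the file names the
    mitochondrial sequence.
    """
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        record = parse_gtf_line(line)
        if record["seqname"] == seqname:
            yield record
```

Any iterable of text lines can be passed in, for example a file handle opened in text mode (or via gzip.open in "rt" mode for a compressed download).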
For more information on the different gene tracks, see our Genes FAQ.