Constraint scores HMC Track Settings

HMC - Homologous Missense Constraint Score on PFAM domains

Track collection: Human constraint scores

Description

The "Constraint scores" container track includes several subtracks showing the results of constraint prediction algorithms. These algorithms try to find regions under negative selection, where variants are likely to have a functional impact. They do not use multi-species alignments to derive evolutionary constraint, but rely primarily on human variation, usually from variants collected by gnomAD (see the gnomAD V2 or V3 tracks on hg19 and hg38) or TOPMED (contained in our dbSNP tracks and available as a filter). One of the subtracks is based on UK Biobank variants, which are not publicly available, so we have no track with the raw data. The numbers of human genomes used as input for these scores are 76k, 53k and 110k for gnomAD, TOPMED and UK Biobank, respectively.

Note that another important constraint score, gnomAD constraint, is not part of this container track but can be found in the hg38 gnomAD track.

The algorithms included in this track are:

JARVIS - "Junk" Annotation genome-wide Residual Variation Intolerance Score: JARVIS scores were created by first scanning the entire genome with a sliding-window approach (using a 1-nucleotide step) and recording, within each window, the number of all TOPMED variants and the number of common variants, irrespective of their predicted effect. These counts were used to calculate a single-nucleotide resolution genome-wide residual variation intolerance score (gwRVIS). The gwRVIS score was then combined with primary genomic sequence context and additional genomic annotations in a multi-module deep learning framework to infer the pathogenicity of noncoding regions while remaining naive to existing phylogenetic conservation metrics. The higher the score, the more deleterious the prediction. This score covers the entire genome, except the gaps.
HMC - Homologous Missense Constraint: Homologous Missense Constraint (HMC) is an amino-acid-level measure of genetic intolerance to missense variants within human populations. For all assessable amino-acid positions in Pfam domains, the number of missense substitutions directly observed in gnomAD (Observed) was counted and compared to the value expected under a neutral evolution model (Expected). The HMC score is defined as the upper limit of the 95% confidence interval for the Observed/Expected ratio. Missense variants disrupting amino-acid positions with HMC<0.8 are predicted to be likely deleterious. This score only covers Pfam domains within coding regions.
MetaDome - Tolerance Landscape Score (hg19 only): MetaDome Tolerance Landscape scores are computed as the ratio of missense to synonymous variant counts, calculated in a sliding window (with a size of 21 codons/residues) to provide a per-position indication of regional tolerance to missense variation. The variants were taken from gnomAD, and the score is corrected for codon composition. Scores <0.7 are considered intolerant. This score covers only coding regions.
MTR - Missense Tolerance Ratio (hg19 only): Missense Tolerance Ratio (MTR) scores aim to quantify the amount of purifying selection acting specifically on missense variants in a given window of protein-coding sequence. MTR is estimated across sliding windows of 31 codons (default) and uses observed standing variation data from the WES component of gnomAD version 2.0. Scores were computed using the Ensembl v95 release. Because this score only covers coding regions, and gnomAD 2 contains more exomes than gnomAD 3 contains genomes (125k exomes versus 76k full genomes), gnomAD 2 was the more appropriate data source.
LINSIGHT (hg19 only): LINSIGHT is a statistical model for estimating negative selection on noncoding sequences in the human genome. The LINSIGHT score measures the probability of negative selection on noncoding sites, which can be used to prioritize SNVs associated with genetic diseases or to quantify evolutionary constraint on regulatory sequences, e.g., enhancers or promoters. More specifically, if a noncoding site is under negative selection, it is less likely to have a substitution or SNV in the human lineage. In addition, even if an SNV is seen at the site, it will tend to segregate at low frequency because of selection. See Huang et al., Nat Genet 2017.
UK Biobank depletion rank score (hg38 only): Halldorsson et al. tabulated the number of UK Biobank variants in each 500-bp window of the genome and compared this number to the number expected given the heptamer nucleotide composition of the window, the fraction of heptamers with a sequence variant across the genome, and their mutational classes. A variant depletion score was computed for every window in an overlapping set of 500-bp windows tiled across the genome with a 50-bp step size. They then assigned each 500-bp window a depletion rank (DR) from 0 (most depletion) to 100 (least depletion). Since the windows overlap, we plot the value only in the central 50 bp of each 500-bp window, following advice from the author of the score, Hakon Jonsson, deCODE Genetics. He suggested that the value of the central window, rather than the worst possible score of all overlapping windows, is the most informative for a position. This score covers almost the entire genome; only a few regions, where the genome sequence had too many gap characters, were excluded.
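
The plotting rule described for the UK Biobank depletion rank score (each 500-bp window's value shown only in its central 50 bp) can be illustrated with a small Python sketch. This is not the code used to build the track. The function names are hypothetical, and the sketch assumes 0-based, half-open coordinates and window starts that are multiples of the 50-bp step.

```python
WINDOW = 500  # window size in bp, as given in the description
STEP = 50     # step between consecutive window starts, in bp


def central_segment(window_start, window=WINDOW, step=STEP):
    """Return the [start, end) interval covering the central `step` bp of a window.

    Assumes 0-based, half-open coordinates (an assumption of this sketch).
    """
    offset = (window - step) // 2  # 225 bp of flank on each side
    start = window_start + offset
    return start, start + step


def to_plot_intervals(windows):
    """Map (window_start, depletion_rank) pairs to (start, end, depletion_rank).

    Each overlapping window contributes its rank only to its central segment,
    so the resulting intervals do not overlap.
    """
    return [central_segment(start) + (rank,) for start, rank in windows]


if __name__ == "__main__":
    # Two consecutive overlapping windows with made-up ranks, for illustration only.
    for interval in to_plot_intervals([(0, 12.5), (50, 40.0)]):
        print(interval)
```

Because the step size equals the width of the central segment, the segments of consecutive windows abut exactly, so each position inside the covered region receives the value of exactly one window.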
