How do you assign variants to genes?

Variants sourced from the Open Targets Genetics Portal are linked to protein-coding genes by distance: a gene is considered if it lies within a 1 Mb window centred on the lead variant (500 kb on each side). The likelihood that a gene is causal for the genetic association represented by the lead variant is captured by the gene prioritisation score. This score is the output of the locus-to-gene pipeline, which integrates fine-mapping, colocalisation and functional genomics data.
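
As an illustration of the distance-based linking described above, here is a minimal Python sketch that lists the protein-coding genes inside the 1 Mb window around a lead variant. It is not the Open Targets implementation. The `Gene` class, the `candidate_genes` function and the choice to measure distance to each gene's 5' end are assumptions made for this example, and the sketch does not compute the gene prioritisation score.

```python
from dataclasses import dataclass

HALF_WINDOW = 500_000  # 500 kb on each side of the lead variant (1 Mb in total)


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record used only for this example."""
    gene_id: str
    chromosome: str
    start: int    # lowest genomic coordinate of the gene
    end: int      # highest genomic coordinate of the gene
    strand: int   # +1 for the forward strand, -1 for the reverse strand
    biotype: str

    @property
    def five_prime_end(self) -> int:
        # The 5' end is the start on the forward strand and the end on the reverse strand.
        return self.start if self.strand == 1 else self.end


def candidate_genes(chromosome, position, genes, half_window=HALF_WINDOW):
    """Return (gene_id, distance) pairs for protein-coding genes in the window, nearest first."""
    hits = []
    for gene in genes:
        if gene.chromosome != chromosome or gene.biotype != "protein_coding":
            continue
        # Assumption: distance is measured from the variant to the gene's 5' end.
        distance = abs(gene.five_prime_end - position)
        if distance <= half_window:
            hits.append((gene.gene_id, distance))
    return sorted(hits, key=lambda hit: hit[1])
```

Calling `candidate_genes("1", 1_300_000, genes)` with a list of `Gene` records returns every protein-coding gene on chromosome 1 whose 5' end is at most 500 kb from position 1,300,000, ordered by distance.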

Variants from sources other than Open Targets Genetics (e.g. European Variation Archive, UniProt, PheWAS Catalog) are linked to genes and transcripts using the Ensembl Variant Effect Predictor (VEP). If a variant maps to a gene with several transcripts, we assign it the most severe consequence term (e.g. start lost, stop gained) from the Sequence Ontology (SO). If the variant maps to an intergenic region, we assign it to the 5' end (i.e. the nearest likely promoter) of the closest protein-coding gene on either strand (forward or reverse) within a 1 Mb window (500 kb on each side of the variant).
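
The two rules in this paragraph can be sketched in Python as follows. This is a hedged illustration, not the actual pipeline: `SEVERITY_ORDER` is a partial, illustrative subset of consequence terms listed from most to least severe (consult the Ensembl documentation for the authoritative ranking), and `CodingGene`, `most_severe_consequence` and `assign_intergenic_variant` are hypothetical names introduced for this example. The gene list passed in is assumed to contain protein-coding genes only.

```python
from typing import NamedTuple, Optional, Sequence

# Illustrative subset of SO consequence terms, most severe first.
# Assumption: this is NOT the complete or authoritative ranking.
SEVERITY_ORDER = [
    "splice_acceptor_variant",
    "splice_donor_variant",
    "stop_gained",
    "frameshift_variant",
    "stop_lost",
    "start_lost",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
    "upstream_gene_variant",
    "downstream_gene_variant",
    "intergenic_variant",
]
_RANK = {term: rank for rank, term in enumerate(SEVERITY_ORDER)}


def most_severe_consequence(terms: Sequence[str]) -> str:
    """Pick the most severe term across transcripts; unknown terms rank last."""
    return min(terms, key=lambda term: _RANK.get(term, len(_RANK)))


class CodingGene(NamedTuple):
    """Hypothetical record for a protein-coding gene."""
    gene_id: str
    chromosome: str
    start: int   # lowest genomic coordinate
    end: int     # highest genomic coordinate
    strand: int  # +1 forward, -1 reverse


def five_prime_end(gene: CodingGene) -> int:
    # The 5' end is the start on the forward strand and the end on the reverse strand.
    return gene.start if gene.strand == 1 else gene.end


def assign_intergenic_variant(
    chromosome: str,
    position: int,
    protein_coding_genes: Sequence[CodingGene],
    half_window: int = 500_000,  # 500 kb on each side of the variant
) -> Optional[str]:
    """Return the gene whose 5' end is closest to the variant, or None if none is in the window."""
    in_window = [
        (abs(five_prime_end(gene) - position), gene.gene_id)
        for gene in protein_coding_genes
        if gene.chromosome == chromosome
        and abs(five_prime_end(gene) - position) <= half_window
    ]
    if not in_window:
        return None
    # Strand is ignored when comparing distances, so either strand can win.
    return min(in_window, key=lambda pair: pair[0])[1]
```

For a variant annotated with `missense_variant` on one transcript and `stop_gained` on another, `most_severe_consequence` selects `stop_gained`. For an intergenic variant, `assign_intergenic_variant` compares distances to 5' ends rather than to gene bodies, so a reverse-strand gene is measured from its highest coordinate.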