Annotations

Each VCF file is annotated using a wide range of annotation sources. Moon uses these variant annotations to filter variants automatically. In addition, all annotations used are displayed in the Moon UI, both to facilitate manual review of the automated pipeline's results and to allow manual filtering of all rare variants in the VCF in the Filter view.

The following annotation sources are included, each listed with the information Moon uses from it:

  • ClinVar: variant-level clinical classification

  • InvitaeKB: variant-level clinical classification based on Sherloc and expert manual curation, available for GRCh37 samples (Nykamp et al., 2017)

  • SnpEff: variant effect annotation (using Ensembl 37.75 and 38.86 as reference for the respective genome builds)

  • dbNSFP: in silico conservation and prediction scores

  • EVE: in silico prediction score based on a computational method trained solely on evolutionary sequences (Frazer et al., 2021)

  • dbscSNV: splice site prediction scores for splice region variants, defined as positions located −3 to +8 at the 5’ splice site and −12 to +2 at the 3’ splice site (Jian et al., 2014)

  • SpliceAI: splice prediction scores annotated for SNVs and indels of 1-2 nt located in exonic regions of Apollo genes or within 200 nt of flanking intronic sequence (Jaganathan et al., 2019)

  • Apollo (by Invitae): disorder-related annotations, including disease name, associated genes, overlap with the input phenotype, inheritance pattern, age of onset, etc.

  • Ensembl: 37.75, 38.86 as reference for the respective genome builds

  • RefSeq: transcript ID annotation. The mapping from Ensembl transcript IDs (ENST) to RefSeq IDs is provided by Ensembl.

  • gnomAD: population allele frequencies for single nucleotide variants

  • gnomAD_MT: population allele frequencies for mitochondrial variants

  • gnomAD_SV: population allele frequencies for annotation of deletions and duplications

  • HPO: Human Phenotype Ontology terms

  • DGV: population allele frequencies for annotation of deletions and duplications (DGV Gold standard; Database of Genomic Variants)

  • dbVar: structural variants with clinical assertions

  • Mitomap: GenBank variant frequencies

  • MitImpact: in silico conservation and prediction scores for mitochondrial variants

  • Mastermind: Cited Variants Reference, linking genetic variants to the medical literature

  • CaseRepoSample: sample observations based on the samples uploaded to your lab account, annotated as ‘Lab frequency’ for each variant

  • ClinGen regions: curated dosage-sensitive regions
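
The splice region windows quoted in the dbscSNV entry above can be made concrete with a short sketch. It is illustrative only and does not describe Moon's or dbscSNV's implementation. The function name, the "donor"/"acceptor" labels and the signed-offset convention (negative and positive offsets counted from the exon-intron junction, with no position 0) are assumptions made for this example.

```python
# Illustrative sketch only, not Moon's or dbscSNV's actual code.
# Windows are taken from the dbscSNV entry: -3..+8 at the 5' splice site
# and -12..+2 at the 3' splice site.
# Assumption: offsets are signed positions relative to the exon-intron
# junction, and position 0 does not exist in this numbering.
SPLICE_REGION_WINDOWS = {
    "donor": (-3, 8),      # 5' splice site
    "acceptor": (-12, 2),  # 3' splice site
}


def in_splice_region(site: str, offset: int) -> bool:
    """Return True if a signed offset falls inside the window for the site."""
    if offset == 0:
        raise ValueError("offset 0 does not exist in this numbering")
    low, high = SPLICE_REGION_WINDOWS[site]
    return low <= offset <= high
```

For instance, in_splice_region("donor", 8) is true while in_splice_region("donor", 9) is false, which matches the upper bound of the 5’ window.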

The version of each annotation source used in an individual analysis can be checked via the ‘Annotation Sources’ link at the bottom of the analysis page.

Unless noted otherwise in the list above, each annotation source is available for both genome builds, GRCh37 and GRCh38. Moon extracts the genome build from the VCF file and uses the corresponding set of annotation sources during analysis.
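
How Moon extracts the genome build is not documented here. As a rough illustration of how a build can be inferred from a VCF, the sketch below scans the `##reference` and `##contig` meta-information lines for common build labels. The function name, the label patterns (hg19, b37, hg38) and the choice of header lines are assumptions for this example, not Moon's actual logic.

```python
import re
from typing import Iterable, Optional

# Hypothetical label patterns; real-world headers may use other spellings.
BUILD_PATTERNS = {
    "GRCh37": re.compile(r"grch37|hg19|b37", re.IGNORECASE),
    "GRCh38": re.compile(r"grch38|hg38", re.IGNORECASE),
}


def guess_genome_build(header_lines: Iterable[str]) -> Optional[str]:
    """Guess the genome build from VCF meta-information lines.

    Only '##reference=' and '##contig=' lines are inspected. Returns
    'GRCh37', 'GRCh38', or None when no known label is found.
    """
    for line in header_lines:
        if not line.startswith("##"):
            break  # meta-information lines end at the '#CHROM' header
        if not line.startswith(("##reference=", "##contig=")):
            continue
        for build, pattern in BUILD_PATTERNS.items():
            if pattern.search(line):
                return build
    return None
```

A header that carries no recognisable label yields None, in which case the build would have to be established some other way.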

Users should be aware that any annotation source used by Moon may contain incorrect or missing information, which can affect the results of Moon's automated analysis pipeline. It is therefore recommended to use manual filtering to assess the relevance of variants that were not selected by the automated filter pipeline.

With regard to splice sites, Moon only annotates splice site acceptor/donor and splice region effects for variants located at the exon-intron boundary. Variants that create novel splice sites within non-coding or exonic regions are not annotated as splice site variants.