Hi all!
I work in a research group in Edinburgh and am looking into the SALL1 gene and the molecular disease mechanism of Townes-Brocks syndrome. For this, I have been compiling variants from ClinVar (disease-causing) and gnomAD (neutral) to weigh haploinsufficiency against a dominant-negative effect as the disease mechanism. I noticed that gnomAD lists many early frameshift and heterozygous loss of SALL1 variants, which would fit with a dominant-negative hypothesis. However, 7 of those variants (found in 3,488 individuals) carry an LCR tag because they lie in a repetitive, low-complexity region. Should I exclude them from the neutral-variant analysis, since they are most likely artifacts of sequencing bias, or should I still include them because of their high prevalence?
Basically, how much can I trust a variant with an LCR tag when it has been found in more than 1,000 individuals? I’m attaching a screenshot of the variants below:
Many thanks for your help and best wishes
Sara