Hello,
I have a dataset of predicted loss-of-function (pLoF) carriers. I am interested in identifying the subset who carry a pLoF variant in a gene where being heterozygous might be disease-causing. This is just for hypothesis generation. I am looking at the most recent release of constraint data (gnomAD v4.0), in particular the column “lof.oe_ci.upper: LOEUF: upper bound of 90% confidence interval for o/e ratio for high confidence pLoF variants (lower values indicate more constrained)”.
Based on the methods/README below, is my understanding correct that transcripts with lof.oe_ci.upper < 0.6 are below the threshold where LoF is thought to be an issue? I am also wondering what a hard LOEUF threshold would be for where just being heterozygous might cause disease. < 0.2?
https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/constraint/README.txt
Please let me know, thanks!
Hello, thank you for posting and welcome to the forum!
Some notes, let me know if this answers all of your questions:
- The most recent gnomAD constraint release is v4.1, not v4.0. The README you linked is already the v4.1 one.
- Genes with lof.oe_ci.upper < 0.6 show ‘strong selection against predicted loss-of-function (pLoF) variation in a given gene’, so yes, a gene below that threshold is considered constrained.
- On hets versus homs specifically: LOEUF only looks at the total number of times a variant is observed (where a het has AC=1 and a hom AC=2) and does not otherwise account for zygosity. Of only minor interest to you may be the ‘Number of Homozygotes’ count listed for each variant.
- Of significantly more interest to you will be the metric ‘pLI’, which explicitly refers to a gene’s likelihood of not tolerating heterozygous variants, as opposed to being intolerant only of homozygous variants or not intolerant at all. If you’re specifically interested in heterozygous tolerance, consider checking for lof.pLI >= 0.90, either on its own or cross-checked with lof.oe_ci.upper < 0.6.
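If it helps, the cross-check in the last bullet could be sketched roughly like this in Python. This is only an illustration under assumptions: the `lof.oe_ci.upper` and `lof.pLI` column names come from the README, but the `gene` and `transcript` column names, the tab-separated layout, and the sample rows are made up for the example, so please check them against the file you actually downloaded.

```python
import csv
import io

# Thresholds discussed in this thread; adjust to taste.
LOEUF_MAX = 0.6   # keep rows with lof.oe_ci.upper < 0.6
PLI_MIN = 0.9     # keep rows with lof.pLI >= 0.90


def to_float(value):
    """Return value as a float, or None for missing or non-numeric entries."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def constrained_transcripts(tsv_handle, loeuf_max=LOEUF_MAX, pli_min=PLI_MIN):
    """Collect rows passing both the LOEUF and the pLI cut-offs.

    Assumes a tab-separated table with a header row. The 'gene' and
    'transcript' column names are assumptions made for this sketch.
    """
    hits = []
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        loeuf = to_float(row.get("lof.oe_ci.upper"))
        pli = to_float(row.get("lof.pLI"))
        if loeuf is None or pli is None:
            continue  # skip rows where either metric is missing
        if loeuf < loeuf_max and pli >= pli_min:
            hits.append((row["gene"], row["transcript"], loeuf, pli))
    return hits


# Made-up rows, purely to show the filter running end to end.
SAMPLE = (
    "gene\ttranscript\tlof.oe_ci.upper\tlof.pLI\n"
    "GENE_A\tTX_A\t0.15\t0.99\n"
    "GENE_B\tTX_B\t0.55\t0.40\n"
    "GENE_C\tTX_C\t1.20\t0.00\n"
    "GENE_D\tTX_D\tNA\tNA\n"
)

hits = constrained_transcripts(io.StringIO(SAMPLE))
for gene, transcript, loeuf, pli in hits:
    print(gene, transcript, loeuf, pli)
```

To run it on the real table, you would pass an open file handle for the constraint metrics file instead of the `io.StringIO` sample. The resulting gene list can then be intersected with the genes in your pLoF carrier dataset.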
That ought to be it! Let me know if you have any further comments, questions, or concerns.