SV genotype quality threshold

Dear gnomAD people,

Thank you very much for offering this forum and for your responsiveness to the questions asked, which is of great help to everyone.

Maybe I didn’t search well enough, but I couldn’t find a detailed explanation of how you assess the quality of SV genotyping.
Could you indicate which criteria you take into account, or point us to where this is explained?

Do you have a threshold below which you would no longer consider a particular SV because its calling quality is too poor?

Can you tell us the average genotype quality across all SVs present in gnomAD v4?

A current example we are dealing with involves multiple deletions in the proximal part of the SHANK3 gene with low genotype quality:
DEL_CHR22_648A75AC
DEL_CHR22_65A72368
and part of DEL_CHR22_62DEF806

Thank you very much for your help!

Best regards

Louis

Hey, Louis,

Thank you for your interest! We have annotated the quality of SV sites in the “FILTER” column of the VCF, and you can restrict your analyses to “PASS” SVs.
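
As a minimal sketch of what restricting to “PASS” sites could look like, assuming a standard tab-delimited VCF in which FILTER is the seventh column: the function names and the example records below are hypothetical and only illustrate the idea, they are not part of the gnomAD pipeline.

```python
import gzip


def iter_pass_records(lines):
    """Yield (chrom, pos, variant_id) for records whose FILTER column is PASS.

    `lines` is any iterable of VCF text lines (header lines are skipped).
    """
    for line in lines:
        if line.startswith("#"):
            continue  # meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # VCF fixed columns: CHROM POS ID REF ALT QUAL FILTER INFO
        if fields[6] == "PASS":
            yield fields[0], int(fields[1]), fields[2]


def pass_records_from_file(path):
    """Same as iter_pass_records, reading from a plain or gzipped VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        yield from iter_pass_records(handle)


# Made-up records for illustration only (not real gnomAD sites or filters).
example_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr22\t1000\tSV_EXAMPLE_1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL\n",
    "chr22\t2000\tSV_EXAMPLE_2\tN\t<DEL>\t.\tSOME_FILTER\tSVTYPE=DEL\n",
]
passing = list(iter_pass_records(example_lines))
```

Here `passing` contains only the first made-up record, since the second one carries a non-PASS value in FILTER. For a real file, `pass_records_from_file` would be called with the path to the downloaded SV VCF.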

In our pipeline, we implemented multiple quality evaluation methods based on alignment signatures of short-read sequences (see Collins et al. 2020 for most of the methods), Hardy-Weinberg equilibrium, and batch effects, and we manually evaluated a subset of the SVs. We estimate > 96% precision for SVs outside of segmental duplications and simple repeats, as evaluated against matched long-read PacBio sequences (reported in the blog post “Structural Variants in gnomAD v4” on the gnomAD browser site).

Hope that helps.

Best,