Structural variants

Dear members of gnomAD,

We are investigating structural variation in a repetitive region on human chromosome 15. In the SVs v4.1.0 database, we identified two sets of deletion variants (one comprising three calls, the other two) that span overlapping genomic intervals and show very similar allele counts across genetic ancestry groups, with identical homozygous counts. This pattern suggests that these variants may be genetically associated. However, because they delete overlapping or identical sequences, they should not be able to co-occur on the same chromosome, which raises the possibility of an annotation, merging, or representation issue.

Could you please check whether something might be going wrong in how these variants are represented or counted? The variants are:

DEL_CHR15_3430BA77; DEL_CHR15_8CD85032; DEL_CHR15_E92B2C3D

and

DEL_CHR15_1F817031; DEL_CHR15_2D64E68E

Thanks a lot,

Hi,

Thank you for flagging this. After reviewing the raw data, I agree that the overlapping variants likely represent the same underlying event.

Tracing this back through our pipeline, I estimate that such cases account for ~1.5–2% of the entire callset. We do have a checkpoint to resolve these types of redundancies, but it is currently applied only to variants in repetitive regions—where short-read mapping is particularly challenging—and not to all variants.
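For illustration only, the kind of redundancy check described above can be sketched as follows. This is not the actual pipeline code: the `Deletion` record, the reciprocal-overlap criterion, the thresholds (`min_overlap`, `max_ac_diff`), and the example IDs and coordinates are all assumptions made up for this sketch. It flags pairs of deletion calls that overlap heavily and have near-identical allele counts with identical homozygous counts, which is the pattern reported in the question.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Deletion:
    # Hypothetical record layout; field names are assumptions for this sketch.
    variant_id: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # inclusive
    allele_count: int
    hom_count: int


def reciprocal_overlap(a: Deletion, b: Deletion) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(shared / len_a, shared / len_b)


def candidate_duplicates(calls, min_overlap=0.8, max_ac_diff=0.1):
    """Flag pairs that overlap heavily and have matching counts.

    Thresholds are illustrative assumptions, not values from any release.
    """
    flagged = []
    for a, b in combinations(calls, 2):
        if reciprocal_overlap(a, b) < min_overlap:
            continue
        larger = max(a.allele_count, b.allele_count)
        if larger == 0:
            continue
        ac_diff = abs(a.allele_count - b.allele_count) / larger
        if ac_diff <= max_ac_diff and a.hom_count == b.hom_count:
            flagged.append((a.variant_id, b.variant_id))
    return flagged


# Made-up example calls; IDs, coordinates, and counts are not real data.
calls = [
    Deletion("DEL_A", "chr15", 1000, 2000, 120, 4),
    Deletion("DEL_B", "chr15", 1050, 2010, 118, 4),
    Deletion("DEL_C", "chr15", 5000, 6000, 120, 4),
]
print(candidate_duplicates(calls))
```

In this made-up example only the first two calls are flagged, since the third shares the same counts but does not overlap either of them. Flagged pairs would still need manual review, because similar counts alone do not prove that two calls describe the same event.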

I appreciate you bringing this to our attention. We will implement a more general merging strategy in the next release to address this issue.

Best,