4.1 CNV site count on gnomAD website differs from VCF/BED files

Hello,

We noticed a difference in the reported site count for the GD_16p11.2-BP4-BP5__DEL variant between the gnomAD website and the VCF file.

  • The gnomAD website reports a site count of 114 for this variant: gnomAD

  • The VCF reports a site count of 58:

$ bcftools query -f '%CHROM:%POS-%END\t%ID\t%INFO/SC\n' \
  https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/exome_cnv/gnomad.v4.1.cnv.non_neuro_controls.vcf.gz 2>/dev/null | \
  grep "GD_16p11.2-BP4-BP5__DEL" 
chr16:29663598-30186216    GD_16p11.2-BP4-BP5__DEL    58

We also see a count of 58 in the BED file (https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/exome_cnv/gnomad.v4.1.cnv.non_neuro_controls.bed), which matches what is shown on the UCSC Genome Browser: https://genome.ucsc.edu/cgi-bin/hgc?db=hg38&c=chr16&l=29663597&r=30186216&o=29663597&t=30186216&g=gnomadCopyNumberVariants&i=GD_16p11.2-BP4-BP5__DEL

Could you clarify the reason for this discrepancy?

The gnomAD browser displays the site count computed across all 464,297 exomes assessed for rare coding CNVs, which is why it shows 114. The files you linked, as their non_neuro_controls names indicate, contain site counts computed on a subset of those samples, which is why they show 58.

The file that contains the site count across all of the 464,297 exomes is https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/exome_cnv/gnomad.v4.1.cnv.all.vcf.gz:

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
chr16	29663598	GD_16p11.2-BP4-BP5__DEL	N	<DEL>	.	PASS	END=30186216;SVLEN=522618;SVTYPE=DEL;POSMIN=29663152;POSMAX=29678911;ENDMIN=29963687;ENDMAX=30188229;Genes=AC009133.6,AC093512.2,AC120114.4,ALDOA,ASPHD1,C16orf54,C16orf92,CDIPT,CORO1A,DOC2A,GDPD3,HIRIP3,INO80E,KCTD13,KIF22,MAPK3,MAZ,MVP,PAGR1,PPP4C,PRRT2,QPRT,SEZ6L2,SPN,TAOK2,TBX6,TLCD3B,TMEM219,YPEL3,ZG16;N_EXN_VAR=175;N_INT_VAR=160;SC=114;SC_afr=6;SC_amr=6;SC_asj=3;SC_eas=2;SC_fin=8;SC_mid=0;SC_nfe=77;SC_remaining=5;SC_sas=7;SN=464284;SN_afr=11335;SN_amr=20798;SN_asj=10397;SN_eas=15938;SN_fin=22688;SN_mid=1414;SN_nfe=326256;SN_remaining=20978;SN_sas=34480;SF=0.000245539368145359;SF_afr=0.000529333921482135;SF_amr=0.000288489277815175;SF_asj=0.000288544772530538;SF_eas=0.000125486259254612;SF_fin=0.000352609308885755;SF_mid=0;SF_nfe=0.000236010985238586;SF_remaining=0.000238344932786729;SF_sas=0.000203016241299304;SC_XY=59;SC_afr_XY=1;SC_amr_XY=3;SC_asj_XY=1;SC_eas_XY=1;SC_fin_XY=1;SC_mid_XY=0;SC_nfe_XY=43;SC_remaining_XY=3;SC_sas_XY=6;SN_XY=236825;SN_afr_XY=4678;SN_amr_XY=8824;SN_asj_XY=5441;SN_eas_XY=7994;SN_fin_XY=10861;SN_mid_XY=825;SN_nfe_XY=161710;SN_remaining_XY=9972;SN_sas_XY=26520;SF_XY=0.000249129103768606;SF_afr_XY=0.000213766566908935;SF_amr_XY=0.000339981867633726;SF_asj_XY=0.000183789744532255;SF_eas_XY=0.000125093820365274;SF_fin_XY=9.20725531718995e-05;SF_mid_XY=0;SF_nfe_XY=0.000265908107105312;SF_remaining_XY=0.000300842358604091;SF_sas_XY=0.000226244343891403;SC_XX=55;SC_afr_XX=5;SC_amr_XX=3;SC_asj_XX=2;SC_eas_XX=1;SC_fin_XX=7;SC_mid_XX=0;SC_nfe_XX=34;SC_remaining_XX=2;SC_sas_XX=1;SN_XX=227459;SN_afr_XX=6657;SN_amr_XX=11974;SN_asj_XX=4956;SN_eas_XX=7944;SN_fin_XX=11827;SN_mid_XX=589;SN_nfe_XX=164546;SN_remaining_XX=11006;SN_sas_XX=7960;SF_XX=0.00024180181922896;SF_afr_XX=0.000751089079164789;SF_amr_XX=0.000250542842826123;SF_asj_XX=0.000403551251008878;SF_eas_XX=0.000125881168177241;SF_fin_XX=0.000591866069163778;SF_mid_XX=0;SF_nfe_XX=0.000206629149295638;SF_remaining_XX=0.000181719062329638;SF_sas_XX=0.000125628140703518

The corresponding BED file is https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/exome_cnv/gnomad.v4.1.cnv.all.bed.
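
If it helps to check the numbers locally, here is a minimal sketch for pulling individual INFO fields out of a record like the one above. It assumes a POSIX shell with awk; the helper name info_field is hypothetical, and the record in the snippet is an abbreviated copy of the real one with only a few INFO fields kept. The last step recomputes SC divided by SN so it can be compared with the SF field; that SF is defined this way is an inference from the values shown, not something stated in the file header here.

```shell
# Hypothetical helper: print the value of one INFO key for each
# non-header VCF record read from stdin.
info_field() {
  awk -F'\t' -v key="$1" '
    !/^#/ {
      n = split($8, kv, ";")            # column 8 is INFO
      for (i = 1; i <= n; i++) {
        eq = index(kv[i], "=")
        if (eq > 0 && substr(kv[i], 1, eq - 1) == key) {
          print substr(kv[i], eq + 1)
        }
      }
    }'
}

# Abbreviated copy of the record from gnomad.v4.1.cnv.all.vcf.gz
# (most INFO fields dropped for readability).
record="$(printf 'chr16\t29663598\tGD_16p11.2-BP4-BP5__DEL\tN\t<DEL>\t.\tPASS\tEND=30186216;SVTYPE=DEL;SC=114;SN=464284;SF=0.000245539368145359')"

sc="$(printf '%s\n' "$record" | info_field SC)"
sn="$(printf '%s\n' "$record" | info_field SN)"
echo "SC=$sc SN=$sn"

# Recompute the site frequency as SC / SN to compare with the SF field.
awk -v sc="$sc" -v sn="$sn" 'BEGIN { printf "SC/SN=%.6g\n", sc / sn }'
```

Against the full file, the bcftools query from the question, pointed at the gnomad.v4.1.cnv.all.vcf.gz URL instead of the non_neuro_controls one, should report an SC of 114 for this variant, matching the browser.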