There seems to be an error in the VCF files (at least in the genome versions; I did not check the exome ones), in the vep field of the INFO column.
Files investigated:
64.64 GiB 2023-11-01T00:03:45Z gs://gcp-public-data--gnomad/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chr1.vcf.bgz
890.03 MiB 2023-11-01T00:03:47Z gs://gcp-public-data--gnomad/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chrY.vcf.bgz
For “synonymous” variants, the value of the vep field is typically:
You can see that this string contains a number of = signs, for instance ENSP00000331704.5:p.Leu329=
However, a field value in the INFO column of a VCF file cannot contain an = sign, since this character is reserved for assigning the value to its key (here vep), and field values in a VCF are not delimited by quotes.
This formatting error crashes parsing programs such as SnpSift, bcftools, etc.
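To illustrate why a stray = trips up parsers, here is a minimal Python sketch. It is not the code of SnpSift or bcftools; the function names and the shortened INFO string are hypothetical, and only the ENSP00000331704.5:p.Leu329= fragment comes from the files above. A strict parser that splits each entry on every = rejects the value, a lenient one that splits only on the first = accepts it, and, as far as I understand the VCF 4.3 specification, the = could be percent-encoded as %3D on the producer side.

```python
# Hypothetical, shortened INFO string for illustration only; the real vep
# value is much longer. Only the HGVSp fragment is taken from the report.
SAMPLE_INFO = "AC=1;vep=C|synonymous_variant|ENSP00000331704.5:p.Leu329="


def parse_info_strict(info: str) -> dict:
    """Split every entry on all '=' signs and reject ambiguous entries."""
    fields = {}
    for entry in info.split(";"):
        parts = entry.split("=")
        if len(parts) == 1:
            fields[parts[0]] = True  # flag without a value
        elif len(parts) == 2:
            fields[parts[0]] = parts[1]
        else:
            raise ValueError(f"unexpected '=' in value of INFO entry: {entry!r}")
    return fields


def parse_info_lenient(info: str) -> dict:
    """Split each entry on the first '=' only, keeping later ones in the value."""
    fields = {}
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def escape_equals(value: str) -> str:
    """Percent-encode '=' (assumed producer-side fix, per VCF 4.3 encoding rules)."""
    return value.replace("=", "%3D")


if __name__ == "__main__":
    print(parse_info_lenient(SAMPLE_INFO)["vep"])
    print(escape_equals("ENSP00000331704.5:p.Leu329="))
    try:
        parse_info_strict(SAMPLE_INFO)
    except ValueError as err:
        print("strict parser failed:", err)
```

Splitting on the first = only is a workaround on the reader side; it does not make the file valid.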
Thank you for your attention; I hope this helps.