Inconsistent SNP MAF values between gnomAD v4.0 VCF and browser

Hi,

The SNP: rs928033076 (chr1:182843290 A>T) on gnomAD

I downloaded the VCF file with: wget https://datasetgnomad.blob.core.windows.net/dataset/release/4.0/vcf/genomes/gnomad.genomes.v4.0.sites.chr1.vcf.bgz

On the browser, the grpmax (previously popmax) is South Asian, with a MAF value of 0.002102.

However, in the VCF file, grpmax is Non-Finnish European (NFE), and South Asian has 0 MAF (?):

chr1	182843290	rs928033076	A	T	.	AS_VQSR	AC=6;AN=145434;AF=4.12558e-05;grpmax=nfe;fafmax_faf95_max=1.192e-05;fafmax_faf95_max_gen_anc=nfe;AC_XX=2;AF_XX=2.67816e-05;AN_XX=74678;nhomalt_XX=0;AC_XY=4;AF_XY=5.65323e-05;AN_XY=70756;nhomalt_XY=0;nhomalt=0;AC_afr_XX=0;AF_afr_XX=0;AN_afr_XX=20428;nhomalt_afr_XX=0;AC_afr_XY=0;AF_afr_XY=0;AN_afr_XY=17828;nhomalt_afr_XY=0;AC_afr=0;AF_afr=0;AN_afr=38256;nhomalt_afr=0;AC_ami_XX=0;AF_ami_XX=0;AN_ami_XX=460;nhomalt_ami_XX=0;AC_ami_XY=0;AF_ami_XY=0;AN_ami_XY=440;nhomalt_ami_XY=0;AC_ami=0;AF_ami=0;AN_ami=900;nhomalt_ami=0;AC_amr_XX=0;AF_amr_XX=0;AN_amr_XX=6572;nhomalt_amr_XX=0;AC_amr_XY=0;AF_amr_XY=0;AN_amr_XY=8204;nhomalt_amr_XY=0;AC_amr=0;AF_amr=0;AN_amr=14776;nhomalt_amr=0;AC_asj_XX=0;AF_asj_XX=0;AN_asj_XX=1854;nhomalt_asj_XX=0;AC_asj_XY=0;AF_asj_XY=0;AN_asj_XY=1592;nhomalt_asj_XY=0;AC_asj=0;AF_asj=0;AN_asj=3446;nhomalt_asj=0;AC_eas_XX=0;AF_eas_XX=0;AN_eas_XX=2248;nhomalt_eas_XX=0;AC_eas_XY=0;AF_eas_XY=0;AN_eas_XY=2858;nhomalt_eas_XY=0;AC_eas=0;AF_eas=0;AN_eas=5106;nhomalt_eas=0;AC_fin_XX=1;AF_fin_XX=0.000474834;AN_fin_XX=2106;nhomalt_fin_XX=0;AC_fin_XY=2;AF_fin_XY=0.000286287;AN_fin_XY=6986;nhomalt_fin_XY=0;AC_fin=3;AF_fin=0.00032996;AN_fin=9092;nhomalt_fin=0;AC_mid_XX=0;AF_mid_XX=0;AN_mid_XX=142;nhomalt_mid_XX=0;AC_mid_XY=0;AF_mid_XY=0;AN_mid_XY=144;nhomalt_mid_XY=0;AC_mid=0;AF_mid=0;AN_mid=286;nhomalt_mid=0;AC_nfe_XX=1;AF_nfe_XX=2.58184e-05;AN_nfe_XX=38732;nhomalt_nfe_XX=0;AC_nfe_XY=2;AF_nfe_XY=7.12251e-05;AN_nfe_XY=28080;nhomalt_nfe_XY=0;AC_nfe=3;AF_nfe=4.49021e-05;AN_nfe=66812;nhomalt_nfe=0;AC_raw=30;AF_raw=0.000202454;AN_raw=148182;nhomalt_raw=0;AC_remaining_XX=0;AF_remaining_XX=0;AN_remaining_XX=1010;nhomalt_remaining_XX=0;AC_remaining_XY=0;AF_remaining_XY=0;AN_remaining_XY=1020;nhomalt_remaining_XY=0;AC_remaining=0;AF_remaining=0;AN_remaining=2030;nhomalt_remaining=0;AC_sas_XX=0;AF_sas_XX=0;AN_sas_XX=1126;nhomalt_sas_XX=0;AC_sas_XY=0;AF_sas_XY=0;AN_sas_XY=3604;nhomalt_sas_XY=0;AC_sas=0;AF_sas=0;AN_sas=4730;nhomalt_sas=0;A
C_joint_XX=543;AF_joint_XX=0.000770458;AN_joint_XX=704776;nhomalt_joint_XX=0;AC_joint_XY=682;AF_joint_XY=0.000995214;AN_joint_XY=685280;nhomalt_joint_XY=0;AC_joint=1225;AF_joint=0.000881259;AN_joint=1390056;nhomalt_joint=0;AC_joint_afr_XX=18;AF_joint_afr_XX=0.000515671;AN_joint_afr_XX=34906;nhomalt_joint_afr_XX=0;AC_joint_afr_XY=4;AF_joint_afr_XY=0.000138677;AN_joint_afr_XY=28844;nhomalt_joint_afr_XY=0;AC_joint_afr=22;AF_joint_afr=0.000345098;AN_joint_afr=63750;nhomalt_joint_afr=0;AC_joint_ami_XX=0;AF_joint_ami_XX=0;AN_joint_ami_XX=460;nhomalt_joint_ami_XX=0;AC_joint_ami_XY=0;AF_joint_ami_XY=0;AN_joint_ami_XY=440;nhomalt_joint_ami_XY=0;AC_joint_ami=0;AF_joint_ami=0;AN_joint_ami=900;nhomalt_joint_ami=0;AC_joint_amr_XX=23;AF_joint_amr_XX=0.00118447;AN_joint_amr_XX=19418;nhomalt_joint_amr_XX=0;AC_joint_amr_XY=32;AF_joint_amr_XY=0.00173819;AN_joint_amr_XY=18410;nhomalt_joint_amr_XY=0;AC_joint_amr=55;AF_joint_amr=0.00145395;AN_joint_amr=37828;nhomalt_joint_amr=0;AC_joint_asj_XX=22;AF_joint_asj_XX=0.00191205;AN_joint_asj_XX=11506;nhomalt_joint_asj_XX=0;AC_joint_asj_XY=22;AF_joint_asj_XY=0.00188034;AN_joint_asj_XY=11700;nhomalt_joint_asj_XY=0;AC_joint_asj=44;AF_joint_asj=0.00189606;AN_joint_asj=23206;nhomalt_joint_asj=0;AC_joint_eas_XX=28;AF_joint_eas_XX=0.00144838;AN_joint_eas_XX=19332;nhomalt_joint_eas_XX=0;AC_joint_eas_XY=40;AF_joint_eas_XY=0.00220629;AN_joint_eas_XY=18130;nhomalt_joint_eas_XY=0;AC_joint_eas=68;AF_joint_eas=0.00181517;AN_joint_eas=37462;nhomalt_joint_eas=0;AC_joint_fin_XX=33;AF_joint_fin_XX=0.0013668;AN_joint_fin_XX=24144;nhomalt_joint_fin_XX=0;AC_joint_fin_XY=35;AF_joint_fin_XY=0.001265;AN_joint_fin_XY=27668;nhomalt_joint_fin_XY=0;AC_joint_fin=68;AF_joint_fin=0.00131244;AN_joint_fin=51812;nhomalt_joint_fin=0;AC_joint_mid_XX=4;AF_joint_mid_XX=0.00189573;AN_joint_mid_XX=2110;nhomalt_joint_mid_XX=0;AC_joint_mid_XY=3;AF_joint_mid_XY=0.0010989;AN_joint_mid_XY=2730;nhomalt_joint_mid_XY=0;AC_joint_mid=7;AF_joint_mid=0.00144628;AN_joint_mid=4840;nhomalt_joint_
mid=0;AC_joint_nfe_XX=346;AF_joint_nfe_XX=0.000631391;AN_joint_nfe_XX=547996;nhomalt_joint_nfe_XX=0;AC_joint_nfe_XY=409;AF_joint_nfe_XY=0.00080911;AN_joint_nfe_XY=505494;nhomalt_joint_nfe_XY=0;AC_joint_nfe=755;AF_joint_nfe=0.000716666;AN_joint_nfe=1053490;nhomalt_joint_nfe=0;AC_joint_raw=14261;AF_joint_raw=0.00891116;AN_joint_raw=1600352;nhomalt_joint_raw=9;AC_joint_remaining_XX=35;AF_joint_remaining_XX=0.00127532;AN_joint_remaining_XX=27444;nhomalt_joint_remaining_XX=0;AC_joint_remaining_XY=35;AF_joint_remaining_XY=0.00142057;AN_joint_remaining_XY=24638;nhomalt_joint_remaining_XY=0;AC_joint_remaining=70;AF_joint_remaining=0.00134403;AN_joint_remaining=52082;nhomalt_joint_remaining=0;AC_joint_sas_XX=34;AF_joint_sas_XX=0.00194731;AN_joint_sas_XX=17460;nhomalt_joint_sas_XX=0;AC_joint_sas_XY=102;AF_joint_sas_XY=0.00215983;AN_joint_sas_XY=47226;nhomalt_joint_sas_XY=0;AC_joint_sas=136;AF_joint_sas=0.00210246;AN_joint_sas=64686;nhomalt_joint_sas=0;AC_grpmax=3;AF_grpmax=4.49021e-05;AN_grpmax=66812;nhomalt_grpmax=0;grpmax_joint=sas;AC_grpmax_joint=136;AF_grpmax_joint=0.00210246;AN_grpmax_joint=64686;nhomalt_grpmax_joint=0;faf95=1.781e-05;faf95_afr=0;faf95_amr=0;faf95_eas=0;faf95_nfe=1.192e-05;faf95_sas=0;faf99=1.173e-05;faf99_afr=0;faf99_amr=0;faf99_eas=0;faf99_nfe=6.31e-06;faf99_sas=0;fafmax_faf99_max=6.31e-06;fafmax_faf99_max_gen_anc=nfe;faf95_joint=0.00083983;faf95_joint_afr=0.00023311;faf95_joint_amr=0.00114699;faf95_joint_eas=0.00146817;faf95_joint_nfe=0.00067373;faf95_joint_sas=0.00181403;faf99_joint=0.00082327;faf99_joint_afr=0.00019695;faf99_joint_amr=0.00103702;faf99_joint_eas=0.00134183;faf99_joint_nfe=0.00065673;faf99_joint_sas=0.00170573;fafmax_faf95_max_joint=0.00181403;fafmax_faf95_max_gen_anc_joint=sas;fafmax_faf99_max_joint=0.00170573;fafmax_faf99_max_gen_anc_joint=sas;fafmax_data_type_joint=both;age_hist_het_bin_freq=0|0|0|0|1|1|0|1|1|0;age_hist_het_n_smaller=0;age_hist_het_n_larger=0;age_hist_hom_bin_freq=0|0|0|0|0|0|0|0|0|0;age_hist_hom_n_smaller=0;age_hi
st_hom_n_larger=0;FS=24.7026;MQ=59.9032;MQRankSum=-0.037;QUALapprox=9191;QD=2.79192;ReadPosRankSum=-0.138;SOR=2.44;VarDP=3292;AS_FS=25.9361;AS_MQ=59.9105;AS_MQRankSum=-0.033;AS_pab_max=1;AS_QUALapprox=7978;AS_QD=2.52708;AS_ReadPosRankSum=-0.156;AS_SB_TABLE=1393,1245|486,129;AS_SOR=2.50242;AS_VarDP=3157;inbreeding_coeff=-0.000202495;AS_culprit=AS_MQ;AS_VQSLOD=-3.367;negative_train_site;allele_type=snv;n_alt_alleles=4;variant_type=mixed;was_mixed;gq_hist_alt_bin_freq=0|0|0|0|0|0|0|2|1|0|0|0|0|0|0|0|0|0|0|3;gq_hist_all_bin_freq=0|0|0|0|29568|16178|8982|5312|4101|2562|1607|1335|855|588|507|334|227|164|125|225;dp_hist_alt_bin_freq=0|0|3|1|0|0|1|0|0|0|0|0|0|1|0|0|0|0|0|0;dp_hist_alt_n_larger=0;dp_hist_all_bin_freq=0|0|373|2552|7001|11100|30426|18041|2183|502|252|125|57|24|15|9|6|1|0|1;dp_hist_all_n_larger=2;ab_hist_alt_bin_freq=0|0|0|0|2|2|0|0|0|0|2|0|0|0|0|0|0|0|0|0;cadd_raw_score=-0.210192;cadd_phred=0.486;spliceai_ds_max=0.25;pangolin_largest_ds=0.1;phylop=-0.648;VRS_Allele_IDs=ga4gh:VA.VakW3lMd6_j5tFL4-PQ2i6UuhJGUqtRg,ga4gh:VA.apo69Ujd7ygoFhUIfwl9ewF-t77Jrf5o;VRS_Starts=182843289,182843289;VRS_Ends=182843290,182843290;VRS_States=A,T;vep=T|splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant|LOW|DHX9|ENSG00000135829|Transcript|ENST00000367549|protein_coding||2/27|ENST00000367549.4:c.112-4A>T|||||||1||1||SNV|HGNC|HGNC:2750|YES|NM_001357.5||1|P1|CCDS41444.1|ENSP00000356520|Q08211-1|Ensembl|||||||||||||||,T|splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant|LOW|DHX9|ENSG00000135829|Transcript|ENST00000483416|processed_transcript||2/5|ENST00000483416.1:n.336-4A>T|||||||1||1||SNV|HGNC|HGNC:2750||||5|||||Ensembl|||||||||||||||,T|upstream_gene_variant|MODIFIER||ENSG00000287808|Transcript|ENST00000661321|lncRNA||||||||||1|3729|-1||SNV|||YES||||||||Ensembl|||||||||||||||,T|splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant|LOW|DHX9|1660|Transcript|NM_001357.5|protein_coding||2/27|NM_00
1357.5:c.112-4A>T|||||||1||1||SNV|EntrezGene|HGNC:2750|YES|ENST00000367549.4|||||NP_001348.2||RefSeq|||||||||||||||,T|splice_region_variant&splice_polypyrimidine_tract_variant&intron_variant&non_coding_transcript_variant|LOW|DHX9|1660|Transcript|NR_033302.2|misc_RNA||2/28|NR_033302.2:n.244-4A>T|||||||1||1||SNV|EntrezGene|HGNC:2750|||||||||RefSeq|||||||||||||||,T|upstream_gene_variant|MODIFIER|LOC647070|647070|Transcript|NR_148933.1|lncRNA||||||||||1|3699|-1||SNV|EntrezGene||YES||||||||RefSeq|||||||||||||||

I wonder if this is just a clerical error?

Hi,

0.002102 is the AF for the South Asian group considering both exomes and genomes. The download link you reference is for the genome data only. In the genome data, the South Asian group has an AC of 0 and AN of 4730, and the European (Finnish) group has the highest AF (with an AF of 0.0003300) but is excluded from grpmax, and European (non-Finnish) has the next highest AF. There are checkboxes for whether or not to include “Exomes”, “Genomes”, or both in the “Genetic Ancestry Group Frequencies” table displayed on the browser.
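To make the arithmetic concrete, here is a minimal sketch that recomputes grpmax from the AC/AN pairs in the record above, once for the genome-only fields and once for the joint fields. The helper names (`parse_info`, `grpmax`) are hypothetical, and the list of groups eligible for grpmax is an assumption inferred from the `faf95_*` fields present in the record (afr, amr, eas, nfe, sas), not the official gnomAD implementation.

```python
# Assumption: groups eligible for grpmax, inferred from the faf95_* fields
# in the record above. Finnish (fin) is excluded, as noted in the reply.
GRPMAX_GROUPS = ("afr", "amr", "eas", "nfe", "sas")


def parse_info(info):
    """Split a VCF INFO string into a dict; flag-only keys map to True."""
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def grpmax(fields, joint=False):
    """Return (group, AF) for the eligible group with the highest AC/AN."""
    infix = "joint_" if joint else ""
    best = None
    for group in GRPMAX_GROUPS:
        ac = int(fields.get(f"AC_{infix}{group}", 0))
        an = int(fields.get(f"AN_{infix}{group}", 0))
        if ac == 0 or an == 0:
            continue  # no alternate alleles observed, or no called alleles
        af = ac / an
        if best is None or af > best[1]:
            best = (group, af)
    return best


# A subset of the INFO fields copied from the record in the question.
info = (
    "AC_fin=3;AN_fin=9092;AC_nfe=3;AN_nfe=66812;AC_sas=0;AN_sas=4730;"
    "AC_joint_nfe=755;AN_joint_nfe=1053490;AC_joint_sas=136;AN_joint_sas=64686"
)
fields = parse_info(info)
genome_group, genome_af = grpmax(fields)             # genome-only fields
joint_group, joint_af = grpmax(fields, joint=True)   # exome + genome fields
print(genome_group, genome_af)
print(joint_group, joint_af)
```

With these inputs the genome-only call picks nfe (3/66812) and the joint call picks sas (136/64686), matching `grpmax=nfe` and `grpmax_joint=sas` in the record.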

Thanks! Is there a plan to have both WES and Genome VCF files combined in a single VCF?

Hi,

We do not plan to release combined VCF files, but we do have joint (exome + genome) information available for some annotations. In the Hail Tables, these annotations are “joint_freq”, “joint_grpmax”, “joint_faf”, and “joint_fafmax”. In the VCFs, these annotations contain the term “joint”, for example:
##INFO=<ID=AN_joint,Number=1,Type=Integer,Description="Total number of alleles in joint subset">
##INFO=<ID=AN_joint_afr_XX,Number=1,Type=Integer,Description="Total number of alleles in XX samples of African/African-American ancestry in joint subset">
##INFO=<ID=fafmax_faf95_max_joint,Number=A,Type=Float,Description="Maximum filtering allele frequency (using Poisson 95% CI) across genetic_ancestry groups in joint subset">
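One way to list the joint annotations in a downloaded file is to scan its `##INFO` header lines for IDs that contain `joint` as an underscore-separated token. The following is only a sketch: `joint_info_ids` is a hypothetical helper, and the `AN_sas` header line uses a placeholder description because the thread does not quote it.

```python
import re

# Captures the ID of a "##INFO=<ID=...,...>" header line.
INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def joint_info_ids(header_lines):
    """Return INFO IDs that contain 'joint' as an underscore-separated token."""
    ids = []
    for line in header_lines:
        match = INFO_ID.match(line)
        if match and "joint" in match.group(1).split("_"):
            ids.append(match.group(1))
    return ids


header = [
    '##INFO=<ID=AN_joint,Number=1,Type=Integer,Description="Total number of alleles in joint subset">',
    # Placeholder description: this line is illustrative, not quoted from the file.
    '##INFO=<ID=AN_sas,Number=1,Type=Integer,Description="placeholder">',
    '##INFO=<ID=fafmax_faf95_max_joint,Number=A,Type=Float,Description="Maximum filtering allele frequency (using Poisson 95% CI) across genetic_ancestry groups in joint subset">',
]
joint_ids = joint_info_ids(header)
print(joint_ids)
```

For a real file, the header lines could be read with `gzip.open(path, "rt")`, since bgzip output is gzip-compatible, stopping at the first line that does not start with `##`.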

We were able to create a “joint” track by subsetting the exome and genome VCFs to only the joint fields, joining them, and then de-duplicating the records that appear in both.
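The approach described above could be sketched roughly as follows. This is not the actual Golden Helix pipeline: the function names are hypothetical, records are represented as plain tuples rather than parsed from files, and it assumes that a variant present in both files carries identical joint values in each, so keeping the first copy seen is enough.

```python
from itertools import chain


def joint_only(info):
    """Keep only INFO items whose key has 'joint' as an underscore-separated token."""
    kept = [
        item for item in info.split(";")
        if "joint" in item.split("=", 1)[0].split("_")
    ]
    return ";".join(kept)


def build_joint_track(exome_records, genome_records):
    """Merge (chrom, pos, ref, alt, info) records, one entry per variant.

    Assumption: joint fields are identical in both files for a shared
    variant, so the first copy encountered is kept and later ones dropped.
    """
    track = {}
    for chrom, pos, ref, alt, info in chain(exome_records, genome_records):
        track.setdefault((chrom, pos, ref, alt), joint_only(info))
    return track


# Toy records: the first variant uses values from the record in the question;
# the second variant and the exome-only AC/AN values are made up for illustration.
exomes = [("chr1", 182843290, "A", "T", "AC=1219;AN=1244622;AC_joint=1225;AN_joint=1390056")]
genomes = [
    ("chr1", 182843290, "A", "T", "AC=6;AN=145434;AC_joint=1225;AN_joint=1390056"),
    ("chr1", 182843300, "G", "C", "AC=2;AN=145000;AC_joint=2;AN_joint=145000"),
]
track = build_joint_track(exomes, genomes)
print(track)
```

Keying on (chrom, pos, ref, alt) means multi-allelic sites would need to be split into one record per alternate allele first, which the gnomAD sites VCFs already do.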

This is a great track for annotation and filtering, as it has the breadth of the genomes with the allele frequency range of the exomes, so we are using it for our default population frequency filters (joint_grpmax) going forward.

Some more details in our blog post: https://www.goldenhelix.com/blog/gnomad-v4-released-enhanced-data-and-golden-helix-curation-for-varseq-users/