I was looking at the Genetic Ancestry blog post and noticed that “Hawaiian” was among the labels provided by different sequencing projects. After reading the documentation and checking the data myself, it appears that the main dataset only contains the imputed genetic ancestry labels (e.g., afr, ami) and that the HGDP and 1KG subsets contain their own respective ancestry labels. None of these include the “Hawaiian” label. I wanted to confirm that, beyond the imputed labels in the main dataset and the labels in those subsets, there is no further labeled ancestry data in the v4 release that might include Hawaiian. In other words, is it correct that the label came from another sequencing project and was used in aggregating and harmonizing the genetic ancestry labels, but was not kept in the final dataset?
Are there any updates on this?
Some samples with the Hawaiian label were aggregated and used to help infer samples belonging to the Admixed American group. The genetic ancestry groups that we have inferred for v4 are listed in the leftmost column of the “Genetic Ancestries in v4” table in the genetic ancestry blog post. We would consider “Hawaiian” a genetic ancestry subgroup. We inferred a limited set of genetic ancestry subgroups for gnomAD v2 (such as the “Japanese” subgroup within the “East Asian” group), but we have not yet evaluated the potential of inferring genetic ancestry subgroups in v4, so you will not find any samples inferred as Hawaiian.
Regards,
Kristen