Geographical place of origin of samples

Hello to anyone who might be able to help.

I’d like to know whether the geographical place of origin of samples is available in the gnomAD data files or elsewhere. In the blog post:

It says: “Note that while some labels indicate a sample’s geographical origin (e.g., Finland), we do not know how all of these labels were collected or in what context.”

Is geographical origin combined with genetic ancestry available?

The original 1000 Genomes Project had codes that combine the two, as in the following document:

https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/README_populations.md

An excerpt:

CHB   Han Chinese            Han Chinese in Beijing, China
JPT   Japanese               Japanese in Tokyo, Japan
CHS   Southern Han Chinese   Han Chinese South
CDX   Dai Chinese            Chinese Dai in Xishuangbanna, China
KHV   Kinh Vietnamese        Kinh in Ho Chi Minh City, Vietnam
CHD   Denver Chinese         Chinese in Denver, Colorado (pilot 3 only)

So one could compare Chinese in Denver to Chinese in Beijing, for example.

Does gnomAD maintain this anywhere and, if so, where?

Thank you so much in advance for any help.

Hi,
Thank you for your inquiry. Unfortunately, we do not collect the geographical origin of samples as part of our metadata, so that data is not available. The statement you quoted refers to the ancestry/ethnicity labels that data contributors sometimes provide. However, we have data from over 300 different contributors and over 100 studies, each of which provides different types of labels (patient-provided, researcher-provided, country of recruitment), which is why we included that disclaimer.

Please let me know if you have any follow-up questions.
Sam