Blocked when using API to get AF

Hi,

I’m trying to annotate a small dataset (several hundred variants). I’m using the API, but after about 10 queries I seem to get blocked (I’m using gnomadR, so I don’t see the error message). Is there a way to submit a set of variant_ids and get an AF back? Currently I’m iterating one by one, which surely causes needless network traffic and compute. What’s the “right” way to do this (other than spinning up my own cluster)?

I’m not very familiar with gnomadR, but the Python script below fetches data for a list of variants while staying under the rate limit. Note that our API is built to serve the browser’s front end, where the typical use case is looking at variants one at a time or within a specific gene or region, so we don’t currently support batch queries for an arbitrary list of variants. If you’re interested in fetching variants by gene or region, I can provide examples for those cases. We recognize the limitations of the current system and hope to expand the API’s functionality in the future.

# pip install requests
import requests
import time

# GraphQL endpoint
url = "https://gnomad.broadinstitute.org/api"

# The GraphQL query template
query = """
query GnomadVariant($variantId: String!, $datasetId: DatasetId!) {
  variant(variantId: $variantId, dataset: $datasetId) {
    variant_id
    reference_genome
    chrom
    pos
    ref
    alt
    colocated_variants
    faf95_joint {
      popmax
      popmax_population
    }
    coverage {
      exome {
        mean
        over_20
      }
      genome {
        mean
        over_20
      }
    }
    exome {
      ac
      an
      ac_hemi
      ac_hom
      faf95 {
        popmax
        popmax_population
      }
      filters
      populations {
        id
        ac
        an
        ac_hemi
        ac_hom
      }
    }
    genome {
      ac
      an
      ac_hemi
      ac_hom
      faf95 {
        popmax
        popmax_population
      }
      filters
      populations {
        id
        ac
        an
        ac_hemi
        ac_hom
      }
    }
    flags
    lof_curations {
      gene_id
      gene_symbol
      verdict
      flags
      project
    }
    rsids
    transcript_consequences {
      domains
      gene_id
      gene_version
      gene_symbol
      hgvs
      hgvsc
      hgvsp
      is_canonical
      is_mane_select
      is_mane_select_version
      lof
      lof_flags
      lof_filter
      major_consequence
      polyphen_prediction
      sift_prediction
      transcript_id
      transcript_version
    }
    in_silico_predictors {
      id
      value
      flags
    }
  }
}
"""


# Function to query for a single variant
def query_variant(variant_id, dataset_id="gnomad_r4"):
    # Query variables
    variables = {"variantId": variant_id, "datasetId": dataset_id}

    # HTTP POST request; the timeout keeps a stalled connection from hanging the loop
    response = requests.post(
        url, json={"query": query, "variables": variables}, timeout=30
    )
    if response.status_code != 200:
        raise RuntimeError(
            f"Query failed with status code {response.status_code}. {response.text}"
        )

    # Note: a GraphQL response may carry an "errors" field alongside or instead
    # of "data", so check for it before relying on the variant fields
    return response.json()


# List of variants to query
variants = ["1-55052746-GT-G", "1-55058620-TG-T"]  # Add your variants here

# Results list
results = []

# Loop through each variant and query
for variant in variants:
    result = query_variant(variant)
    results.append(result)
    time.sleep(6)  # Sleep to respect the 10 queries per minute limit

# results now contains the response for each variant
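
The query above returns allele counts (ac) and allele numbers (an) rather than an allele frequency, so AF has to be computed as AC / AN from each response. Here is a minimal sketch of that step. It assumes each entry in results has the data → variant → exome/genome shape requested by the query. The helper name allele_frequencies is hypothetical (it is not part of the gnomAD API), the pooled exome + genome value is just one way to combine the two data types, and the counts in the example are made up for illustration.

```python
def allele_frequencies(result):
    """Compute AF = AC / AN per data type from one query_variant() response.

    Hypothetical helper: assumes the response shape produced by the query above.
    Returns None if the response has no variant (not found, or errors only).
    """
    variant = (result.get("data") or {}).get("variant")
    if variant is None:
        return None

    def af(ac, an):
        # AN of 0 or a missing value means there is no frequency to report
        return ac / an if an else None

    # A variant may be present in only one of the two data types
    exome = variant.get("exome") or {}
    genome = variant.get("genome") or {}
    total_ac = (exome.get("ac") or 0) + (genome.get("ac") or 0)
    total_an = (exome.get("an") or 0) + (genome.get("an") or 0)

    return {
        "variant_id": variant.get("variant_id"),
        "exome_af": af(exome.get("ac"), exome.get("an")),
        "genome_af": af(genome.get("ac"), genome.get("an")),
        # Pooled over exome + genome counts (an assumption, see the note above)
        "combined_af": af(total_ac, total_an),
    }


# Illustrative response with made-up counts, shaped like the API output
example = {
    "data": {
        "variant": {
            "variant_id": "1-55052746-GT-G",
            "exome": {"ac": 5, "an": 1000},
            "genome": None,
        }
    }
}
print(allele_frequencies(example))
```

With the script above, something like `afs = [allele_frequencies(r) for r in results]` would give one record per variant, with None for any variant the API did not return.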