How are the context-dependent mutation rates calculated for constraint modelling?

agastya · December 6, 2023, 5:42am

Hi, I am trying to create a context-dependent mutation model for constraint modelling. I am using the gnomAD v3 paper as a guide.
I want to understand how the mutation rates are calculated for each trinucleotide context. For example, if there are 500 ACA > AGA mutations and ACA occurs 2000 times in the whole genome, is the mutation rate (ACA > AGA) = 500/2000 or 500/(2000*3)? I assumed *3 because ACA can mutate to 3 different contexts.
Or is it something completely different? I assume I need to normalise the rates, but I don’t know how to go about it. Any help would be appreciated. Thanks.

Siwei_Chen · December 14, 2023, 7:55pm

Hi Agastya, it should be 500/2000, as you are specifically computing the mutation rate for ACA > AGA. Although ACA can mutate to three different contexts, that does not affect the mutation rate of ACA > AGA. If you wanted to compute the rate for ACA > ATA, you would count the ACA > ATA instances and divide that number by the same 2000.
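
As a rough illustration of that calculation, here is a minimal Python sketch. The function name `context_mutation_rates`, the dictionary layout, and the ACA > ATA count of 300 are made up for the example; only the 500 and 2000 figures come from the question above. It is not the gnomAD pipeline, just the division described in this reply.

```python
def context_mutation_rates(mutation_counts, context_counts):
    """Divide each (ref_context, alt_context) count by the number of
    times ref_context occurs in the genome.

    mutation_counts: dict mapping (ref, alt) -> observed mutations
    context_counts:  dict mapping ref -> occurrences of that context
    """
    rates = {}
    for (ref, alt), n_mutations in mutation_counts.items():
        n_context = context_counts.get(ref, 0)
        if n_context == 0:
            # Context never seen, so no rate can be computed for it.
            continue
        # Same denominator for every alt of a given ref: no division by 3.
        rates[(ref, alt)] = n_mutations / n_context
    return rates


# 500 and 2000 are from the question; 300 is a hypothetical count.
mutation_counts = {("ACA", "AGA"): 500, ("ACA", "ATA"): 300}
context_counts = {"ACA": 2000}

rates = context_mutation_rates(mutation_counts, context_counts)
print(rates[("ACA", "AGA")])  # 0.25
```

Each substitution gets its own numerator but shares the context's denominator, so the three rates for ACA are computed independently of one another.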

agastya · December 15, 2023, 6:01am

Right, thanks, that’s what I have been doing. So, if the number of samples increases (and consequently the number of mutations), the rates also increase. Calculating the rates from just 1000 genomes and then calculating expected variants therefore leads to an underestimation of expected variants compared to the observed variants in my complete dataset (10,000 samples), and consequently a very low constraint.
Presently, I have been using all 10,000 samples to calculate the rates and then the constraint. But of course, that overestimates the mutation rates of the CG-rich contexts. How can I transform the rates from 1000 genomes so that they can be applied to 10,000 genomes?
Thanks a lot.
