1LocusSim: SIMULATION OF A LOCUS WITH GENETIC DRIFT, MUTATION AND SELECTION

 

Features
  1. 1LocusSim is a simple and adaptable (mobile-friendly) simulator to visualize the effect of genetic drift, selection and mutation on allele frequency.
  2. It is programmed in Python based on the NumPy library.

 

Contact
If you have any questions, please contact me.

 

 

Simulation of selection
Natural selection
Under the basic model of a biallelic gene (locus) with a deleterious allele, natural selection tends to decrease the frequency of that allele. Therefore, the variance σ^2 of the allele frequency will also decrease. This can be measured by calculating the variance of the allele frequency across a set of populations or replicates (Figure 3). The effect is clear if the population size N is large (the effective population size Ne may be smaller than N because of selection, but it remains relatively large). In contrast, if the population size is small, the deterministic effect of selection is overcome by the randomness generated by genetic drift, so the frequency of the deleterious allele depends more on chance. For example, with 20 populations and 100 generations, Figure 3 compares the effect of selection for N = 10 and N = 1000, with selection coefficient s = 0.1 and dominance h = 0.5.
Figure 3. Effect of selection on allelic frequency. Panels: N = 10 and N = 1000. Parameters: μ=0, s=0.1, h=0.5.
Visually, the much higher variance between lines is clearly visible for N=10. The mean and variance of the allele frequency were 0.35 and 0.23 with N=10, and 0.005 and 0 with N=1000. In the first case, the strong random component of drift masks the effect of selection. In the second case, with N=1000, the effect of genetic drift is negligible relative to the strength of selection.
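The comparison in Figure 3 can be sketched in a few lines of Python. This is a minimal sketch, not the code of 1LocusSim itself. It assumes a Wright-Fisher scheme (selection followed by random sampling of 2N gene copies), genotype fitnesses 1, 1-sh and 1-s, and an initial frequency of 0.5, which the text does not state. The function names select and simulate and the seed are hypothetical, and the standard random module is used instead of NumPy to keep the example dependency-free.

```python
import random
from statistics import mean, pvariance


def select(q, s, h):
    """Frequency of the deleterious allele after one round of selection.

    Assumed genotype fitnesses: AA = 1, Aa = 1 - s*h, aa = 1 - s.
    """
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - s * h) + q * q * (1 - s)
    return (p * q * (1 - s * h) + q * q * (1 - s)) / w_bar


def simulate(n, s, h, q0, generations, replicates, rng):
    """Return the final allele frequency of each replicate population."""
    finals = []
    for _ in range(replicates):
        q = q0
        for _ in range(generations):
            if q == 0.0 or q == 1.0:
                break  # allele lost or fixed; without mutation nothing changes
            q_sel = select(q, s, h)
            # Drift: draw 2N gene copies from the post-selection gene pool.
            copies = sum(rng.random() < q_sel for _ in range(2 * n))
            q = copies / (2 * n)
        finals.append(q)
    return finals


rng = random.Random(1)  # arbitrary seed, used only for reproducibility
summary = {}
for n in (10, 1000):
    finals = simulate(n, s=0.1, h=0.5, q0=0.5,
                      generations=100, replicates=20, rng=rng)
    summary[n] = (mean(finals), pvariance(finals))
    print(f"N={n}: mean={summary[n][0]:.3f}, variance={summary[n][1]:.4f}")
```

The exact numbers depend on the seed and on the assumed initial frequency, so they will not match the figure exactly. The qualitative pattern should hold: a large variance between replicates for N=10 and a variance close to zero for N=1000.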

 

Figure 4 shows the basic equations that govern the effect of selection on allele frequency for a biallelic locus in which one allele is deleterious, with selection coefficient s and dominance h. The frequency in the next generation, q', depends on the frequencies before selection, p and q, and on the selection and dominance coefficients. It is normalized (so that the new frequencies p' and q' add up to 1) by dividing by the mean population fitness W. The change in frequency Δq is simply the difference between the new and the old frequency. The changes in p and q are equal in magnitude and opposite in sign because the total number of alleles is constant, since N is constant.
Figure 4. Equations of the effect of selection on the allelic frequency.
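The equations can also be written as short Python functions. This is a sketch under the assumption that the genotype fitnesses are W_AA = 1, W_Aa = 1-sh and W_aa = 1-s, which is the parameterization used in the first row of Tables 2 and 3 and in the formula for q' quoted in the solution of Exercise 3. The function names are hypothetical.

```python
def mean_fitness(q, s, h):
    """Mean population fitness W for allele frequency q (p = 1 - q)."""
    p = 1.0 - q
    return p * p + 2 * p * q * (1 - s * h) + q * q * (1 - s)


def next_q(q, s, h):
    """Frequency q' of the deleterious allele after one generation of selection."""
    p = 1.0 - q
    return (p * q * (1 - s * h) + q * q * (1 - s)) / mean_fitness(q, s, h)


def next_p(q, s, h):
    """Frequency p' of the other allele after one generation of selection."""
    p = 1.0 - q
    return (p * p + p * q * (1 - s * h)) / mean_fitness(q, s, h)


def delta_q(q, s, h):
    """Change in frequency: new frequency minus old frequency."""
    return next_q(q, s, h) - q


q, s, h = 0.5, 0.1, 0.5
print("W   =", mean_fitness(q, s, h))
print("q'  =", next_q(q, s, h))
print("Δq  =", delta_q(q, s, h))
print("Δp  =", next_p(q, s, h) - (1.0 - q))  # same size as Δq, opposite sign
```

Because both new frequencies are divided by the same W, p' + q' = 1 holds for any valid input, so Δp = -Δq.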
In general, in the absence of genetic drift (large N), with random mating and without mutation or migration, the equations in Figure 4 allow us to predict the evolution of the frequency over any number of generations t. Because s>0 and there is no mutation or migration to reintroduce the deleterious allele, it will be eliminated sooner or later regardless of the value of the dominance coefficient. However, the time it takes to be eliminated does depend on the dominance coefficient. Table 1 shows the equations for the change in frequency of a lethal allele (s=1) after t generations, depending on whether the allele is dominant, additive or recessive.
Table 1. Frequency change equations for a lethal allele
s    h      q_t              Δq
1    1      0 (t>0)          -q
1    0.5    q0/2^t           -q/2
1    0      q0/(1+t·q0)      -q^2/(1+q)
Obviously, if the allele is lethal and dominant (h=1), it will disappear in a single generation (all its carriers die). If the relationship is additive (h=0.5) and the initial frequency is 0.5, we expect that after 5 generations, for example, its frequency will be approximately 0.016 (97% reduction). If it is recessive (h=0), after 5 generations the frequency will still be 0.14 (about 71% reduction). This shows why the strongly deleterious alleles that can be observed in populations are recessive: otherwise they disappear very quickly (Figure 5).
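These values can be reproduced by iterating the selection equation and comparing the result with the closed forms of Table 1. The sketch below assumes genotype fitnesses 1, 1-sh and 1-s and no drift. The function names next_q and iterate are hypothetical.

```python
def next_q(q, s, h):
    """One generation of selection with fitnesses 1, 1 - s*h, 1 - s."""
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - s * h) + q * q * (1 - s)
    return (p * q * (1 - s * h) + q * q * (1 - s)) / w_bar


def iterate(q0, s, h, t):
    """Apply the selection equation t times, starting from frequency q0."""
    q = q0
    for _ in range(t):
        q = next_q(q, s, h)
    return q


q0, t = 0.5, 5
dominant = iterate(q0, 1.0, 1.0, 1)    # lethal dominant: gone after one generation
additive = iterate(q0, 1.0, 0.5, t)    # Table 1: q0 / 2^t
recessive = iterate(q0, 1.0, 0.0, t)   # Table 1: q0 / (1 + t*q0)

print("dominant :", dominant)
print("additive :", additive, "closed form:", q0 / 2 ** t)
print("recessive:", recessive, "closed form:", q0 / (1 + t * q0))
```

With s=1 and h=0.5 the recursion reduces to q' = q/2, and with s=1 and h=0 it reduces to q' = q/(1+q), which is where the two closed forms come from.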
Figure 5. Effect of selection on the allele frequency of a lethal allele after 5 generations. Panels: h = 0.5 and h = 0. Parameters: N=10^4, μ=0, s=1.

 

Overdominance
A special case of dominance relationships is overdominance, in which the heterozygote is fitter than both homozygotes. The classical model defines the fitness of the homozygotes as W_AA = 1-s1 and W_aa = 1-s2, and the fitness of the heterozygote as W_Aa = 1. Under this model the equilibrium frequency is

 

   q_eq = s1 / (s1 + s2).

 

1LocusSim does not allow entering two selection coefficients, but this is not a problem because the classical overdominance model can also be described by the parameters s and h, with values chosen so that the heterozygote has the highest fitness. We distinguish two cases: s>0 with h<0, and s<0 with h>1. Both cases are shown in Tables 2 and 3.
Table 2. Obtaining an overdominance model with s>0 and h<0 (W_Aa > W_AA > W_aa)
W_AA            W_Aa       W_aa
1               1-sh       1-s
1               1+s|h|     1-s
1/(1+s|h|)      1          (1-s)/(1+s|h|)
1-s1            1          1-s2
s1 = s|h|/(1+s|h|)     s2 = s(|h|+1)/(1+s|h|)
q_eq = |h|/(2|h|+1)
Table 3. Obtaining an overdominance model with s<0 and h>1 (W_Aa > W_aa > W_AA)
W_AA            W_Aa       W_aa
1               1-sh       1-s
1               1+|s|h     1+|s|
1/(1+|s|h)      1          (1+|s|)/(1+|s|h)
1-s1            1          1-s2
s1 = |s|h/(1+|s|h)     s2 = |s|(h-1)/(1+|s|h)
q_eq = h/(2h-1)
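The conversion from (s, h) to the classical coefficients (s1, s2) can be sketched in Python as follows. The sketch follows the steps of Tables 2 and 3: divide the three fitnesses 1, 1-sh and 1-s by the heterozygote fitness and read off s1 and s2. The function names are hypothetical.

```python
def classical_coefficients(s, h):
    """Map (s, h) to (s1, s2) by normalizing fitnesses by W_Aa = 1 - s*h."""
    w_het = 1 - s * h
    # Overdominance requires the heterozygote to be fitter than both homozygotes.
    if not (w_het > 1 and w_het > 1 - s):
        raise ValueError("s and h do not describe an overdominance model")
    s1 = 1 - 1 / w_het          # W_AA = 1 - s1 after normalization
    s2 = 1 - (1 - s) / w_het    # W_aa = 1 - s2 after normalization
    return s1, s2


def equilibrium_q(s, h):
    """Equilibrium frequency q_eq = s1 / (s1 + s2) of the classical model."""
    s1, s2 = classical_coefficients(s, h)
    return s1 / (s1 + s2)


for s, h in [(0.1, -0.5), (0.5, -0.5), (1.0, -0.5), (-0.1, 1.25), (-1.0, 1.25)]:
    print(f"s={s}, h={h}: q_eq={equilibrium_q(s, h):.4f}")
```

For a fixed h the printed equilibrium is the same for every s, in agreement with q_eq = |h|/(2|h|+1) for s>0, h<0 and q_eq = h/(2h-1) for s<0, h>1.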
It is interesting to note that under both models the equilibrium frequency does not depend on the selection coefficient s but only on the absolute value of h. This is a rather unintuitive mathematical result, and the simulator is well suited to checking it. We define a scenario with low genetic drift, that is, a population with large N, for example N=10^4, run for 10^4 generations. The initial frequency is 0.5. Table 4 compares the expected equilibrium value with the observed one under both models and with different selection and dominance coefficients. Each simulation is run a couple of times.
Table 4. Overdominance: comparison of expected and observed equilibrium values
Model        s       h        q_eq      q_obs
s>0 h<0      0.1     -0.01    0.0098    0.0
s>0 h<0      0.5     -0.01    0.0098    0.0
s>0 h<0      1       -0.01    0.0098    0.0
s>0 h<0      0.1     -0.1     0.083     0.078-0.105
s>0 h<0      0.5     -0.1     0.083     0.075-0.076
s>0 h<0      1       -0.1     0.083     0.075-0.090
s>0 h<0      0.1     -0.5     0.25      0.25-0.27
s>0 h<0      0.5     -0.5     0.25      0.25-0.26
s>0 h<0      1       -0.5     0.25      0.25-0.25
s>0 h<0      0.1     -100     0.4975    0.4920-0.4930
s>0 h<0      0.5     -100     0.4975    0.4950-0.5010
s>0 h<0      1       -100     0.4975    0.4940-0.4960
s<0 h>1      -0.1    1.05     0.955     0.977-0.954
s<0 h>1      -1      1.05     0.955     0.957-0.946
s<0 h>1      -0.1    1.25     0.83      0.83-0.83
s<0 h>1      -1      1.25     0.83      0.83-0.84
s<0 h>1      -0.1    100      0.5025    0.5020-0.5030
s<0 h>1      -1      100      0.5025    0.4970-0.5010
We can see that the equilibrium predictions work very well. The only exception is the case with h=-0.01, which would need a larger N to reach equilibrium. The table clearly shows that the allele frequency after 10^4 generations depends on h and not on s, as indicated by the equilibrium values derived in Tables 2 and 3.
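The expected column of Table 4 can also be checked without drift by iterating the deterministic selection equation until it settles. This is a sketch under the assumption of genotype fitnesses 1, 1-sh and 1-s. It contains no random sampling, so it cannot reproduce the scatter of the observed values. The function names are hypothetical.

```python
def next_q(q, s, h):
    """One generation of selection with fitnesses 1, 1 - s*h, 1 - s."""
    p = 1.0 - q
    w_bar = p * p + 2 * p * q * (1 - s * h) + q * q * (1 - s)
    return (p * q * (1 - s * h) + q * q * (1 - s)) / w_bar


def deterministic_q(q0, s, h, generations):
    """Allele frequency after the given number of generations, without drift."""
    q = q0
    for _ in range(generations):
        q = next_q(q, s, h)
    return q


def expected_equilibrium(s, h):
    """Equilibrium formulas from Tables 2 (s>0, h<0) and 3 (s<0, h>1)."""
    return abs(h) / (2 * abs(h) + 1) if s > 0 else h / (2 * h - 1)


for s, h in [(0.1, -0.5), (1.0, -0.5), (-0.1, 1.25), (-1.0, 1.25)]:
    observed = deterministic_q(0.5, s, h, 10_000)
    print(f"s={s}, h={h}: expected={expected_equilibrium(s, h):.4f}, "
          f"iterated={observed:.4f}")
```

The selection coefficient s changes how quickly the recursion approaches the equilibrium, but not where it ends up.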

 

Figure 6 shows one of the comparisons from Table 4, h=-0.5, with expected q_eq=0.25. The observed equilibrium is the same, but the variance between generations is lower with less genetic drift (higher Ns), as expected.
Figure 6. Effect of selection with overdominance. Panel A: s=0.1, h=-0.5. Panel B: s=1, h=-0.5. Parameters: N=10^4, μ=0.
Exercises
Use the formulas in Figure 4 and Table 1 to solve the following exercises.

 

Exercise 1
In a population where drift is negligible and in the absence of mutation, how many generations does it take for a lethal allele with an additive effect to halve its frequency? And to reduce it to 1/16 of its initial value?

 

Because it is lethal, the selection coefficient is s=1, and additivity implies h=0.5. From Table 1 we know that q_t = q0/2^t. We require q_t = q0/2, therefore q0/2 = q0/2^t => 1/2 = 1/2^t => t = 1 generation.
For the frequency to be reduced to 1/16 of its initial value we need 1/16 = 1/2^t => 2^t = 16 => t = 4 generations. You can check it by simulating one population with 10^4 individuals, initial frequency 0.8, s=1, h=0.5, for 4 generations.

Exercise 2
In a population where drift is negligible and in the absence of mutation, how many generations does it take for a recessive lethal allele to halve its frequency? Give the number of generations as a function of the initial frequency. Check the results with some simulation.

 

Because it is lethal, the selection coefficient is s=1, and since it is recessive, h=0. From Table 1 we know that q_t = q0/(1+t·q0). We require q_t = q0/2, so that q0/2 = q0/(1+t·q0) => t = 1/q0. That is, the time it takes for a lethal recessive allele to halve its frequency is inversely proportional to the initial frequency: the higher the initial frequency, the faster it decreases. You can check this by simulating a population with 10^4 individuals, s=1, h=0 and testing different cases, for example 10 generations with q0=0.8, q0=0.5 and q0=0.4.
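The same check can be done without drift by counting generations directly. The sketch below assumes the one-generation recursion q' = q/(1+q), which is what the selection equation gives for s=1 and h=0 and which is consistent with the closed form in Table 1. It uses exact fractions so that the comparison with q0/2 is not affected by rounding. The function name is hypothetical.

```python
from fractions import Fraction
from math import ceil


def generations_to_halve(q0):
    """Count whole generations until a recessive lethal falls to q0/2 or below."""
    q, t = q0, 0
    while q > q0 / 2:
        q = q / (1 + q)  # one generation of selection with s=1, h=0
        t += 1
    return t


for q0 in (Fraction(4, 5), Fraction(1, 2), Fraction(2, 5)):
    t = generations_to_halve(q0)
    print(f"q0={float(q0)}: {t} generations, 1/q0={float(1 / q0)}, "
          f"rounded up: {ceil(1 / q0)}")
```

Because generations are whole numbers, the count equals 1/q0 rounded up to the next integer when 1/q0 is not an integer.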

Exercise 3
To study the change in allele frequency caused by selection in one generation, we simulated a deleterious allele with a selection coefficient of 0.5 in 100 populations of size 10^4 and without mutation. The initial frequency q0 of the mutant allele was 0.5 and the frequency obtained after one generation was 3/7. What is the coefficient of dominance? HINT: Use the formula for q' in Figure 4.

 

The frequency after selection q' is equal to [pq(1-sh)+q^2(1-s)]/W. With the values given in the problem, we first calculate W according to the formula indicated in Figure 4 and obtain W=(7-2h)/8. We then calculate the numerator, obtaining (3-h)/8; therefore q'=3/7=(3-h)/(7-2h) => h=0.
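The same calculation can be written for any input by solving the formula for q' for h. Writing x = 1-sh, the equation q'·[p^2 + 2pq·x + q^2(1-s)] = pq·x + q^2(1-s) is linear in x. The sketch below solves it with exact fractions. The function name is hypothetical, and the solution is undefined when q' = 1/2 or s = 0.

```python
from fractions import Fraction


def dominance_from_one_generation(q, q_next, s):
    """Solve q' = [pq(1 - s*h) + q^2(1 - s)] / W for the dominance coefficient h."""
    p = 1 - q
    a = p * p + q * q * (1 - s)  # part of W that does not involve h
    b = p * q
    # x = 1 - s*h, from q_next * (a + 2*b*x) = b*x + q^2 * (1 - s)
    x = (q * q * (1 - s) - q_next * a) / (b * (2 * q_next - 1))
    return (1 - x) / s


h = dominance_from_one_generation(Fraction(1, 2), Fraction(3, 7), Fraction(1, 2))
print("h =", h)
```

With the values of the exercise this returns h = 0, so the deleterious allele is recessive.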

 

 

A. Carvajal-Rodriguez - Departamento de Bioquímica Genética e Inmunología - Universidad de Vigo. (Updated: March 2023)