Mutations (Part 1)
Mutations can be added during the course of a simulation.
These may change the cell fitness and therefore the cell behaviour.
There are quite a few mutation-related parameters that can be defined: - the mutation rate - the proportions of synonymous and non-synonymous mutations - the distribution of fitness effects for non-synonymous mutations - how the effects of multiple mutations combine to produce the overall cell fitness
This last point is can get pretty complicated and is covered in more detail in later guides.
This guide describes the basics of using mutations in simulations.
Setting the mutation rate
If mutation_rates is left undefined or set to zero, no mutations occur
from clone_competition_simulation import Parameters, TimeParameters, PopulationParameters
import matplotlib.pyplot as plt
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=1000)
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

This plot is just a single colour, because there is a single clone (the initial clone of 1000 cells)
Setting a non-zero mutation_rates randomly introduces mutations. The value is the probability of a mutation occurring per cell division.
To introduce mutations, we must also define what effect the mutations will have. This is explained further in Controlling the mutation effects below.
from clone_competition_simulation import (
FitnessParameters,
Gene,
MutationGenerator,
NormalDist
)
# We'll add some mutations, so we have to define the fitness of the mutated cells
mutation_generator = MutationGenerator(
genes=[Gene(name="Gene1", mutation_distribution=NormalDist(mean=1.1, var=0.1),
synonymous_proportion=0.5)]
)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=1000),
fitness=FitnessParameters(
mutation_rates=0.01,
mutation_generator=mutation_generator
),
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

The mutations that occur are marked with a X and are followed by the clones they form. Red Xs mark non-synonymous mutations and blue Xs mark synonymous mutations
The mutation rate can vary. Here we set a high mutation rate from time 1 until time 3 and then a lower mutation rate after that
The mutation_rates are set as a list/array of pairs of [start_time, mutation_rate]
The mutation_rate will apply until the next start_time point or until the end of the simulation.
Before the first start_time (if not zero) the mutation_rate is zero.
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=1000),
fitness=FitnessParameters(
mutation_rates=[
[1, 0.1], # At time 1, start a mutation rate of 0.1 per cell division
[4, 0.01] # At time 4, start a mutation rate of 0.01 per cell division. This continues until the end.
],
mutation_generator=mutation_generator
),
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

Can see the higher density of mutations between times 1 and 4.
Controlling the mutation effects
The fitness effects of mutations are defined in two classes: Gene and MutationGenerator.
Gene is used to define the proportion of non-synonymous and synonymous mutations and the distribution of fitness effects of the non-synonymous mutations.
The MutationGenerator is used to define how multiple mutations (which can be from multiple genes) combine to form the overall cell fitness.
In this guide, we will only use the default parameters from the MutationGenerator, which is that all fitness effects
are multiplied together.
More options are described in other guides.
Let's define a new gene
from clone_competition_simulation import Gene, UniformDist
gene1 = Gene(
name='Gene1', # This is the name for the gene.
mutation_distribution=UniformDist(0.5, 1.5), # This defines the fitness effects of non-synonymous mutations
synonymous_proportion=0.4 # This means 40% of the mutations will be synonymous
)
and use that gene in a simulation
from clone_competition_simulation import MutationGenerator
import numpy as np
mut_gen = MutationGenerator(genes=[gene1])
np.random.seed(0)
p = Parameters(
algorithm='Moran2D',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
fitness=FitnessParameters(
mutation_rates=0.01,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000000 | 0 | -1 | None |
| 1 | 1 | 0 | 1.000000 | 15 | 0 | Gene1 |
| 2 | 2 | 0 | 1.000000 | 21 | 0 | Gene1 |
| 3 | 3 | 0 | 1.339029 | 42 | 0 | Gene1 |
| 4 | 4 | 0 | 0.933265 | 45 | 0 | Gene1 |
| 5 | 5 | 0 | 0.942416 | 48 | 0 | Gene1 |
| 6 | 6 | 0 | 0.876017 | 60 | 0 | Gene1 |
| 7 | 7 | 0 | 1.339029 | 62 | 3 | Gene1 |
| 8 | 8 | 0 | 0.942416 | 76 | 5 | Gene1 |
| 9 | 9 | 0 | 0.643681 | 97 | 0 | Gene1 |
The fitness values of the new clones are between 0.5 and 1.5, and the gene mutated is Gene1. Some mutations will be synonymous and will not change the fitness
There are a few pre-written options for the distribution of fitness effects.
Custom distribution classes can be used as long as they have:
- a __call__ function that doesn't take any argument and returns a fitness value
- a get_mean function that returns the mean of the distribution
We can run a simulation with multiple genes showing some of the options already available.
from clone_competition_simulation import NormalDist, ExponentialDist, FixedValue
# One with a normal distribution
gene_norm = Gene(name='GeneNorm', mutation_distribution=NormalDist(mean=0.6, var=0.1), synonymous_proportion=0.5)
# One with an exponential distribution
gene_exp = Gene(name='GeneExp', mutation_distribution=ExponentialDist(mean=1.05, offset=1), synonymous_proportion=0.4)
# One where every non-synonymous mutation has the same fixed value of fitness
gene_fix = Gene(name='GeneFix', mutation_distribution=FixedValue(value=1.01), synonymous_proportion=0.3)
mut_gen = MutationGenerator(genes=[gene1, gene_norm, gene_exp, gene_fix])
np.random.seed(0)
p = Parameters(
algorithm='Moran2D',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
fitness=FitnessParameters(
mutation_rates=0.03,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000000 | 0 | -1 | None |
| 1 | 1 | 0 | 1.010000 | 3 | 0 | GeneFix |
| 2 | 2 | 0 | 1.000000 | 6 | 0 | GeneNorm |
| 3 | 3 | 0 | 1.010000 | 7 | 0 | GeneFix |
| 4 | 4 | 0 | 1.034740 | 7 | 0 | GeneExp |
| 5 | 5 | 0 | 1.010000 | 15 | 0 | GeneFix |
| 6 | 6 | 0 | 0.516414 | 16 | 0 | GeneNorm |
| 7 | 7 | 0 | 1.000000 | 21 | 0 | GeneFix |
| 8 | 8 | 0 | 1.110256 | 27 | 0 | GeneExp |
| 9 | 9 | 0 | 1.000000 | 28 | 0 | GeneNorm |
| 10 | 10 | 0 | 0.505791 | 29 | 0 | GeneNorm |
| 11 | 11 | 0 | 0.510849 | 42 | 10 | GeneFix |
| 12 | 12 | 0 | 1.042249 | 42 | 0 | GeneExp |
| 13 | 13 | 0 | 1.000000 | 44 | 0 | GeneNorm |
| 14 | 14 | 0 | 1.004789 | 44 | 0 | GeneExp |
| 15 | 15 | 0 | 0.994093 | 47 | 0 | Gene1 |
| 16 | 16 | 0 | 1.000000 | 55 | 0 | GeneFix |
| 17 | 17 | 0 | 1.000000 | 58 | 0 | GeneExp |
| 18 | 18 | 0 | 1.010000 | 59 | 0 | GeneFix |
| 19 | 19 | 0 | 1.010000 | 60 | 0 | GeneFix |
| 20 | 20 | 0 | 0.659173 | 61 | 0 | GeneNorm |
| 21 | 21 | 0 | 1.038381 | 61 | 0 | GeneExp |
| 22 | 22 | 0 | 0.552378 | 63 | 16 | GeneNorm |
| 23 | 23 | 0 | 1.010000 | 64 | 0 | GeneFix |
| 24 | 24 | 0 | 1.000000 | 71 | 0 | GeneNorm |
| 25 | 25 | 0 | 1.000000 | 72 | 0 | GeneNorm |
| 26 | 26 | 0 | 1.000000 | 75 | 0 | GeneFix |
| 27 | 27 | 0 | 1.015429 | 77 | 0 | GeneExp |
| 28 | 28 | 0 | 1.298854 | 89 | 0 | Gene1 |
| 29 | 29 | 0 | 0.606296 | 93 | 0 | GeneNorm |
| 30 | 30 | 0 | 0.512835 | 95 | 0 | GeneNorm |
| 31 | 31 | 0 | 1.049568 | 100 | 0 | GeneExp |
To give different genes different mutation rates you can use the weight argument in Gene.
The weights can be any non-negative number.
The overall mutation rate is controlled by the Parameters mutations_rates argument.
The Gene weight controls the relative mutation rate of the genes.
By default weight=1, so all genes will have the same mutation rate.
# Two genes, where gene1 is 3 times more likely to be mutated than gene2
gene1 = Gene(name='Gene1', mutation_distribution=UniformDist(0.5, 1.1), synonymous_proportion=0.4, weight=3)
gene2 = Gene(name='Gene2', mutation_distribution=UniformDist(1.1, 1.5), synonymous_proportion=0.4, weight=1)
mut_gen = MutationGenerator(genes=[gene1, gene2])
p = Parameters(
algorithm='Moran2D',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
fitness=FitnessParameters(
mutation_rates=0.1,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info()['last gene mutated'].value_counts())
last gene mutated
Gene1 72
Gene2 23
Name: count, dtype: int64
dN/dS
The use of synonymous and non-synonymous mutations means we can calculate a dN/dS value.
This can be done for the entire simulation or for each gene.
gene1 = Gene(name='Gene1', mutation_distribution=UniformDist(0.5, 1.1), synonymous_proportion=0.4, weight=3)
gene2 = Gene(name='Gene2', mutation_distribution=UniformDist(1.1, 1.5), synonymous_proportion=0.5, weight=1)
mut_gen = MutationGenerator(genes=[gene1, gene2])
p = Parameters(
algorithm='Moran2D',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=62500, cell_in_own_neighbourhood=False),
fitness=FitnessParameters(
mutation_rates=0.01,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
Just running the sim.get_dnds function returns the overall dN/dS ratio at the end of the simulation
print(s.get_dnds())
1.0231533250016447
Adding a time gets the dN/dS at that time point
print(s.get_dnds(t=5))
0.9779496754690142
You can also get the dN/dS for each gene
print(s.get_dnds(gene='Gene1'), s.get_dnds(gene='Gene2'))
(0.9395973154362416, 1.3317757009345794)
For sequencing experiments, you usually cannot detect very small clones. You can calculate the dN/dS for larger clones only
print(s.get_dnds(gene='Gene2', min_size=5))
1.8591549295774648
You can plot the dN/dS over time. The early time points here are not reliable due to low sample size
s.plot_dnds()
plt.show()

s.plot_dnds(gene='Gene1')
s.plot_dnds(gene='Gene2', clear_previous=False)
plt.show()

s.plot_dnds(gene='Gene2', min_size=5)
plt.show()
