Mutations (Part 1)

Mutations can be added during the course of a simulation.
These may change the cell fitness and therefore the cell behaviour.

There are quite a few mutation-related parameters that can be defined: - the mutation rate - the proportions of synonymous and non-synonymous mutations - the distribution of fitness effects for non-synonymous mutations - how the effects of multiple mutations combine to produce the overall cell fitness

This last point is can get pretty complicated and is covered in more detail in later guides.

This guide describes the basics of using mutations in simulations.

Setting the mutation rate

If mutation_rates is left undefined or set to zero, no mutations occur

from clone_competition_simulation import Parameters, TimeParameters, PopulationParameters
import matplotlib.pyplot as plt

p = Parameters(
    algorithm='Moran', 
    times=TimeParameters(max_time=10, division_rate=1), 
    population=PopulationParameters(initial_cells=1000)
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

png

This plot is just a single colour, because there is a single clone (the initial clone of 1000 cells)

Setting a non-zero mutation_rates randomly introduces mutations. The value is the probability of a mutation occurring per cell division.

To introduce mutations, we must also define what effect the mutations will have. This is explained further in Controlling the mutation effects below.

from clone_competition_simulation import (
    FitnessParameters, 
    Gene, 
    MutationGenerator, 
    NormalDist
)

# We'll add some mutations, so we have to define the fitness of the mutated cells
mutation_generator = MutationGenerator(
    genes=[Gene(name="Gene1", mutation_distribution=NormalDist(mean=1.1, var=0.1), 
                synonymous_proportion=0.5)]
)

p = Parameters(
    algorithm='Moran',
    times=TimeParameters(max_time=10, division_rate=1), 
    population=PopulationParameters(initial_cells=1000),
    fitness=FitnessParameters(
        mutation_rates=0.01, 
        mutation_generator=mutation_generator
    ),
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

png

The mutations that occur are marked with a X and are followed by the clones they form. Red Xs mark non-synonymous mutations and blue Xs mark synonymous mutations

The mutation rate can vary. Here we set a high mutation rate from time 1 until time 3 and then a lower mutation rate after that

The mutation_rates are set as a list/array of pairs of [start_time, mutation_rate] The mutation_rate will apply until the next start_time point or until the end of the simulation. Before the first start_time (if not zero) the mutation_rate is zero.

p = Parameters(
    algorithm='Moran',
    times=TimeParameters(max_time=10, division_rate=1), 
    population=PopulationParameters(initial_cells=1000),
    fitness=FitnessParameters(
        mutation_rates=[
            [1, 0.1],  # At time 1, start a mutation rate of 0.1 per cell division
            [4, 0.01]  # At time 4, start a mutation rate of 0.01 per cell division. This continues until the end.
        ], 
        mutation_generator=mutation_generator
    ),
)
s = p.get_simulator()
s.run_sim()
s.muller_plot(figsize=(5, 5))
plt.show()

png

Can see the higher density of mutations between times 1 and 4.

Controlling the mutation effects

The fitness effects of mutations are defined in two classes: Gene and MutationGenerator.

Gene is used to define the proportion of non-synonymous and synonymous mutations and the distribution of fitness effects of the non-synonymous mutations.

The MutationGenerator is used to define how multiple mutations (which can be from multiple genes) combine to form the overall cell fitness.

In this guide, we will only use the default parameters from the MutationGenerator, which is that all fitness effects are multiplied together.
More options are described in other guides.

Let's define a new gene

from clone_competition_simulation import Gene, UniformDist

gene1 = Gene(
    name='Gene1',  # This is the name for the gene. 
    mutation_distribution=UniformDist(0.5, 1.5),  # This defines the fitness effects of non-synonymous mutations
    synonymous_proportion=0.4   # This means 40% of the mutations will be synonymous
)

and use that gene in a simulation

from clone_competition_simulation import MutationGenerator
import numpy as np

mut_gen = MutationGenerator(genes=[gene1])
np.random.seed(0)
p = Parameters(
    algorithm='Moran2D',
    times=TimeParameters(max_time=10, division_rate=1),
    population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
    fitness=FitnessParameters(
        mutation_rates=0.01,
        mutation_generator=mut_gen
    )
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())

	clone id	fitness	generation born	parent clone id	last gene mutated
0	0	1.000000	0	-1	None
1	1	1.000000	15	0	Gene1
2	2	1.000000	21	0	Gene1
3	3	1.339029	42	0	Gene1
4	4	0.933265	45	0	Gene1
5	5	0.942416	48	0	Gene1
6	6	0.876017	60	0	Gene1
7	7	1.339029	62	3	Gene1
8	8	0.942416	76	5	Gene1
9	9	0.643681	97	0	Gene1

The fitness values of the new clones are between 0.5 and 1.5, and the gene mutated is Gene1. Some mutations will be synonymous and will not change the fitness

There are a few pre-written options for the distribution of fitness effects.
Custom distribution classes can be used as long as they have: - a __call__ function that doesn't take any argument and returns a fitness value - a get_mean function that returns the mean of the distribution

We can run a simulation with multiple genes showing some of the options already available.

from clone_competition_simulation import NormalDist, ExponentialDist, FixedValue

# One with a normal distribution
gene_norm = Gene(name='GeneNorm', mutation_distribution=NormalDist(mean=0.6, var=0.1), synonymous_proportion=0.5)

# One with an exponential distribution
gene_exp = Gene(name='GeneExp', mutation_distribution=ExponentialDist(mean=1.05, offset=1), synonymous_proportion=0.4)

# One where every non-synonymous mutation has the same fixed value of fitness
gene_fix = Gene(name='GeneFix', mutation_distribution=FixedValue(value=1.01), synonymous_proportion=0.3)

mut_gen = MutationGenerator(genes=[gene1, gene_norm, gene_exp, gene_fix])
np.random.seed(0)
p = Parameters(
    algorithm='Moran2D',
    times=TimeParameters(max_time=10, division_rate=1),
    population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
    fitness=FitnessParameters(
        mutation_rates=0.03,
        mutation_generator=mut_gen
    )
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())

	clone id	fitness	generation born	parent clone id	last gene mutated
0	0	1.000000	0	-1	None
1	1	1.010000	3	0	GeneFix
2	2	1.000000	6	0	GeneNorm
3	3	1.010000	7	0	GeneFix
4	4	1.034740	7	0	GeneExp
5	5	1.010000	15	0	GeneFix
6	6	0.516414	16	0	GeneNorm
7	7	1.000000	21	0	GeneFix
8	8	1.110256	27	0	GeneExp
9	9	1.000000	28	0	GeneNorm
10	10	0.505791	29	0	GeneNorm
11	11	0.510849	42	10	GeneFix
12	12	1.042249	42	0	GeneExp
13	13	1.000000	44	0	GeneNorm
14	14	1.004789	44	0	GeneExp
15	15	0.994093	47	0	Gene1
16	16	1.000000	55	0	GeneFix
17	17	1.000000	58	0	GeneExp
18	18	1.010000	59	0	GeneFix
19	19	1.010000	60	0	GeneFix
20	20	0.659173	61	0	GeneNorm
21	21	1.038381	61	0	GeneExp
22	22	0.552378	63	16	GeneNorm
23	23	1.010000	64	0	GeneFix
24	24	1.000000	71	0	GeneNorm
25	25	1.000000	72	0	GeneNorm
26	26	1.000000	75	0	GeneFix
27	27	1.015429	77	0	GeneExp
28	28	1.298854	89	0	Gene1
29	29	0.606296	93	0	GeneNorm
30	30	0.512835	95	0	GeneNorm
31	31	1.049568	100	0	GeneExp

To give different genes different mutation rates you can use the weight argument in Gene.
The weights can be any non-negative number.

The overall mutation rate is controlled by the Parameters mutations_rates argument.
The Gene weight controls the relative mutation rate of the genes.
By default weight=1, so all genes will have the same mutation rate.

# Two genes, where gene1 is 3 times more likely to be mutated than gene2
gene1 = Gene(name='Gene1', mutation_distribution=UniformDist(0.5, 1.1), synonymous_proportion=0.4, weight=3)
gene2 = Gene(name='Gene2', mutation_distribution=UniformDist(1.1, 1.5), synonymous_proportion=0.4, weight=1)

mut_gen = MutationGenerator(genes=[gene1, gene2])
p = Parameters(
    algorithm='Moran2D',
    times=TimeParameters(max_time=10, division_rate=1),
    population=PopulationParameters(initial_cells=100, cell_in_own_neighbourhood=False),
    fitness=FitnessParameters(
        mutation_rates=0.1,
        mutation_generator=mut_gen
    )
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info()['last gene mutated'].value_counts())

last gene mutated
Gene1    72
Gene2    23
Name: count, dtype: int64

dN/dS

The use of synonymous and non-synonymous mutations means we can calculate a dN/dS value.
This can be done for the entire simulation or for each gene.

gene1 = Gene(name='Gene1', mutation_distribution=UniformDist(0.5, 1.1), synonymous_proportion=0.4, weight=3)
gene2 = Gene(name='Gene2', mutation_distribution=UniformDist(1.1, 1.5), synonymous_proportion=0.5, weight=1)

mut_gen = MutationGenerator(genes=[gene1, gene2])
p = Parameters(
    algorithm='Moran2D',
    times=TimeParameters(max_time=10, division_rate=1),
    population=PopulationParameters(initial_cells=62500, cell_in_own_neighbourhood=False),
    fitness=FitnessParameters(
        mutation_rates=0.01,
        mutation_generator=mut_gen
    )
)
s = p.get_simulator()
s.run_sim()

Just running the sim.get_dnds function returns the overall dN/dS ratio at the end of the simulation

print(s.get_dnds())

1.0231533250016447

Adding a time gets the dN/dS at that time point

print(s.get_dnds(t=5))

0.9779496754690142

You can also get the dN/dS for each gene

print(s.get_dnds(gene='Gene1'), s.get_dnds(gene='Gene2'))

(0.9395973154362416, 1.3317757009345794)

For sequencing experiments, you usually cannot detect very small clones. You can calculate the dN/dS for larger clones only

print(s.get_dnds(gene='Gene2', min_size=5))

1.8591549295774648

You can plot the dN/dS over time. The early time points here are not reliable due to low sample size

s.plot_dnds()
plt.show()

png

s.plot_dnds(gene='Gene1')
s.plot_dnds(gene='Gene2', clear_previous=False)
plt.show()

png

s.plot_dnds(gene='Gene2', min_size=5)
plt.show()

png