Mutations (Part 3)
This guide describes some of the more complex options for defining the fitness effects of mutations. Most of the complexity comes from defining how the fitness of a cell with multiple mutations should be calculated.
multi_gene_array=False
This is the default and simplest option.
Each new mutation in a cell combines with the previous mutations without any regard for the gene it came from.
Run a simulation where all new mutations have a fitness of 1.1 These multiply the previous fitness of the cell
import numpy as np
from clone_competition_simulation import (
Parameters,
TimeParameters,
PopulationParameters,
FitnessParameters,
Gene,
MutationGenerator,
FixedValue
)
mut_gen = MutationGenerator(
genes=[Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0)],
combine_mutations='multiply' # This is also the default option
)
np.random.seed(1)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_size_array=np.ones(4)),
fitness=FitnessParameters(
fitness_array=np.linspace(1, 1.3, 4), # Giving the initial clones some different fitness values
mutation_rates=0.1,
mutation_generator=mut_gen,
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000 | 0 | -1 | None |
| 1 | 1 | 0 | 1.100 | 0 | -1 | None |
| 2 | 2 | 0 | 1.200 | 0 | -1 | None |
| 3 | 3 | 0 | 1.300 | 0 | -1 | None |
| 4 | 4 | 0 | 1.320 | 22 | 2 | Gene1 |
| 5 | 5 | 0 | 1.452 | 32 | 4 | Gene1 |
| 6 | 6 | 0 | 1.452 | 39 | 4 | Gene1 |
The fitness of clone_id 4 is 1.32 = 1.2 (the fitness of the parent clone 2) x 1.1 (the fitness of the new mut) The fitness of clone_id 5 is 1.452 = 1.32 (the fitness of the parent clone 4) x 1.1 (the fitness of the new mut)
For comparison with later, this is the raw_fitness_array, used to calculate the overall fitness With multi_gene_array=False it is not very interesting, just one column
print(s.raw_fitness_array)
[[1. ],
[1.1 ],
[1.2 ],
[1.3 ],
[1.32 ],
[1.452],
[1.452]])
Mutation combination options
There are various other options as well as fitness multiplication
from clone_competition_simulation import MutationCombination
print(MutationCombination._member_names_)
['MULTIPLY', 'ADD', 'REPLACE', 'MAX', 'MIN']
For example, to add fitness values
mut_gen = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0)
],
combine_mutations='add' # Using add. Could also use MutationCombination.ADD
)
# The simulation is otherwise the same as before
np.random.seed(1)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_size_array=np.ones(4)),
fitness=FitnessParameters(
fitness_array=np.linspace(1, 1.3, 4), # Giving the initial clones some different fitness values
mutation_rates=0.1,
mutation_generator=mut_gen,
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.0 | 0 | -1 | None |
| 1 | 1 | 0 | 1.1 | 0 | -1 | None |
| 2 | 2 | 0 | 1.2 | 0 | -1 | None |
| 3 | 3 | 0 | 1.3 | 0 | -1 | None |
| 4 | 4 | 0 | 1.3 | 22 | 2 | Gene1 |
| 5 | 5 | 0 | 1.4 | 32 | 4 | Gene1 |
| 6 | 6 | 0 | 1.4 | 39 | 4 | Gene1 |
Note this adds the fitness advantage from the new mutation.
So if the fitness is 1.1, there is a 0.1 advantage over neutral.
The fitness of clone_id 4 is now 1.3 = 1.2 + 0.1
multi_gene_array=True
This allows much more control on the combination of fitness, but is also more complicated.
The mutations from each gene are treated separately, and then combined after.
Mutations within each gene are combined based on MutationGenerator.combine_mutations.
Fitness from all genes is then combined based on MutationGenerator.combine_array.
This is the same simulation we ran before with multi_gene_array=False.
Running a simulation where all new mutations have a fitness of 1.1. These multiply the previous fitness of the cell
mut_gen = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0)
],
multi_gene_array=True,
combine_mutations='multiply', # This is the default option
combine_array='multiply', # This is also the default
)
np.random.seed(1)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_size_array=np.ones(4)),
fitness=FitnessParameters(
fitness_array=np.linspace(1, 1.3, 4), # Giving the initial clones some different fitness values
mutation_rates=0.1,
mutation_generator=mut_gen,
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info())
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000 | 0 | -1 | None |
| 1 | 1 | 0 | 1.100 | 0 | -1 | None |
| 2 | 2 | 0 | 1.200 | 0 | -1 | None |
| 3 | 3 | 0 | 1.300 | 0 | -1 | None |
| 4 | 4 | 0 | 1.320 | 22 | 2 | Gene1 |
| 5 | 5 | 0 | 1.452 | 32 | 4 | Gene1 |
| 6 | 6 | 0 | 1.452 | 39 | 4 | Gene1 |
The clone info results are the same as before
But compare the raw_fitness_array to earlier.
With multi_gene_array=True, this is now a multidimensional array.
Each row is for a clone.
The first column is a fitness from the initial clones (often wild type/neutral fitness).
The next set of columns contain the fitness from each gene.
The total fitness for each clone is calculated from all the values in each row.
Here is it multiplied because MutationGenerator.combine_array='multiply' (any nan values are ignored).
print(s.raw_fitness_array)
[[1. , nan],
[1.1 , nan],
[1.2 , nan],
[1.3 , nan],
[1.2 , 1.1 ],
[1.2 , 1.21],
[1.2 , 1.21]])
To see this along with the clone data, can use the include_raw_fitness argument
print(s.view_clone_info(include_raw_fitness=True))
| clone id | label | fitness | generation born | parent clone id | last gene mutated | Initial clone fitness | Gene1 | |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.000 | 0 | -1 | None | 1.0 | NaN |
| 1 | 1 | 0 | 1.100 | 0 | -1 | None | 1.1 | NaN |
| 2 | 2 | 0 | 1.200 | 0 | -1 | None | 1.2 | NaN |
| 3 | 3 | 0 | 1.300 | 0 | -1 | None | 1.3 | NaN |
| 4 | 4 | 0 | 1.320 | 22 | 2 | Gene1 | 1.2 | 1.10 |
| 5 | 5 | 0 | 1.452 | 32 | 4 | Gene1 | 1.2 | 1.21 |
| 6 | 6 | 0 | 1.452 | 39 | 4 | Gene1 | 1.2 | 1.21 |
Using multiple genes adds extra columns to the raw_fitness_array.
You can combine effects between genes in a different way to combining mutations within a gene.
For example, if further mutations in the same gene have no further effect, we can use
combine_mutations='max', 'replace', or 'min'.
But then you can still add the fitness effects from different genes.
mut_gen = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0),
Gene(name='Gene2', mutation_distribution=FixedValue(1.05), synonymous_proportion=0)
],
multi_gene_array=True,
combine_mutations='replace', # With FixedValue this means further mutations will do nothing
combine_array='add', # Add the effects from the two genes
)
np.random.seed(0)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=6),
fitness=FitnessParameters(
mutation_rates=0.15,
mutation_generator=mut_gen,
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info(include_raw_fitness=True))
| clone id | label | fitness | generation born | parent clone id | last gene mutated | Initial clone fitness | Gene1 | Gene2 | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.00 | 0 | -1 | None | 1.0 | NaN | NaN |
| 1 | 1 | 0 | 1.10 | 8 | 0 | Gene1 | 1.0 | 1.1 | NaN |
| 2 | 2 | 0 | 1.10 | 13 | 0 | Gene1 | 1.0 | 1.1 | NaN |
| 3 | 3 | 0 | 1.05 | 18 | 0 | Gene2 | 1.0 | NaN | 1.05 |
| 4 | 4 | 0 | 1.15 | 25 | 2 | Gene2 | 1.0 | 1.1 | 1.05 |
| 5 | 5 | 0 | 1.15 | 35 | 2 | Gene2 | 1.0 | 1.1 | 1.05 |
| 6 | 6 | 0 | 1.10 | 48 | 0 | Gene1 | 1.0 | 1.1 | NaN |
You can see that the overall fitness of each clone is the sum of fitness advantage from the last three columns.
The options for combination across genes are similar to the options within genes.
priority is an odd case, where only the last mutated gene in the list (or the last epistatic effect) is used.
It might be relevant for cases where the knockout of one gene will override the function of downstream genes.
from clone_competition_simulation import ArrayCombination
print(ArrayCombination._member_names_)
['MULTIPLY', 'ADD', 'MAX', 'MIN', 'PRIORITY']
Epistatic effects
The combination of mutations in different genes may be more complex than simply addition or multiplication.
Defining epistatic effects allows more control of the combinations of fitness.
For example, this can be used for haplosufficient or haploinsufficient genes by defining a separate Gene object for each allele and an epistatic effect for when both alleles are mutated.
from clone_competition_simulation import EpistaticEffect
mut_gen = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0),
Gene(name='Gene2', mutation_distribution=FixedValue(1.05), synonymous_proportion=0)
],
# Add an epistatic effect.
# If both Gene1 and Gene2 are mutated, the clone has a fitness of 3
# This replaces the individual effects of both genes
epistatics=[
EpistaticEffect(
name='Epi1',
gene_names=['Gene1', 'Gene2'],
fitness_distribution=FixedValue(3)
)
],
multi_gene_array=True,
combine_mutations='replace', # With FixedValue this means further mutations will do nothing
)
np.random.seed(0)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(initial_cells=6),
fitness=FitnessParameters(
mutation_rates=0.15,
mutation_generator=mut_gen,
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info(include_raw_fitness=True))
| clone id | label | fitness | generation born | parent clone id | last gene mutated | Initial clone fitness | Gene1 | Gene2 | Epi1 | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.00 | 0 | -1 | None | 1.0 | NaN | NaN | NaN |
| 1 | 1 | 0 | 1.10 | 8 | 0 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 2 | 2 | 0 | 1.10 | 13 | 0 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 3 | 3 | 0 | 1.05 | 18 | 0 | Gene2 | 1.0 | NaN | 1.05 | NaN |
| 4 | 4 | 0 | 3.00 | 25 | 2 | Gene2 | 1.0 | 1.1 | 1.05 | 3.0 |
| 5 | 5 | 0 | 3.00 | 35 | 4 | Gene2 | 1.0 | 1.1 | 1.05 | 3.0 |
| 6 | 6 | 0 | 3.00 | 48 | 4 | Gene1 | 1.0 | 1.1 | 1.05 | 3.0 |
Note that for the last three clones, their fitness is 3 because both genes are mutated.
Checking fitness rules are correct
It can be a complex system for defining the rules for mutation combinations.
To check that the fitness values are what is intended, you can plot the (mean) fitness of the combination of gene mutations.
This assumes that there is at most a single mutation per gene.
Use this function from the MutationGenerator to plot the fitness of each combination of mutations. This is for the epistatic example above.
import matplotlib.pyplot as plt
fitness_combinations = mut_gen.plot_fitness_combinations()
print(f"Fitness combinations: {fitness_combinations}")
plt.show()
Fitness combinations: [1. 1.1 1.05 3. ]

Add an extra gene and change some options and check again For any non-FixedValue distributions of fitness effects, the mean value is used.
from clone_competition_simulation import UniformDist
mut_gen2 = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0),
Gene(name='Gene2', mutation_distribution=FixedValue(1.05), synonymous_proportion=0),
Gene(name='Gene3', mutation_distribution=UniformDist(1, 2), synonymous_proportion=0)],
epistatics=[
EpistaticEffect(
name='Epi1',
gene_names=['Gene1', 'Gene2'],
fitness_distribution=FixedValue(3)
)
],
multi_gene_array=True,
combine_array='multiply'
)
fitness_combinations = mut_gen2.plot_fitness_combinations()
print(f"Fitness combinations: {fitness_combinations}")
plt.show()
Fitness combinations: [1. 1.1 1.05 3. 1.5 1.65 1.575 4.5 ]

Can see the difference using combine_array='max' instead of combine_array='multiply'
mut_gen2 = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0),
Gene(name='Gene2', mutation_distribution=FixedValue(1.05), synonymous_proportion=0),
Gene(name='Gene3', mutation_distribution=UniformDist(1, 2), synonymous_proportion=0)],
epistatics=[
EpistaticEffect(
name='Epi1',
gene_names=['Gene1', 'Gene2'],
fitness_distribution=FixedValue(3)
)
],
multi_gene_array=True,
combine_array='max'
)
fitness_combinations = mut_gen2.plot_fitness_combinations()
print(f"Fitness combinations: {fitness_combinations}")
plt.show()
Fitness combinations: [1. 1.1 1.05 3. 1.5 1.5 1.5 3. ]

Assigning genes to the initial mutations
You can start a simulation with multiple clones and assign a gene to each of them.
This may be useful for lineage tracing style simulations in which a mutant is induced (defined at the start), and then subsequent mutations can appear on top of the initial mutations.
# Have similar epistatic rules as above
# But stop Gene1 from mutating at random by setting weight=0
mut_gen = MutationGenerator(
genes=[
Gene(name='Gene1', mutation_distribution=FixedValue(1.1), synonymous_proportion=0, weight=0),
Gene(name='Gene2', mutation_distribution=FixedValue(1.05), synonymous_proportion=0)
],
epistatics=[
EpistaticEffect(
name='Epi1',
gene_names=['Gene1', 'Gene2'],
fitness_distribution=FixedValue(2)
)
],
multi_gene_array=True,
combine_mutations='replace', # With FixedValue this means further mutations will do nothing
)
# This time, start with some mutations in Gene1.
# Then mutate Gene2, which has the epistatic interaction with Gene1
np.random.seed(1)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(
initial_size_array=[6, 1, 1, 1, 1], # Set up 4 1-cell clones and 6 WT cells
),
fitness=FitnessParameters(
mutation_rates=0.08,
mutation_generator=mut_gen,
# The initial_mutant_gene_array defines the genes to be associated with the gene
# The value is the index of the Gene in MutationGenerator.genes
# Use -1 for the clones with no mutation/associated gene
initial_mutant_gene_array=[-1, 0, 0, 0, 0],
# Also need to define the fitness array or all initial clones will have a fitness=1
# These fitness values do not have to equal the fitness values from the Gene assigned.
fitness_array=[1, 1.1, 1.1, 1.1, 1.1],
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info(include_raw_fitness=True))
| clone id | label | fitness | generation born | parent clone id | last gene mutated | Initial clone fitness | Gene1 | Gene2 | Epi1 | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1.00 | 0 | -1 | None | 1.0 | NaN | NaN | NaN |
| 1 | 1 | 0 | 1.10 | 0 | -1 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 2 | 2 | 0 | 1.10 | 0 | -1 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 3 | 3 | 0 | 1.10 | 0 | -1 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 4 | 4 | 0 | 1.10 | 0 | -1 | Gene1 | 1.0 | 1.1 | NaN | NaN |
| 5 | 5 | 0 | 1.05 | 22 | 0 | Gene2 | 1.0 | NaN | 1.05 | NaN |
| 6 | 6 | 0 | 2.00 | 32 | 4 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
| 7 | 7 | 0 | 2.00 | 39 | 3 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
| 8 | 8 | 0 | 2.00 | 66 | 6 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
| 9 | 9 | 0 | 2.00 | 73 | 8 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
| 10 | 10 | 0 | 2.00 | 83 | 8 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
| 11 | 11 | 0 | 2.00 | 86 | 8 | Gene2 | 1.0 | 1.1 | 1.05 | 2.0 |
The initial clones (generation_born=0) now can have epistatic interactions.
Diminishing returns
As a clone gains more mutations and becomes fitter, there may be less and less room for improvement.
You can apply a transformation of the combined fitness effects of all mutations to introduce diminishing returns
(or other function).
# Run a simulation with loads of mutations that multiply in fitness
mut_gen = MutationGenerator(
genes=[Gene(name='Gene1', mutation_distribution=FixedValue(1.4), synonymous_proportion=0)],
combine_mutations='multiply'
)
np.random.seed(0)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(
initial_cells=1000
),
fitness=FitnessParameters(
mutation_rates=0.1,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info()[-10:])
| clone id | label | fitness | generation born | parent clone id | last gene mutated | |
|---|---|---|---|---|---|---|
| 1000 | 1000 | 0 | 2295.856929 | 99 | 985 | Gene1 |
| 1001 | 1001 | 0 | 426.878854 | 99 | 806 | Gene1 |
| 1002 | 1002 | 0 | 304.913467 | 100 | 860 | Gene1 |
| 1003 | 1003 | 0 | 1639.897806 | 100 | 930 | Gene1 |
| 1004 | 1004 | 0 | 304.913467 | 100 | 841 | Gene1 |
| 1005 | 1005 | 0 | 1639.897806 | 100 | 930 | Gene1 |
| 1006 | 1006 | 0 | 1171.355576 | 100 | 886 | Gene1 |
| 1007 | 1007 | 0 | 1171.355576 | 100 | 891 | Gene1 |
| 1008 | 1008 | 0 | 597.630396 | 100 | 863 | Gene1 |
| 1009 | 1009 | 0 | 2295.856929 | 100 | 985 | Gene1 |
You can see that the fitness values of the clones has become very high!
The final clones have many mutations, each multiplying the fitness by 1.4.
This compound interest of mutations can lead to very high fitness
print(s.get_clone_ancestors(1009))
[1009,985,984,886,848,847,791,790,786,735,620,483,470,464,405,390,346,340,339,319,198,116,115,0,-1]
One (extreme) option to limit to growth of fitness values is to use the 'max' option - for this simulation, this would limit the fitness for clones at 1.4.
However, there is another option that enables diminishing returns to be applied.
The BoundedLogisticFitness class places an upper bound on fitness. Here fitness cannot get above 3.
from clone_competition_simulation import BoundedLogisticFitness
b = BoundedLogisticFitness(3)
plt.plot(np.linspace(0, 10, 100), b.fitness(np.linspace(0, 10, 100)))
plt.xlabel('Raw fitness')
plt.ylabel('Transformed fitness')
plt.show()

The shape of the fitness transformation can also be altered.
b = BoundedLogisticFitness(3)
plt.plot(np.linspace(0, 10, 100), b.fitness(np.linspace(0, 10, 100)))
b2 = BoundedLogisticFitness(3, 10)
plt.plot(np.linspace(0, 10, 100), b2.fitness(np.linspace(0, 10, 100)))
plt.xlabel('Raw fitness')
plt.ylabel('Transformed fitness')
plt.show()

Running a simulation again, but limiting the max fitness.
mut_gen = MutationGenerator(
genes=[Gene(name='Gene1', mutation_distribution=FixedValue(1.4), synonymous_proportion=0)],
combine_mutations='multiply',
mutation_combination_class=BoundedLogisticFitness(3) # This limits fitness to 3
)
np.random.seed(0)
p = Parameters(
algorithm='Moran',
times=TimeParameters(max_time=10, division_rate=1),
population=PopulationParameters(
initial_cells=1000
),
fitness=FitnessParameters(
mutation_rates=0.1,
mutation_generator=mut_gen
)
)
s = p.get_simulator()
s.run_sim()
print(s.view_clone_info(include_raw_fitness=True)[-10:])
| clone id | label | fitness | generation born | parent clone id | last gene mutated | Gene1 | |
|---|---|---|---|---|---|---|---|
| 1000 | 1000 | 0 | 2.926558 | 99 | 964 | Gene1 | 5.378240 |
| 1001 | 1001 | 0 | 2.926558 | 99 | 357 | Gene1 | 5.378240 |
| 1002 | 1002 | 0 | 2.999569 | 100 | 464 | Gene1 | 10.541350 |
| 1003 | 1003 | 0 | 2.926558 | 100 | 716 | Gene1 | 5.378240 |
| 1004 | 1004 | 0 | 2.926558 | 100 | 360 | Gene1 | 5.378240 |
| 1005 | 1005 | 0 | 2.926558 | 100 | 691 | Gene1 | 5.378240 |
| 1006 | 1006 | 0 | 2.222816 | 100 | 586 | Gene1 | 2.744000 |
| 1007 | 1007 | 0 | 2.999994 | 100 | 611 | Gene1 | 14.757891 |
| 1008 | 1008 | 0 | 2.991267 | 100 | 454 | Gene1 | 7.529536 |
| 1009 | 1009 | 0 | 2.999569 | 100 | 953 | Gene1 | 10.541350 |
Now those final clones have their fitness capped.
The raw fitness is also lower, since the fitness reduction also reduces the advantage of the fittest clones
(which through expansion would gain more subclones and become even fitter)