(See Arrival of the frequent for context)
See also Wright-Fisher model
The Hamming distance (i.e. the number of differing letters, or mutations) d is then distributed binomially:
$$h(d)=\binom{L}{d}\mu^{d}(1-\mu)^{L-d}$$
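As a quick sanity check of the binomial form above, the following minimal sketch compares $h(d)$ with a direct simulation. It assumes that each of the $L$ sites mutates independently with probability $\mu$ and that a mutated site always changes letter; the function names and parameter values are illustrative only.

```python
import random
from math import comb


def h(d, L, mu):
    """Binomial probability that exactly d of L sites are mutated."""
    return comb(L, d) * mu**d * (1 - mu) ** (L - d)


def sample_hamming_distance(L, mu, rng):
    """Mutate each of L sites independently with probability mu
    (assumption: a mutated site always differs from the parent)
    and return the number of changed sites."""
    return sum(1 for _ in range(L) if rng.random() < mu)


rng = random.Random(0)              # fixed seed for reproducibility
L, mu, n_samples = 20, 0.05, 20000  # illustrative values only

samples = [sample_hamming_distance(L, mu, rng) for _ in range(n_samples)]
empirical_mean = sum(samples) / n_samples
empirical_h1 = samples.count(1) / n_samples
normalisation = sum(h(d, L, mu) for d in range(L + 1))

print("mean distance:", empirical_mean, "vs L*mu =", L * mu)
print("fraction with d=1:", empirical_h1, "vs h(1) =", h(1, L, mu))
```

The empirical mean should be close to $L\mu$, and the fraction of samples with $d=1$ close to $h(1)$, up to sampling noise.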
The expected number of individuals with phenotype $p$ that arise at generation $t$ can be written as:
$$m_p(t)=\sum_{i=1}^{N}\sum_{d=1}^{L}h(d)\,\Phi_p(g_i,s_i,d)=\sum_{i=1}^{N}\Phi'_p(g_i,s_i)\qquad\text{(Eq. 1)}$$
where $\Phi_p(g_i,s_i,d)$ is the probability that a $d$-fold mutation of genotype $g_i$ (selected for reproduction according to fitness $1+s_i$) generates an individual with phenotype $p$. It takes the genotype-phenotype map into account. $g_i$ is the genotype of the $i$th member of the population, which has $N$ members in total. The derivation is as follows:
As the number of offspring with phenotype $p$ is distributed binomially, its mean is $m_p=N\times$(probability for a single offspring to get phenotype $p$). We then define $\tilde{\Phi}_p(g_i,s_i)$ = (the probability for a single offspring to get phenotype $p$, given that it inherits a mutated version of parent $i$). Furthermore,
$$\text{(probability for a single offspring to get phenotype } p) = \sum_{i=1}^{N}\text{(probability for a single offspring to get phenotype } p \text{ through parent } i)$$
$$= \sum_{i=1}^{N}\tilde{\Phi}_p(g_i,s_i)\times\text{(probability to inherit from parent } i) = \sum_{i=1}^{N}\tilde{\Phi}_p(g_i,s_i)\frac{1+s_i}{\sum_{j=1}^{N}(1+s_j)}$$
Finally,
$$m_p=N\times\text{(probability for a single offspring to get phenotype } p) = \sum_{i=1}^{N}\tilde{\Phi}_p(g_i,s_i)\frac{N(1+s_i)}{\sum_{j=1}^{N}(1+s_j)}\equiv\sum_{i=1}^{N}\Phi'_p(g_i,s_i)$$
By fine-graining the transitions from $g_i$ to a genotype with phenotype $p$ into transitions with a particular number of mutations $d$, we can write $\Phi'_p(g_i,s_i)\equiv\sum_{d=1}^{L}h(d)\,\Phi_p(g_i,s_i,d)$, recovering Eq. 1.
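The fitness weighting in the derivation above can be made concrete with a small numerical sketch. The population, the per-parent probabilities `phi_tilde` and the selection coefficients `s` below are invented for illustration; only the weighting formula itself comes from the text.

```python
def expected_count(phi_tilde, s):
    """Return (m_p, phi_prime), where
    phi_prime[i] = phi_tilde[i] * N * (1 + s[i]) / sum_j (1 + s[j])
    and m_p = sum_i phi_prime[i]."""
    N = len(s)
    total_fitness = sum(1 + sj for sj in s)
    phi_prime = [
        pt * N * (1 + si) / total_fitness for pt, si in zip(phi_tilde, s)
    ]
    return sum(phi_prime), phi_prime


# Hypothetical population of N = 4 parents (values are illustrative only).
phi_tilde = [0.0, 0.01, 0.02, 0.05]  # P(offspring has phenotype p | parent i)
s = [0.0, 0.0, 0.1, 0.5]             # selection coefficients

m_p, phi_prime = expected_count(phi_tilde, s)

# With all s_i = 0 the weighting factor is 1, so phi_prime equals phi_tilde.
m_neutral, phi_prime_neutral = expected_count(phi_tilde, [0.0] * 4)

print("m_p with selection:", m_p)
print("m_p without selection:", m_neutral)
```

Fitter parents contribute more to $m_p$ because they are more likely to be chosen for reproduction; in the neutral case every parent is weighted equally.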
The actual number of individuals with phenotype $p$ will follow a binomial distribution (as explained for a simple case in Wright-Fisher model), with success probability $m_p(t)/N$ and $N$ trials. The probability that none of the offspring has phenotype $p$ is $(1-m_p(t)/N)^{N}\approx e^{-m_p(t)}$. The approximation holds for large $N$, and may be seen as approximating the binomial distribution by a Poisson distribution.
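The quality of this large-$N$ approximation is easy to check numerically. The sketch below compares the binomial and Poisson expressions for the probability of no offspring with phenotype $p$; the values of $m_p$ and $N$ are arbitrary illustrative choices.

```python
import math


def p_none_binomial(m, N):
    """Probability of zero successes in N trials with success probability m/N."""
    return (1 - m / N) ** N


def p_none_poisson(m):
    """Poisson probability of zero events with mean m."""
    return math.exp(-m)


m = 0.5  # illustrative expected count m_p(t)
for N in (10, 1000, 10**6):
    print(N, p_none_binomial(m, N), p_none_poisson(m))

# The gap shrinks roughly like m**2 / (2 * N) times exp(-m) as N grows.
gap_small_N = abs(p_none_binomial(m, 10) - p_none_poisson(m))
gap_large_N = abs(p_none_binomial(m, 10**6) - p_none_poisson(m))
```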
If we assume that $L\mu\ll 1$, i.e. the average number of mutations per genotype is very small, then $h(d)\ll h(1)$ for all $d>1$, and $h(1)\approx L\mu$ (also $h(0)\approx 1$, although strictly $h(0)<1$).
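To see how good the small-mutation-rate approximation is, one can evaluate the binomial probabilities directly. This is a minimal sketch with illustrative values of $L$ and $\mu$ chosen so that the expected number of mutations per genotype is $10^{-3}$.

```python
from math import comb


def h(d, L, mu):
    """Binomial probability of exactly d mutations among L sites."""
    return comb(L, d) * mu**d * (1 - mu) ** (L - d)


L, mu = 100, 1e-5  # illustrative values: expected mutations per genotype = 1e-3

h0, h1, h2 = (h(d, L, mu) for d in (0, 1, 2))
rel_error_h1 = abs(h1 - L * mu) / (L * mu)  # how close h(1) is to L*mu
ratio_h2_h1 = h2 / h1                       # how strongly d=2 is suppressed

print("h(0) =", h0)
print("h(1) =", h1, "relative error vs L*mu:", rel_error_h1)
print("h(2)/h(1) =", ratio_h2_h1)
```

Both the relative error of $h(1)\approx L\mu$ and the ratio $h(2)/h(1)$ come out of the same order as the expected number of mutations per genotype, which is why higher-order mutations can be dropped in this regime.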
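With the above assumption that $L\mu\ll 1$, and writing out the $d=0$ term explicitly (using $h(0)\approx 1$), $\sum_{d=0}^{L}h(d)\,\Phi_p(g_i,s_i,d)\approx\Phi_p(g_i,s_i,0)+\Phi_p(g_i,s_i,1)\,L\mu$. Also, $\Phi_p(g_i,s_i,0)=0$ if $p\neq q$, where $q$ is the phenotype of $g_i$, so in that case only the single-mutation term of $\Phi'_p(g_i,s_i)$ survives. Next, if we assume $s_i=0$ for all $i$ with $g_i$ mapping to phenotype $q$ (i.e. in the neutral space $N_q$), and that the whole population starts within $N_q$, we have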
$$m_p(t)=\sum_{i=1}^{N}\Phi'_p(g_i,s_i)\approx\sum_{i=1}^{N}\Phi_p(g_i,0,1)\,L\mu\qquad\text{(Eq. 2)}$$
We can also define the average of $\Phi_p(g_i,0,1)$ over all $g_i$ in $N_q$, i.e. the average over the neutral space of the expected number of offspring with phenotype $p$ in one generation that descend from genotype $g_i$ in the previous generation via a single mutation. We will abuse notation and use the label $i$ in $g_i$ to label a genotype in $N_q$, so that $i=1,2,\dots,N_q$. The average is then:
$$\Phi_{pq}=\frac{1}{N_q}\sum_{i=1}^{N_q}\Phi_p(g_i,0,1)$$
Furthermore, we should note that $\Phi'_p(g_i,s_i)=\tilde{\Phi}_p(g_i,s_i)\frac{N(1+s_i)}{\sum_{j=1}^{N}(1+s_j)}$ (with a similar expression for the $d$-dependent quantities). When $s_i=0$ for all $i$, we find $\Phi'_p(g_i,s_i)=\tilde{\Phi}_p(g_i,s_i)$, and also, for example, that $\Phi_p(g_i,0,1)=\tilde{\Phi}_p(g_i,0,1)$, where $\tilde{\Phi}_p(g_i,s_i,d)$ = (the probability for a single offspring to get phenotype $p$, given that it inherits a version of parent $i$ with $d$ mutations; here a single-point mutation, $d=1$). Thus $\Phi_{pq}$ is the average of this probability.
We also define the robustness $\rho$ of phenotype $q$ as the average, over all of $N_q$, of the probability of a neutral mutation (i.e. one from $N_q$ to $N_q$). Under the approximations above, $\Phi_{qq}\approx\rho$. If we also assume that the population is large enough (more precisely, that we are in the Polymorphic limit (Wright-Fisher model)), we can use a mean-field approximation and replace $\Phi_p(g_i,0,1)$ by $\Phi_{pq}$. This approximation works best if the population is large enough that most of the neutral space $N_q$ is populated (or, in the paper's own words, the "1-mutant neighbourhood of the population is similar to that of the whole neutral space"). Using this in Eq. 2:
$$m_p(t)\approx L\mu\sum_{i=1}^{N}\Phi_{pq}=N L\mu\,\Phi_{pq}\qquad\text{(Eq. 3)}$$
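The following minimal sketch illustrates when the mean-field replacement is accurate. It uses a made-up neutral space of four genotypes with invented single-mutation probabilities `phi`, and compares the population sum of Eq. 2 with the mean-field result of Eq. 3 for a population spread evenly over the neutral space and for one clustered on a single genotype.

```python
def m_exact(population, phi, L, mu):
    """Eq. 2: sum over population members of phi[g_i] * L * mu.
    `population` lists the neutral-space index of each individual."""
    return sum(phi[g] for g in population) * L * mu


def m_mean_field(N, phi, L, mu):
    """Eq. 3: N * L * mu * Phi_pq, with Phi_pq the neutral-space average."""
    phi_pq = sum(phi) / len(phi)
    return N * L * mu * phi_pq


# Hypothetical neutral space of 4 genotypes (illustrative values only):
# phi[i] is the probability that a single mutation of genotype i gives p.
phi = [0.0, 0.1, 0.2, 0.1]
L, mu = 20, 1e-3

spread = [0, 1, 2, 3] * 25  # N = 100, neutral space evenly populated
clustered = [0] * 100       # N = 100, all on one genotype

mf = m_mean_field(100, phi, L, mu)
exact_spread = m_exact(spread, phi, L, mu)
exact_clustered = m_exact(clustered, phi, L, mu)

print("mean field:", mf)
print("exact, spread population:", exact_spread)
print("exact, clustered population:", exact_clustered)
```

When the population covers the neutral space evenly, the two expressions agree; when it sits on a single genotype with no access to $p$, the mean-field estimate overstates $m_p(t)$, which is the sense in which the approximation needs a well-spread population.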