Computability, Gödel's incompleteness theorem, and an inherent limit on the predictability of evolution

The process of evolutionary diversification unfolds in a vast genotypic space of potential outcomes. During the past century, there have been remarkable advances in the development of theory for this diversification, and the theory's success rests, in part, on the scope of its applicability. A great deal of this theory focuses on a relatively small subset of the space of potential genotypes, chosen largely based on historical or contemporary patterns, and then predicts the evolutionary dynamics within this pre-defined set. To what extent can such an approach be pushed to a broader perspective that accounts for the potential open-endedness of evolutionary diversification? There have been a number of significant theoretical developments along these lines but the question of how far such theory can be pushed has not been addressed. Here a theorem is proven demonstrating that, because of the digital nature of inheritance, there are inherent limits on the kinds of questions that can be answered using such an approach. In particular, even in extremely simple evolutionary systems, a complete theory accounting for the potential open-endedness of evolution is unattainable unless evolution is progressive. The theorem is closely related to Gödel's incompleteness theorem, and to the halting problem from computability theory.

Introduction

combinatorial complexity that arises thereby allows evolution to be effectively open-ended.
Indeed, as will be argued below, digital inheritance allows one to characterize evolution (i.e., the change in genetic composition of a population) as a dynamical system on the natural numbers, and therefore the theorem proved below holds for any such dynamical

Because we can view evolution as a dynamical system on the natural numbers, evolutionary theory can be viewed as a set of specific rules for manipulating and deducing statements about such numbers. Computability theory deals with functions that map positive integers to themselves, and thus provides a natural set of tools to analyze the problem. A function is called 'computable' if there exists some algorithmic procedure that can be followed to evaluate the function in a finite number of steps (Appendix 2).

Again, focusing on the deterministic case, given the assumption that we are able to predict the state of the population from one time step to the next, the function E(n) is computable (see Appendix 2). Furthermore, the set of all computable functions is denumerable (Cutland, 1980). Therefore, denoting the kth such function by φ_k(n), it is clear the evolutionary process, E(n), must correspond to a member of this set. Denote this specific member by φ_E(n), and again note that, if we change the integer-coding used to identify specific population states, we will obtain a different function Ê(n), and thus a different member of the set, φ_Ê(n) (Fig. 1).

During evolution, a set of population states will be visited over time (in the stochastic case we consider a state as being visited if the probability of it occurring at some point is larger than a threshold value; Appendix 5). These will be referred to as 'evolutionarily attainable' states. In terms of our formalism, this corresponds to the function φ_E(n) taking on various values of its range, R_E, as n increases (Fig. 1). A negation-complete evolutionary theory would be one that can determine whether a code, x, satisfies x ∈ R_E or whether it satisfies x ∉ R_E instead. In the language of computability theory, this corresponds to asking whether the predicate 'x ∈ R_E' is decidable (Appendix 2; Cutland, 1980). In terms of the influenza example presented earlier, if x is the population state corresponding to a pandemic with the 1918 strain, then the statement 'the Spanish flu will happen again' corresponds to the number-theoretic statement x ∈ R_E. Likewise, the statement 'it is not true that the Spanish flu will happen again' corresponds to the number-theoretic statement x ∉ R_E.
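The asymmetry between confirming and ruling out attainability can be illustrated with a small Python sketch. Everything named here is hypothetical and chosen only for illustration: `toy_phi_E` is an arbitrary stand-in for the evolutionary process, not a biological model, and `first_visit_within` is an assumed helper name. Stepping the process forward can confirm that a state is attainable, but a finite run that fails to find the state does not show that it is unattainable.

```python
from typing import Callable, Optional


def toy_phi_E(n: int) -> int:
    """Hypothetical toy dynamics: code of the population state at time n.

    Assumption for illustration only; the map is deliberately non-monotone.
    """
    return (n * n + 3 * n) % 17 + 5 * (n // 4)


def first_visit_within(x: int, phi: Callable[[int], int],
                       max_steps: int) -> Optional[int]:
    """Return the first time n <= max_steps with phi(n) == x, else None.

    None only means 'not visited yet'; it is not a proof that x lies
    outside the range of phi.
    """
    for n in range(1, max_steps + 1):
        if phi(n) == x:
            return n
    return None


# State 10 is confirmed attainable after two steps of simulation.
print(first_visit_within(10, toy_phi_E, max_steps=100))
# A short run gives no verdict on state 1, which is in fact visited at n = 3.
print(first_visit_within(1, toy_phi_E, max_steps=2))
```

In this sketch the search budget `max_steps` is what keeps the procedure finite; removing it would give a procedure that halts exactly when x is attainable and otherwise runs forever.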
By hypothesis there exists a computable bijection Ĉ such that, for the corresponding description of the evolutionary process, φ_Ê(n + 1) > φ_Ê(n) for all n. For any population state, x, in the original coding, let x̂ be the corresponding code under the bijection Ĉ, and define z(x) = μi(φ_Ê(i) ≥ x̂), where μi(H(i)) denotes the minimum value of i for which the argument H(i) is true (Appendix 2). Further, define R_k(n) = {x : φ_k(i) = x, i ≤ n} (i.e., the range of φ_k(n) visited by step n; Appendix 2). Clearly 'x̂ ∈ R_Ê(z(x))' is decidable since R_Ê(z(x)) is finite and can be enumerated, and furthermore x̂ ∈ R_Ê(z(x)) ⇔ x̂ ∈ R_Ê owing to the progressive nature of evolution. Therefore, 'x̂ ∈ R_Ê' is decidable as well. Finally, using S to denote the set of population states that are evolutionarily attainable, we have that

Part 2: If the predicate 'x ∈ R_E' is decidable then there exists a coding Ĉ such that φ_Ê(n + 1) > φ_Ê(n) for all n.

We can construct the required computable bijection between population states and an appropriate coding as follows. First, take any effective coding of population states. By hypothesis 'x ∈ R_E' is decidable and therefore we can proceed through the population states, x, in increasing order, applying the following algorithm:

(i) if x ∉ R_E and it is the kth such state up to that point, use the kth odd number as its new code;

(ii) if x ∈ R_E, determine the time step, i, at which it occurs, and use the ith even number as its new code.
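A minimal Python sketch of this recoding is given here, under stated assumptions. The names `make_recoding`, `toy_phi_E` and `toy_in_range` are hypothetical, the toy process is an arbitrary example whose range (the positive multiples of 3) is decidable, and each attainable state is assumed to be visited at exactly one time step. Unattainable states receive odd codes in order of appearance, and attainable states receive twice their visit time.

```python
from typing import Callable


def make_recoding(phi_E: Callable[[int], int],
                  in_range: Callable[[int], bool]) -> Callable[[int], int]:
    """Build a recoding of states from a process and a decider for its range."""

    def visit_time(x: int) -> int:
        # Unbounded search; it terminates because x is known to be attainable.
        i = 1
        while phi_E(i) != x:
            i += 1
        return i

    def recode(x: int) -> int:
        if in_range(x):
            # Attainable state visited at time i: use the i-th even number.
            return 2 * visit_time(x)
        # x is the k-th unattainable state so far: use the k-th odd number.
        k = sum(1 for y in range(1, x + 1) if not in_range(y))
        return 2 * k - 1

    return recode


def toy_phi_E(n: int) -> int:
    """Hypothetical process visiting 6, 3, 12, 9, 18, 15, ... (not increasing)."""
    return 3 * (n + 1) if n % 2 == 1 else 3 * (n - 1)


def toy_in_range(x: int) -> bool:
    """Decider for the toy range: the positive multiples of 3."""
    return x > 0 and x % 3 == 0


recode = make_recoding(toy_phi_E, toy_in_range)
# In the new coding the process is strictly increasing: 2, 4, 6, ...
print([recode(toy_phi_E(n)) for n in range(1, 7)])
```

The sketch relies on the decider `in_range` before it starts the search in `visit_time`; without decidability, that search could run forever on an unattainable state.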

Thus, R_Ê is the set of even numbers, and they are visited in increasing order as evolution proceeds. In particular, using ĈC⁻¹ to denote the above mapping described in points (i) and (ii), where C⁻¹ is the inverse mapping of the coding that generated x (i.e., it takes code x and returns the corresponding population state, s), we have φ_Ê(n + 1) = ĈC⁻¹φ_E(n + 1) = 2(n + 1). The last equality follows from the fact that ĈC⁻¹φ_E(n + 1) determines the time at which state φ_E(n + 1) occurs (which is n + 1), and assigns it a new code equal to twice this value (point (ii) above).

analyses from computability theory and mathematical logic. It also draws a formal connection between the extent to which such a theory is possible and the notion of progressive evolution.

Because the theorem states an equivalence relationship between the possibility of developing a negation-complete theory and progressive evolution, it can be read in two distinct ways. First, it states that if evolution is progressive then a negation-complete theory is possible. This is, perhaps, not too surprising. If evolution is progressive then there would be a good deal of regularity to the process that one ought to be able to exploit in constructing theory. The second way to read the theorem is from the perspective of the reverse implication. This is somewhat more surprising; it states that if evolution is not progressive then a negation-complete theory will not be possible.

These results rest on the fact that digital inheritance allows evolution to be open-ended (Maynard Smith and Szathmáry, 1995). If, instead, the hereditary system allowed for only a finite number of discrete possible types, then evolution would either display periodic behaviour or would reach an equilibrium (possibly with stochastic fluctuations; Appendix 5). A negation-complete theory of evolution would then be trivially possible in such cases because, in principle, we could simply develop a finite list of all evolutionary outcomes that can occur (as described in the influenza example earlier).
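The finite case can be made concrete with a short Python sketch. It assumes a deterministic update rule on a finite set of states; the names `attainable_states` and `toy_step` are hypothetical and the rule itself is arbitrary. Because the trajectory must eventually revisit a state, iterating until the first repeat yields the complete list of attainable states, and membership in that list is then decidable by lookup.

```python
from typing import Callable, List, Tuple


def attainable_states(step: Callable[[int], int],
                      start: int) -> Tuple[List[int], int]:
    """Iterate a deterministic map on a finite state space until it repeats.

    Returns the states in order of first visit, and the index in that list
    at which the periodic part (or equilibrium) begins.
    """
    seen = set()
    order: List[int] = []
    x = start
    while x not in seen:
        seen.add(x)
        order.append(x)
        x = step(x)
    return order, order.index(x)


def toy_step(x: int) -> int:
    """Hypothetical update rule on the ten states 0..9."""
    return (x * x + 1) % 10


order, cycle_start = attainable_states(toy_step, start=3)
print(order)               # every state that is ever visited
print(cycle_start)         # where the trajectory starts repeating
print(4 in order)          # a definite answer for any queried state
```

The loop is guaranteed to stop only because the state space is finite; with an unbounded set of types the same iteration need never repeat, which is the open-ended situation the main theorem addresses.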

Of course, despite the existence of digital inheritance, there is nevertheless presumably a

The notion of progressive evolution is somewhat slippery, and there does not exist a general yet precise definition of progression that is universally agreed upon. This has led to disagreement over the extent to which progressive evolution occurs (Dawkins, 1997; Gould, 1997). A complete discussion of the idea of progressive evolution is beyond the scope of this article but a few points are worth making here.

Most discussions of progressive evolution involve quantities like mean fitness, body size,

predicts (possibly in a probabilistic way) the genetic composition of the influenza population in the next time step, as a function of its current composition. Then we ask, is there a significant probability that another flu pandemic with the 1918 strain will ever occur? The above theorem states that, even if we had such a perfect model, this kind of question is unanswerable unless influenza evolution is progressive. In other words, unless some characteristic of the influenza population changes directionally during evolution (e.g., some aspect of the antigenicity profile changes directionally) such a prediction will not be possible. Moreover, this limitation arises because, even though we can use our perfect model to map out the course of influenza evolution over time, this need not be enough to map out the parts of genotype space that influenza will not explore.

The above limitations apply to predictions about the genetic evolution of the population, but what if we are interested only in phenotypic predictions? For example, could we predict whether or not an influenza pandemic similar in severity to that of 1918 will ever occur again, regardless of which strain(s) cause the pandemic? Likewise, could we predict whether or not resistance to antiviral medication will ever evolve, regardless of its genetic underpinnings? If the genotype-phenotype map is one-to-one, then predicting phenotypic evolution will be no different than predicting genotypic evolution. Even if many different genotypes can produce the same phenotype, however, predicting phenotypic evolution still involves predicting whether or not certain subsets of genotype space are visited during evolution. As a result, all of the aforementioned limitations should still apply to such cases.

The only exception is if the genotype-phenotype map resulted in the dimension of phenotype space being finite even though the dimension of the genotype space was effectively infinite. Even in this case, however, the above limitations to prediction would still apply unless phenotypic knowledge alone was sufficient to predict the state of the population from one time step to the next (i.e., if we didn't need to consider genetic state to understand evolution). While this might be possible for some phenotypes of interest, it seems unlikely that it would be possible for all possible phenotypes.

One might argue, however, that some patterns of phenotypic evolution are very predictable. For example, the application of drug pressure to populations seems inevitably to lead to the evolution of resistance to the drug. How are these sorts of findings reconciled with the results presented here? First, although the evolution of resistance does appear to be somewhat predictable, we must distinguish between inductive and deductive predictions. One reason we feel confident about predicting the evolution of drug resistance is that we

says that it will not be possible to make negation-complete predictions about any arbitrary aspect of evolution unless the evolutionary process is progressive.

As already mentioned, all of the results presented here begin with the assumption that we

provide an interesting physical quantity that might change directionally, the relationship between entropy and quantities that are of biological interest need not be simple.

In a similar vein one might argue that, because biological evolution takes place within a

to asking new evolutionary questions, unless evolution is progressive, there will remain some such questions that are unanswerable. Furthermore, although it will likely be difficult to use the theorem as a means of proving that evolution is progressive (i.e., by developing a negation-complete theory) or to use the theorem to prove that a complete evolutionary theory is possible (i.e., by determining that evolution is progressive), the result does nevertheless reveal that these two important, and somewhat distinct, biological ideas are fundamentally one and the same thing.

My intention was not to imply that the theorem could be used to determine decidability from knowledge of progression, or the reverse. Rather, it was to prove (within the set of assumptions used) that decidability and progression can be viewed as one and the same thing.

The theorem presented here has close ties to Gödel's Incompleteness Theorem for axiomatic theories of the natural numbers (Gödel, 1931; Nagel and Newman, 1958; Davis, 1965; van Heijenoort ed., 1967; Smith, 2007). An axiomatic theory consists of a set of symbols, a logical apparatus (e.g., the predicate calculus), a set of axioms involving the symbols, and a set of rules of deduction through which new statements involving the symbols can be derived (termed 'theorems'; Smith (2007)). Given such a system, theorems can be derived through the repeated algorithmic application of the rules of deduction.

In the early 1900s there was a concerted attempt to produce such an axiomatic theory that was meant to represent the natural numbers, with the proviso that it yield all true statements about the natural numbers, and no false ones (Whitehead and Russell, 1910; Smith, 2007). Gödel's Incompleteness Theorem (Gödel, 1931; Nagel and Newman, 1958; Davis, 1965; van Heijenoort ed., 1967; Smith, 2007)

predicate 'x ∈ R_E', and we know that this is not always possible as the results presented here illustrate.

The Halting Problem from computability theory (Turing, 1936; Cutland, 1980) is also intimately related to the results presented here. As already detailed, the question of whether a population state is evolutionarily attainable is equivalent to the question of whether a given positive integer is in the range of a particular computable function. Moreover, this latter question is directly connected to the analogous question of whether a given integer is in the domain of a computable function (i.e., whether, given a particular integer input, the function returns a value in finite time). The latter problem is precisely the Halting Problem, and it is known that there is no general algorithmic procedure for solving the Halting Problem for arbitrary computable functions (Turing, 1936; Cutland, 1980).

As mentioned earlier, in a very general sense, the results presented here are applicable to any system that can be faithfully described by a Markov dynamical system over an infinite set of discrete possibilities (i.e., an open-ended dynamical system). Therefore, one might ask whether there is anything in the results presented that is particular to evolution per se. In one sense the answer is 'no', but therein lies the power of such mathematical abstraction;

evolutionary theory whose theorems are recursively enumerable (Appendix 2); i.e., any theory whose theorems can be derived through the use of a finite (but possibly large) number of mechanical, or algorithmic, steps (e.g., as laid out in the rules of inference; Appendix 2). This is clearly true for any such theory based on computation, since computers do nothing more than mechanically follow rules (Cutland, 1980). It is also true for any axiomatic theory, since the theorems of any such theory can be derived simply by applying the mechanical rules of inference to the axioms (Smith, 2007).

A great deal of current quantitative theory in evolutionary biology fits the above template.

Definition: The predicate 'n ∈ A' is decidable if its characteristic function is computable.

Definition: The set A is recursive if the predicate 'n ∈ A' is decidable.

Definition: The partial characteristic function of a set of natural numbers, A, is

Definition: The predicate 'n ∈ A' is partially decidable if its partial characteristic function is computable for n ∈ A.

numbers is recursive (Cutland, 1980).

Third, the notion of an 'unbounded search' is central in computability theory. In particular, it is standard to use the notation μy(f(y) = k) to denote 'the smallest value of y such that f(y) = k'.
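As a rough illustration, unbounded search can be written as a simple loop in Python. The helper names `mu` and `in_range_of_increasing`, and the toy function used below, are assumptions made for this sketch only. The second function shows how the search yields a decision procedure for range membership when the function is total and strictly increasing: find the first index at which the function reaches or passes x, then check for equality.

```python
from typing import Callable


def mu(predicate: Callable[[int], bool], start: int = 1) -> int:
    """Unbounded search: the smallest y >= start for which predicate(y) holds.

    The loop never terminates if no such y exists.
    """
    y = start
    while not predicate(y):
        y += 1
    return y


def in_range_of_increasing(x: int, phi: Callable[[int], int]) -> bool:
    """Decide whether x is in the range of a total, strictly increasing phi.

    A strictly increasing integer-valued phi is unbounded, so the search
    below always terminates; phi can only equal x at that first index.
    """
    z = mu(lambda i: phi(i) >= x)
    return phi(z) == x


def toy_phi(n: int) -> int:
    """Hypothetical strictly increasing function: 2, 5, 10, 17, ..."""
    return n * n + 1


print(mu(lambda y: y * y >= 50))            # smallest y with y^2 >= 50
print(in_range_of_increasing(10, toy_phi))  # 10 = 3^2 + 1
print(in_range_of_increasing(11, toy_phi))  # skipped between 10 and 17
```

For a function that is not increasing, the same search gives no stopping rule for the negative case, since a value that has not yet appeared might still appear later.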

Fourth, a fundamental theorem of computability theory demonstrates that the set of all computable functions is denumerable (Cutland, 1980). Thus, we can use φ_k(n) to denote the kth computable function, and R_k and D_k as its range and domain respectively. We will also make use of the notation R_k(n) = {x : φ_k(i) = x, i ≤ n}. In other words, if φ_k(n) is evaluated for increasing values of n, then R_k(n) is the subset of the range of φ_k(n) that has been visited by step n. This is clearly computable for any n if φ_k(n) is total.

Given A is r.e., the partial characteristic function of A is computable. Now first choose an a ∈ A. This is a computable operation since we can simply use the bijection B : n → (T_1(n), T_2(n)) to evaluate c_A^{T_2(n)}(T_1(n)) for increasing n until it returns a value of 1, and then identify the corresponding value T_1(n). Next, we can define the computable function

Then, again we can use the computable bijection B : n → (T_1(n), T_2(n)) to define

function φ_E(i) as before, whose range is now thought of as the set of feasible population states. The argument i here is now no longer meant to be evolutionary time, however, but rather is simply an index whose meaning is described below. The second computable function we denote by φ_E*(n), and it specifies the number of feasible population states in generation n in the following way: the set of all feasible population states at time 1, i.e., φ̃_E(1), is given by {φ_E(1), φ_E(2), ..., φ_E(k_1)}, where φ_E*(1) = k_1. Likewise, φ̃_E(2) = {φ_E(k_1 + 1), ..., φ_E(k_1 + k_2)}, where φ_E*(2) = k_2, and so on. In this way, we can apply the same notions of computability to the set-valued function φ̃_E(n) by applying them to its component, integer-valued, functions φ_E(i) and φ_E*(n). We will assume that the set φ̃_E(n) is finite for all n, which guarantees that it be computable. Nevertheless, it seems reasonable to expect that some formulations in which this set is infinite would still be computable, and thus would still fit within the results that follow.
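The way the set of feasible states in a generation is assembled from the two integer-valued functions can be sketched in Python as follows. The function name `feasible_states` and both toy component functions are hypothetical, chosen only so that the indexing is easy to follow: generation n takes the next block of indices, whose length is given by the counting function.

```python
from typing import Callable, Set


def feasible_states(n: int,
                    phi_E: Callable[[int], int],
                    phi_E_star: Callable[[int], int]) -> Set[int]:
    """Feasible population states in generation n.

    phi_E_star(j) gives the number k_j of feasible states in generation j,
    and phi_E lists the states themselves along a single running index.
    Generation n therefore uses indices k_1 + ... + k_(n-1) + 1 through
    k_1 + ... + k_n.
    """
    offset = sum(phi_E_star(j) for j in range(1, n))
    k_n = phi_E_star(n)
    return {phi_E(i) for i in range(offset + 1, offset + k_n + 1)}


def toy_phi_E(i: int) -> int:
    """Hypothetical listing of state codes: 10, 20, 30, ..."""
    return 10 * i


def toy_phi_E_star(n: int) -> int:
    """Hypothetical count: generation n has n feasible states."""
    return n


for generation in (1, 2, 3):
    print(generation, sorted(feasible_states(generation, toy_phi_E, toy_phi_E_star)))
```

Because both component functions are computable and each generation's count is finite, the set for any given generation is obtained in a finite number of steps, which is the property the text relies on.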

As in the deterministic case, we must also specify the initial conditions, in addition to the mapping, H. Then, in terms of the mapping, H, if x ∈ φ̃_E(n) is a feasible population state at time n, the set of feasible population states at time n + 1 is given by φ̃_E(n + 1) = ⋃_{x ∈ φ̃_E(n)} support H(x), where support H(x) denotes the set of states for which

Contrary to the assertion, suppose instead that R_E is infinite but that there is some time,

there is some way to recode the population states such that the code number of these new states that are visited over time increases. Likewise, Lemma 2 shows that at least one new

of course, 2|R_E(n) + 1| giving min σ_Ê(n + 1) = 2|R_E(n) + 1|. As a result, min σ_Ê(n + 1) > min σ_Ê(n) because |R_E(n)| is strictly increasing with n (from Lemma 2).

the predicate 'x ∈ R_E' is decidable if its characteristic function can be evaluated, for any input value x, in a finite number of steps. Likewise, the mapping Ĉ of the theorem is

process is identical to the reference infinite state space process, φ_E(n), over time until the point η + 1 at which the finite process begins to revisit previously visited states.

where T is independent of η (i.e., independent of system size).

Definition: A one-to-one mapping of the population states by the positive integers, Ĉ, is *computable if, for any input, there exists a T < ∞ such that the mapping can be evaluated in no more than T steps, where T is independent of η.

The main theorem of the text can again be seen to hold when |R_E| < ∞ if we use the above definitions. In particular,

Theorem: 'x ∈ R_E' is *decidable if, and only if, there exists a *computable one-to-one coding of the population states by a subset of the positive integers, Ĉ, such that the corresponding description of the evolutionary process, F^η_E(n), satisfies F^η_E(n + 1) > F^η_E(n) for all n ≤ η.

Notice that there is one difference from the main theorem of the text; namely, the altered characterization of progressive evolution. Now, because R_E is finite, we say that evolution is progressive if there is some quantity that increases over time before the process begins to repeat. Also note that, in addition to the altered definition of 'computable' and 'decidable'

of the main text. Recall that F^η_E(n) denotes the computable function corresponding to the finite evolutionary system of interest.

Part 1: ∃ *Ĉ s.t. F^η_E(n + 1) > F^η_E(n) ∀n ≤ η ⇒ 'x ∈ R_E' *decidable

As before, take any input x and find its new code, x̂. By hypothesis the number of steps required is bounded by a constant that is independent of system size. Next, we can begin to successively evaluate F^η_E(n) for increasing values of n. We suppose that the number of steps required in this computation for any n ≤ η is independent of η. This is a reasonable assumption because the outputs are identical to those of φ_E(n) when n ≤ η, and the number of steps required to evaluate φ_E(n) is independent of η for any n. To each output of F^η_E(n) we can apply the above mapping, Ĉ, to obtain F^η_E(n), which by hypothesis,