Abstract
We survey the logical structure of constructive set theories and point towards directions for future research. Moreover, we analyse the consequences of being extensible for the logical structure of a given constructive set theory. We finally provide examples of a number of set theories that are extensible.
This article is part of the theme issue ‘Modern perspectives in Proof Theory’.
1. Introduction
Constructive set theories are formal systems in which we can conduct intuitionistic or constructive mathematics. Many systems of constructive set theory have strong connections to type theory and are interesting because they allow us to analyse the computational content of mathematical statements.
When a constructive mathematical theory is defined, we usually pay special attention to ensure that not all instances of the law of excluded middle,
Consider, for a well-known example, the famous theorem by Diaconescu [1] and Goodman & Myhill [2] that adding the axiom of choice to intuitionistic set theory IZF (Intuitionistic Zermelo–Fraenkel set theory) results in the classical set theory. This result is often taken as evidence for rejecting the axiom of choice as constructively valid. However, showing that a certain instance of excluded middle is not constructively valid is not sufficient for ensuring that only principles of intuitionistic logic are satisfied. After all, there could be intermediate principles that are derivable in . Dick de Jongh was the first to investigate this phenomenon: in his doctoral dissertation [3], he showed that the propositional principles valid in Heyting arithmetic under all substitutions are exactly those of propositional intuitionistic logic (IPC). This fact is now known as de Jongh’s theorem and has sparked fruitful investigations into the logical structure of intuitionistic arithmetic. A generalization of de Jongh’s theorem would be to consider not only the propositional principles of but also its propositional admissible rules (see definition 2.3). Visser [4] showed that the propositional admissible rules of are exactly those of A further generalization would be to consider the two notions in the setting of predicate logic.
In this article, we survey results about the logical structure of constructive set theories and also provide a few new results in the area as well as questions for future research.
The study of the logical properties of classical and non-classical theories goes back at least several decades. Early results on the logics of theories date from the 1970s, and Rybakov’s seminal work on admissibility [5] marked the beginning of a period during which the admissible rules were investigated for many theories, see [6] for an overview. What is characteristic of that period is that the theories involved were mainly arithmetical theories, such as HA, or extensions of propositional logics, such as modal, intermediate and substructural logics. This naturally leads to the more general questions studied today, questions that ask whether the phenomena observed for arithmetic apply to other well-known constructive theories, and whether the phenomena observed for propositional logics also hold for predicate logics. Behind all this lie conceptual issues as well: What does it mean for a theory to have the same logic or rules as a given theory? Should it be required of a genuine constructive theory that it has the same admissible rules as intuitionistic propositional or predicate logics? These questions characterize the modern perspective on the area. We do not know the answers to all of them yet, but the results in this article provide some answers for constructive set theories.
(a) Overview
We will introduce what we understand as the logical structure of a theory in §2 and then survey the techniques and results obtained so far in §3. Section 4 will be dedicated to obtaining a few new results. We mention suggestions for future research throughout the article and also list some more in §5. Finally, we discuss the logics and admissible rules of the extreme case of classical theories in §6.
2. Preliminaries: the logical structure of a theory
We denote intuitionistic propositional logic by and intuitionistic first-order logic by . Classical propositional logic is denoted by and classical first-order logic by . If a logic satisfies or , then it is a (propositional or first-order) intermediate logic.
In this article, we work with first-order logics formulated in the language consisting of countably many predicate symbols for in every arity. The formulas of first-order logic are recursively defined in the usual way with the connectives and quantifiers , , , , and . We also consider and to be part of the logical vocabulary. Note that we take constants and functions to be expressed as predicates in the usual way. The language of propositional logic is obtained by restricting to nullary predicates (i.e. the propositional letters) and forgetting the quantifiers. Finally, we assume that all first-order languages share the same set of variables .
Definition 2.1.
Let be a theory. A map is a -assignment if and only if it maps each predicate symbol to a -formula such that the number of free variables of is not bigger than the arity of .
For example, given a binary relation symbol , we could obtain a -assignment by setting . We denote the formulas of first-order logic by and those of a theory by . We can then obtain substitutions from assignments as follows.
Definition 2.2.
Let be a theory. A map is a -substitution if it obtained from a -assignment such that , and, moreover, commutes with , and the logical connectives. If the range of is , then it is a ground substitution.
The theory is often clear from the context, and we will then speak of substitutions instead of -substitutions. As usual, a rule is a pair of finite sets of logical formulas.
Definition 2.3.
Let be a theory. A rule is admissible in , , if and only if for every substitution , if for all , then there is such that .
By , we will denote the admissibility relation in which we consider only propositional formulas. We will consider more closely now the propositional and first-order tautologies of a theory.
Definition 2.4.
A formula of first-order logic is a tautology of if and only if .
Proposition 2.5.
A formula is a tautology of if and only if for all substitutions .
Proof.
Note that is by definition equivalent to implying for all . As the premise is true for any , the implication is equivalent to for every .
Given this definition, we can define the propositional and first-order logics of a theory.
Definition 2.6.
Let be a theory. The propositional logic of , , consists of all propositional tautologies of . The first-order logic of , , consists of all first-order tautologies of .
We can now ask four natural questions about the logical structure of a given theory :
(i) | What are the first-order admissible rules of ? | ||||
(ii) | What are the propositional admissible rules of ? | ||||
(iii) | What is the first-order logic of ? | ||||
(iv) | What is the propositional logic of ? |
Given the aforementioned definitions, it is clear that answers to some of these questions imply answers to others: If we know an answer to (i), then we know answers to (ii)–(iv). If we know an answer to (ii), then to (iv) as well. Finally, if we know an answer to (iii), then also to (iv). Still, there are several reasons to consider these questions separately. First, the following theorem shows that an informative answer—in the sense of a recursive description—to (i) is in general not possible.
Theorem 2.7. (Visser [4])
Let be a theory such that can be relatively interpreted in . If has the disjunction property, then is -complete.
Proof.
This is Visser’s result; however, as we are not working with relative interpretations here, one needs to add this explicitly on the logical side by replacing the quantifiers and with and , respectively, where is a new relation symbol.
However, one may of course still wonder whether the first-order admissible rules of any two theories are the same (question 5.1) or study the admissibility of specific interesting rules (see, e.g. van den Berg & Moerdijk [7]). Another reason to consider these questions separately is that, of course, the more complex objects require more complicated proofs than the simpler ones. In a sense, question (iv) can often be seen as a litmus test for the more difficult questions (ii) and (iii): if we conjecture that , then showing that is a step in the right direction (though there are counter examples, e.g. but , see §3). It may even be necessary to know an answer to (iv) before applying certain techniques for obtaining an answer to (ii), viz. the propositional admissible rules of . We will see much more detail in the remainder of this article. For now, we will dive a bit deeper into the logics and propositional admissible rules of a given theory.
We say that a theory is based on a logic if is axiomatized over . As the results discussed in this article show, if a theory is based on , we do not necessarily have that . Crucially, we cannot only study the logics of a theory based on intuitionistic logic but, given an intermediate logic , we can consider the theory obtained as the closure of under . de Jongh et al. [8] studied these theories and defined the de Jongh property as follows.
Definition 2.8.
Let be a propositional or first-order intermediate logic. A theory has the de Jongh property for just in case if is propositional or if is first order.
We also say that a theory satisfies de Jongh’s theorem if , and that satisfies de Jongh’s theorem for first-order logic if . To illustrate research in this area, we now give an incomplete history of de Jongh’s theorem for arithmetical theories (see de Jonghet al. [8] for a more complete history). De Jongh [3] proved that —; in other words, he proved that satisfies the de Jongh property for . Leivant [9] showed using proof theoretic means that ; van Oosten later gave a model theoretic proof of the same fact. De Jongh et al. [8] proved that has the de Jongh property for several classes of intermediate logics. The following propositions are immediate from proposition 2.5.
Proposition 2.9.
Let be theories in the same language. If is a tautology of , then it is a tautology of . In particular, and .
Proposition 2.10.
Let be a propositional or first-order intermediate logic. If are theories in the same language and satisfies the de Jongh property for , then so does . In particular, if satisfies de Jongh’s theorem, then so does .
A first crucial observation on the admissible rules of a theory is that these are bounded by its logic.
Theorem 2.11. (Visser [4])
Let be a theory. If , then ; i.e. . Similarly, if , then ; i.e. .
Not much is known about the converse direction. A counterexample could be obtained with theorem 2.7 if it turns out that the predicate admissible rules of are of complexity lower than -completeness. We close these preliminaries with two particularly helpful results for studying the propositional admissible rules of a given theory. A theory is called extensible if any Kripke model of can be extended by adding a new root and corresponding domain to obtain a new model of (see also definition 4.1).
Theorem 2.12. Visser ([4], Lemma 4.1)
Let be a theory with . If is extensible, then the propositional admissible rules of are exactly those of , .
The Visser rule is the following rule for propositional formulas , and :
Theorem 2.13.
Let be a theory with . If the Visser rules are admissible in , then the admissible rules of are exactly those of .
Proof.
Let and be propositional formulas. By theorem 2.11, it suffices to show that if , then . So assume that . By a theorem of Iemhoff ([10], Theorem 3.9), this means that , where denotes the derivability relation of extended with all of Visser’s rules. In other words, there is a proof tree, potentially using as a premise, all steps in which are instances of the rules of and , and whose conclusion is . Given that Visser’s rules are admissible in , and using the fact that is based on intuitionistic logic, it is straightforward to see that every rule application in the proof tree is admissible in . Hence, is an admissible consequence of , .
The two theorems give two routes to proving that the propositional admissible rules of a theory are exactly those of . We will see that theorem 2.12 can be seen as an instance of theorem 2.13 because Visser’s rules are admissible in any extensible theory (lemma 4.7). In this section, we covered the preliminaries for theories based on intuitionistic logic. There is more to say for the extreme case of theories based on classical logic, see §6.
3. Survey on techniques and results
(a) Constructive set theories
In this section, we can only give a very brief introduction to constructive set theory. As usual, the language of set theory is a first-order language with equality and a single binary relation symbol ‘’ to denote set membership. We consider the bounded quantifiers and to be abbreviations for and , respectively. The crucial aim of intuitionistic and constructive set theories is to provide a set-theoretic foundation for mathematics on the basis of intuitionistic instead of classical logic. We will now introduce the relevant systems.
In figure 1, we have spelt out all the axioms of set theory, giving rise to the following theories. Note that ‘ is full in and ’ means that every element of is a total relation between and and for every total relation between and , and there is some such that . We can now define the relevant theories. The bounded separation consists of all instances of (separation), where is a bounded formula (i.e. is ). In a similar way, we obtain the axiom scheme of bounded collection.
Definition 3.1.
We define the following set theories on the basis of intuitionistic first-order logic.
(i) | Intuitionistic Zermelo–Fraenkel set theory (IZF) consists of the axioms and schemes of extensionality, empty set, union, pairing, infinity, separation, -induction, collection and power set. | ||||
(ii) | Constructive Zermelo–Fraenkel set theory (CZF) consists of the axioms and schemes of extensionality, empty set, union, pairing, strong infinity, bounded separation, -induction, strong collection and subset collection. | ||||
(iii) | Intuitionistic Kripke–Platek set theory (IKP) consists of the axioms and schemes of extensionality, empty set, union, pairing, infinity, bounded separation, -induction and bounded collection. | ||||
(iv) | Basic constructive set theory (BCST) consists of the axioms and schemes of extensionality, empty set, union, pairing, replacement and bounded separation. | ||||
(v) | Elementary constructive set theory (ECST) consists of the axioms and schemes of extensionality, empty set, pairing, union, strong infinity, bounded separation and replacement. |
Note that is usually formulated with the axiom of infinity instead of strong infinity. Obviously, the axiom of strong infinity implies infinity. The converse is also true on the basis of without any infinity axioms [11, section 2.1]; so we can consider to be formulated with the axiom of strong infinity instead of infinity, as this will be important later. For a detailed development of the mathematics of and in constructive set theories, we refer the reader to the notes of Aczel & Rathjen [12]. Whenever we discuss the results for an arbitrary set theory in this article, we assume that contains at least intuitionistic logic.
(b) Techniques and results
We will now survey what is known about the logical structure of constructive set theories. For many years, only the following two negative results were known.
Theorem 3.2. (Diaconescu [1]; Goodman & Myhill [2])
Let be a set theory that proves as well as the axioms of extensionality, pairing, and the separation scheme. If proves the axiom of choice, then proves all instances of the law of excluded middle.
Theorem 3.3. (Friedman & Ščedrov [13])
Let be a set theory containing the axioms and schemes of extensionality, separation, pairing and (finite) union. Then . In particular, .
While we do not wish to spell out the proofs of these theorems in detail, we will point out here that both of them make crucial use of the separation scheme to define sets that allow to derive a logical scheme for all formulas. This observation is crucial as we will see in a while that Passmann [14] showed that both theorems fail on the basis of , i.e. when the set theory does not contain the full separation scheme. Note that theorem 3.3 leads to the following open question.
Question 3.4.
Let be a set theory, based on intuitionistic logic, satisfying the conditions of theorem 3.3. What is ? In particular, what is ?
Before considering more recent results, we will briefly recall some notation and results on Kripke models. A Kripke frame consists of a partial order on a set . A valuation on is a function assigning sets of propositional letters to nodes such that and entail . A Kripke model consists of a Kripke frame and a valuation on . The valuation can be extended to the forcing relation between Kripke models and valuation Kripke models and propositional formulas in the usual way. The defining property of valuations is called persistence and transfers to all propositional formulas in Kripke models, i.e. and entail (this property also holds for all first-order formulas in the case of first-order models, see below). We refer to the literature for the standard results about Kripke models for intuitionistic logic.
We are now ready to move to more recent results. All the results we are going to mention are obtained by model theoretic methods, i.e. a result of the form is usually obtained by, first, showing that proves all instances of (this is usually easy), and second, for every logical principle such that , one constructs a countermodel of , which also fails that principle. In other words, there is then a -substitution such that but . We say that an intermediate logic is characterized by a class of Kripke models if and only if if and only if for every . The class of finite trees consists of the finite partial orders such that for any , the set is linearly ordered.
Theorem 3.5. (Passmann [15])
Let be a set theory. If be an intermediate propositional logic that is characterized by a class of finite trees, then . In particular, .
This result was proved by using so-called blended Kripke models, and we will see an adaptation of this technique in §4. For now, we just note that Passmann’s blended models are inspired by the models that Lubarksy [16] used to show various independence results around .
The next two results were proved by Iemhoff & Passmann [17]. Note that is obtained by adding weak versions of strong collection and subset collection to (for details see their paper). An intermediate logic is called Kripke-complete if there is a class of Kripke models such that is characterized by .
Theorem 3.6. (Iemhoff & Passmann [17])
Let be a set theory. If is a Kripke-complete intermediate propositional logic, then .
Theorem 3.7. (Iemhoff & Passmann [17])
Let be a set theory. If is a Kripke-complete intermediate first-order logic contained in the least transitive model of , then . In particular, .
For these results, Iemhoff & Passmann use (what they call) Kripke models with classical domains. These models are obtained by equipping each node of a Kripke frame with a classical model of set theory in a coherent way. An earlier result by Iemhoff [18] entails that these models will always satisfy . Passmann showed in his master’s thesis [19] that this result is in a certain sense optimal: there are Kripke models with classical domains that do not satisfy the exponentiation axiom (which is a consequence of ). The aforementioned results could be proved for large classes of logics because the simple structure of the Kripke models with classical domains allows much control about their logical structure.
Finally, Iemhoff & Passmann also obtained the following negative result for first-order logic with equality. Note that is obtained just like with the additional requirement that substitutions commute with ‘’.
Theorem 3.8. (Iemhoff & Passmann [17], theorem 61)
Let be a set theory containing the axioms of extensionality, empty set and pairing. Then the first-order logic with equality of , , is strictly stronger than , i.e. .
Question 3.9.
Given any set theory , what is its first-order logic with equality ?
The following results were obtained by combining realizability techniques with transfinite computability. The latter is a generalization of classical computability by allowing machines to run for an infinite amount of time and/or use an infinite amount of space. For a thorough introduction, we refer the reader to Carl’s book [20].
Theorem 3.10. (Carl et al. [21])
The propositional admissible rules of are exactly those of , i.e. .
Theorem 3.11. (Passmann [14])
The propositional admissible rules of are exactly those of , i.e. .
Theorem 3.12. (Passmann [14])
Let be a set theory. The first-order logic of is intuitionistic first-order logic , .
It is crucial to note here that the proof method of theorem 3.12 does not allow to prove the result for any intermediate logic .
Question 3.13.
Can theorem 3.12 be generalized to a class of intermediate first-order logics?
In general, all results surveyed in this section introduce assumptions on the logic under consideration. We may therefore ask the following question (table 1).
✓ | ✓ | ✓ | |
✓ | ✓ | ✓ (corollary 4.13) | |
✓ | ✓ | ✓ (corollary 4.13) | |
✓ | ✓ | ✓ (corollary 4.13) | |
✓ | ✓ | ✓ | |
✓ | ✓ | ✓ | |
✓ | ✗; | ✓ (corollary 4.13) | |
✓ | ✗; | ✗; | |
✗; | ✗; | ✗; |
Question 3.14.
Is it possible to extend the results of this section to larger classes of logics? If not, find counterexamples for which these theorems fail if the assumptions on the logics are weakened.
4. Logics and rules of set theories
(a) Logic, rules and the extension property
In this section, we will prove a set-theoretic analogue of Visser’s [4, lemma 4.1] Main Lemma. Given a Kripke frame , we write for the frame extended with a new root. The construction of adding a new root to a Kripke model was first used by Smorynski [22] for models of to give an alternative proof of de Jongh’s theorem for . In the arithmetical case, it suffices to equip the new root with the standard model of arithmetic. The case of the set theory requires a more elaborate construction, as we will now see.
Definition 4.1.
Let be a set of sentences in the language of set theory. A set of sentences is -extensible if for every Kripke model of set theory with underlying frame , there is a model based on such that and ; is extensible if it is -extensible.
To avoid cumbersome notation, we will say that a sentence is (-)extensible just in case is (-)extensible. We will later need the following two brief observations.
Lemma 4.2.
Let be sets of formulas. If is -extensible, then is extensible.
Proof.
Let . As , we have . By -extensibility, it follows that .
Lemma 4.3.
Let be sets of formulas. If a sentence is -extensible, then it is -extensible.
Proof.
Let . As , we have . By -extensibility, .
A Kripke frame is a finite splitting tree if is finite, connected and every has either no successors or at least two immediate successors. A node with no successors is also called a leaf. An easy induction on the height of finite splitting trees allows to show that every node in such a tree is uniquely determined by the set of leaves above it. A set theory is called subclassical if there is a classical model of .
The following theorem follows the same idea as Smorynski’s proof of de Jongh’s theorem for [22]. For this reason, this proof method is also sometimes referred to as Smorynski’s trick.
Theorem 4.4.
Let be a propositional intermediate logic characterized by a class of finite splitting trees. If is a subclassical recursively enumerable extensible set theory, then is the propositional logic of , i.e. .
Proof.
Let be the set of binary sequences of finite length, and be the set of binary sequences of length . If is a recursively enumerable theory, let be its Gödel sentence. Let , and . Clearly both and are consistent by Gödel’s incompleteness theorem. By recursion on the length of , we define
We now observe that, given of the same length with , it must be that and are jointly inconsistent: Let be minimal such that . Then , and we can assume, without loss of generality, that and . The sentences and are defined as conjunctions in such a way that contains the conjunct , and contains the conjunct . Since , it follows that . We can conclude for and that
Let be a set theory and be a logic, as given in the statement of the theorem. To prove that , we will proceed as follows: Let be a class of finite splitting trees that characterizes the logic . It is clear that . To show that , we proceed by contraposition. So assume that for some propositional formula , then there is a finite splitting tree and a valuation on such that but . On the basis of this propositional Kripke model, we will construct a Kripke model of set theory and a propositional translation such that . As is based on the frame , it follows that . Hence, it follows that .
As every finite splitting tree can be constructed from its set of leaves by iterating the operation of adding a new root, we can obtain a model of on the frame as follows. Find such that there are at least as many as there are leaves in ; let be an injective map assigning sequences to leaves. Assign the models to the leaves of and use the extensibility of to construct a Kripke model of with underlying frame . Given , let be the set of leaves . Then consider the formula
We are now ready to construct the translation . Given a propositional letter , let
To finish the proof of the theorem, it now suffices to show that if and only if ; it then follows by induction that , if and only if . So assume that . Equivalently, for some . We have seen that this is equivalent to for some , which, by persistence, holds if and only if .
Note that is characterized by the class of finite trees [23, theorem 6.12]. To show that is characterized by the class of finite splitting tress, note that duplicating branches of a tree does not change the formulas satisfied at the root.
Corollary 4.5.
Let be a subclassical recursively enumerable set theory. If is extensible, then the propositional logic of is intuitionistic propositional logic , i.e. .
Let us first observe the following helpful fact. Recall that a theory has the disjunction property whenever implies or .
Lemma 4.6.
If is an extensible set theory, then has the disjunction property.
Proof.
By contraposition. Assume that and , then there are models and of such that and . Let be the disjoint union of these models, then as is extensible. Moreover, persistence implies that ; hence, .
Lemma 4.7.
If is an extensible set theory, then Visser’s rules are admissible in .
Proof.
By lemma 4.6, it is sufficient to show that the following rules are admissible:
Let be any substitution and assume that , i.e. for . By Kripke completeness, let be a Kripke model of with root such that . In this situation, we can assume without loss of generality that , but .
Now, let be the disjoint union of the models. As is extensible, consider a model extending with a new root such that . By persistence, for all . Hence, for all , but . Hence, .
We may conclude that the rule is admissible in .
The following theorem is a direct consequence of corollary 4.5, lemma 4.7 and theorem 2.13.
Theorem 4.8.
Let be a subclassical recursively enumerable set theory. If is extensible, then the admissible rules of are exactly those of , i.e. .
Finally, the following corollary is immediate from theorem 4.4.
Corollary 4.9.
Let be a subclassical recursively enumerable extensible set theory, and be a sentence in the language of set theory. If is -extensible, then .
In conclusion, the axiom of choice (AC) is not extensible.
Question 4.10.
Is the axiom of choice extensible?
(b) Extensible set theories
In this section, we assume the existence of a proper class of inaccessible cardinals (see remark 4.17). We will prove the extension property for a variety of constructive set theories by providing one particular construction for extending a given model. This construction is an adaptation of Passmann’s [15] blended models. A Kripke model for set theory is a first-order Kripke model of the form , where is a collections of domains, is a collection of transition functions between domains, and is a collection of interpretations of the ‘’-predicate. Note that the are assumed to be sets. Of course, the transition functions and interpretations satisfy the usual coherence conditions (for more details see, e.g. [17]).
Definition 4.11.
Let be a Kripke model of set theory. The extended model is defined as follows:
(i) | Let , the so-called new root. Then extend and as follows: | ||||||||||||||||||||||||||||||||||||||||||||||
(ii) | The domains for are already given. The domain at the new root is defined inductively as follows:
| ||||||||||||||||||||||||||||||||||||||||||||||
(iii) | The membership relation at are already defined; at the new root , define by if and only if | ||||||||||||||||||||||||||||||||||||||||||||||
(iv) | The transition functions are already defined for with . Given , define by . |
To help the reader digest this construction, we will give a simple example and provide some general intuition for the construction. Let be any classical model for set theory. In other words, is a one-point Kripke model; call its single point . The extended model has then a new root . An element of the root —a set at the root—is a function with domain such that and . Moreover, if , then . Intuitively, a set at node may thus already contain some elements at the root but may also collect new elements when transitioning to . Another way to think of this construction idea is as adding a new root whose elements are approximations of the sets already existing in the model we are starting from.
The next step is to observe that many set-theoretical axioms are extensible. For convenience, we abbreviate the axioms of extensionality, empty set, pairing and union with , , and , respectively.
Theorem 4.12.
The following axioms and axiom schemes are -extensible: extensionality, empty set, pairing, union, -induction, separation and -separation, power set, replacement and exponentiation. Moreover, the axiom of strong infinity is -extensible.
Proof.
We will prove all statements of the theorem by assuming that a model satisfies the relevant axiom or scheme and then show that the extended model satisfies them as well.
For extensionality, let . For the non-trivial direction, assume that . By persistence and extensionality in , we have for every ; hence, for all . To see that also , observe that if and only if . By assumption, the latter is equivalent to , which holds if and only if . In conclusion, .
For the empty set axiom, let be the unique (by extensionality) witness for the empty set axiom at . Define a function with domain such that for and ; by uniqueness, is defined, and it follows that . To see that witnesses the empty set axiom, let . We have to show that . For this, it suffices to show that for all , , but this is trivially true as is empty and is the empty set for all .
For the pairing axiom, let , . By pairing and extensionality in , there is a unique such that for all . Define to be the function with domain such that and for all . Clearly, is defined by uniqueness of the . To see that indeed witnesses the pairing axiom, observe that, clearly, and . Moreover, if , then it follows by definition of that .
For the union axiom, let . As mentioned earlier, by extensionality and the union axiom in , we can find a unique witness such that for every . Then define a function with domain such that for all and . To verify that indeed , note that implies that there is some such that . Now by , we know that , so, clearly, . It is now a straightforward computation to see that witnesses the union axiom at .
Regarding -induction, suppose for contradiction that . Then there is some such that but . As satisfies -induction, it must be that , and so by persistence . By -induction in , , so all failures of this instance of -induction must happen at the new root . Thus, as , it follows that there is some such that . Using the antecedent of -induction, this means that there must be some such that and . Iterating this construction, we obtain a sequence such that . This straightforwardly gives rise to an infinitely decreasing -chain. A contradiction.
Next, we consider the separation axiom. Let and be a formula, possibly with parameters. Now, for every , let be the unique result of separating from with and parameters at node . With persistence and extensionality, it follows that the function with domain such that and is a well-defined element of ; if is the least such that , then it is easy to see that . It follows straightforwardly from the definition of that it witnesses separation from with at : is equivalent to , and the latter holds by definition if and only if . Note that the proof of -separation schema is a special case of the proof for the separation schema.
For the power set axiom, consider and let such that . If such that , then and hence, as well. Let consist of those such that , then . Moreover, let be the unique element of such that (using extensionality). By persistence, is a well-defined element of . It is then straightforward to check that is the power set of at .
For replacement, let and be a formula (potentially with parameters) such that . Given this, let consist of those for which there exists some with . Moreover, let be the witness for applying replacement with on at . By inaccessibility of and persistence in , it follows that . As in the previous cases, it is now straightforward to check that witnesses replacement.
For exponentiation, let . Let be the set of functions from to at , and let be the set of functions from to at the node . It follows that , and it is easy to check that witnesses the exponentiation axiom.
Recall that the axiom of strong infinity asserts the existence of a least inductive set. So, for every , let be this least inductive set. At the new root , we recursively construct sets as follows: Let be the empty set as defined from the empty set axiom. Then, given , use pairing and union, to obtain such that . As each is the least inductive set at , it must be the case that for all . Therefore, the set , defined by and for , is a well-defined set at , i.e. . By construction, we must have that if , then . Hence, witnesses strong infinity.
A combination of theorem 4.12 and lemmas 4.2 and 4.3 yields the next corollary. An application of theorem 4.8 then yields corollary 4.14.
Corollary 4.13.
The theories , , and are extensible.
Corollary 4.14.
The propositional admissible rules of , , and are exactly those of . In other words, if is one of these theories, then .
Remark 4.15.
It seems difficult to use the aforementioned method to prove that the axiom schemes of collection, strong collection and subset collection are extensible. The reason for this is that these axiom schemes do not require that their witnesses are unique—this is in contrast to their weakened versions, replacement and exponentiation.
Question 4.16.
Are the axiom schemes of collection, strong collection and subset collection extensible?
Remark 4.17.
It is not strictly necessary to assume the existence of a proper class of inaccessible cardinals as (at least) the following two alternatives are available: Firstly, we could work with Kripke models that have (definable) class domains. Secondly, we could apply the downwards Löwenheim–Skolem theorem (which does hold in our classical metatheory) to work, without loss of generality, only with models with countable domains. In our view, the solution with inaccessible cardinals is the most elegant as it allows us to ignore any worries about size restrictions.
5. Questions
We close with a few suggestions for future research.
Question 5.1.
Is it the case that ?
In general, one can ask the same question for any two constructive theories , of interest.
Question 5.2.
Are and extensible?
Question 5.3.
Of course, both and can be extended with further axioms. Natural examples of such further axioms include the so-called large set axioms—the constructive counterparts to large cardinal axioms. What are the propositional and first-order logics of extensions of and with further axioms? Also, what are their admissible rules?
Question 5.4.
In this article, we gave a first structural analysis of the logical structure by observing the consequences of the extension property. Are there other structural properties of set theories that determine (parts of) the logical structure of a given set theory?
Question 5.5.
Let be a theory. Given a class of set-theoretic formulas, we can obtain the notions of -propositional logic and -first-order logic , as well as -admissible rules by restricting to those -substitutions that arise from -assignments with domain in . Natural classes to consider are, of course, the classes of the Lévy hierarchy but other cases are thinkable (such as prenex formulas). What are the logics that arise in this way?
Note, regarding the previous question, that the substitutions used to prove the results in §3 are usually of rather low complexity, e.g. or . Therefore, the question is often only interesting for classes of lower complexity. For instance, Visser [4, sections 3.7–3.9] considers -substitutions.
In this article, we have only considered set theories on the basis of intuitionistic logic. Of course, one can also analyse the logical structure of set theories based on other logics. Löwe et al. [24] take some first steps towards an analysis of the logical structure of certain paraconsistent set theories. Another related problem is to determine the provability logics of constructive and intuitionistic set theories.
6. The classical case
In this final section, we briefly discuss the logical structure of classical theories. Recall that a logic is post-complete if it has no consistent extension.
Theorem 6.1.
Let be a classical theory. If is consistent, then satisfies the de Jongh property for .
Proof.
Since is classical, it must be that . As is consistent, . Hence, as is post-complete.
The case for first-order logics of a given theory is more complicated as they are not post-complete; Yavorsky [25] proved that expressively strong arithmetical theories , such as , satisfy (see Yavorsky’s paper for definitions and details). Yavorsky also shows that there are classical theories such that .
Theorem 6.2.
Let be a classical theory. Then if and only if . Thus, .
Proof.
Theorem 6.1 implies that . As is well known, (a proof can be found in [26]). Thus, the second part of the theorem follows from the first.
For the first part, the direction from left to right has been proven in theorem 2.11. We treat the other direction. Arguing by contraposition, assume for some propositional formulas and that and for some substitution from the language of propositional logic to the language of . We have to show that . Assumption implies that there exists a propositional model of and a valuation such that . Define a substitution in propositional logic as follows:
In the remainder of this section, we show that the first part of theorem 6.2 can be lifted to the predicate case, at least for theories for which , and the second part cannot. The predicate version of the first part of the previous theorem states that and the second part states that . To show that the first statement is true for theories such that , we first have to prove that the second statement, while not true, is almost true; we have to show that is almost structurally complete.
As remarked earlier, is structurally complete, which means that any admissible rule is derivable: for all propositional formulas and
It is easy to see that, unlike , is not structurally complete. Consider the formula and observe that for no substitution , the formula is derivable in . For if it were, it would hold in models consisting of one element, quod non. This implies that . And since , this shows that is not structurally complete, i.e. that . However, the admissible rule has a particular property, it is passive.
Definition 6.3.
Admissible rules are called passive if the premise is not unifiable, i.e. for no substitution .
We will show that the only non-derivable admissible rules in are passive. In the terminology of admissibility: is almost structural complete. This result was first obtained by Pogorzelski and Prucnal [27]. Here, we prove a modest generalization that applies to a class of consequence relations instead of a single one. Recall that a rule is derivable in if , where is a consequence relation of , by which we mean a consequence relation that has the same theorems as : iff holds in . Thus, strictly speaking, the notion of derivable rule depends on the given consequence relation of , which is why we make it explicit in the discussion below. Since the derivability of a formula does not depend on the consequence relation (because all have the same derivable formulas), we can leave it implicit in such a setting, as in the next lemma, where denotes any consequence relation for .
Lemma 6.4.
A formula is unifiable if and only if there exists a ground substitution such that and for all predicates and sequences of terms .
Proof.
It suffices to show the direction from left to right. Since is unifiable, there exists a substitution such that , from which it follows that for any ground substitution , as is closed under uniform substitution. Clearly, is a ground substitution too. That satisfies the second part of the lemma follows immediately from the definition of substitutions.
Theorem 6.5.
Let be a consequence relation for such that for all formulas . Then implies for all unifiable formulas and all formulas .
Proof.
Since is unifiable, there exists, by lemma 6.4, a ground substitution such that and for all predicates and sequences of terms . Firstly, note that this implies that for any formula . Let be the free variables in and define a substitution as follows:
Claim
For any formula , we have that
Proof of the Claim.
By induction on .
Case . Trivial. Case . If , then as commutes with connectives, we must have that and , which by the induction hypothesis imply that
If , then there are three cases.
Case 1. | and . By induction hypothesis, and . Thus, . | ||||
Case 2. | and . By induction hypothesis, and . Thus, we have that | ||||
Case 3. | and , analogous to case 2. |
Case . If , then since commutes with the connectives, we must have that , which by induction hypothesis implies that . Thus,
This proves the claim.
Now, since , we know by the claim that . But , thus . Since , we have that .
By the claim, it is easy to see that . By assumption , we obtain that .
Corollary 6.7.
is almost structurally complete with respect to any consequence relation that satisfies the constraint in theorem 6.5.
Corollary 6.8.
For any classical theory such that : if and only if , i.e. .
Proof.
The assumption gives implies by theorem 2.11. For the other direction, assume that . If is unifiable in , then by theorem 6.5. Hence, and thus . If is not unifiable in , then it suffices to show that is not unifiable in . For this implies that for any , and thus, would follow. Arguing by contradiction, suppose is unifiable in , say for some substitution . Then for any ground substitution , , and as , this implies , contradicting the assumption that is not unifiable in .
Data accessibility
This article has no additional data.
Authors' contributions
R.I.: funding acquisition, investigation, methodology, writing—original draft and writing—review and editing; R.P.: funding acquisition, investigation, methodology, project administration, writing—original draft and writing—review and editing.
Both authors gave final approval for publication and agreed to be held accountable for the work performed therein.
Conflict of interest declaration
We declare we have no competing interests.
Funding
The first author gratefully acknowledges support by the Netherlands Organisation for Scientific Research (grant no. 639.073.807) and by the MOSAIC project (EU H2020-MSCA-RISE-2020 Project no. 101007627). The research of the second author was partially funded by a doctoral scholarship of the Studienstiftung des deutschen Volkes (German Academic Scholarship Foundation).
Acknowledgements
We would like to thank two anonymous reviewers for their very helpful comments.