ZFC: Why? What? And, how?

Published in Cantor’s Paradise, May 20, 2021

Zermelo-Fraenkel set theory with the axiom of choice is considered the standard foundation for mathematics. But why? What are its axioms? And how does this theory allow us to settle the paradoxes of naïve set theory?

Exercises in set theory. (Photo by author)

Naïve set theory doesn’t do the job

Our journey begins with naïve set theory. That’s not a formal logical theory axiomatised in a formal language. Rather, naïve set theory is an informal collection of assumptions about sets, formulated in natural language: For any two sets there’s a union and an intersection. We have a set of natural numbers, and from those we can construct the real numbers. The crucial assumption of naïve set theory, however, is the so-called unrestricted comprehension schema stating that:

For any property P(x), there is a set consisting of exactly those x that satisfy P.

At first glance, comprehension certainly makes sense: Given any property, we should be able to talk about the set of all those objects satisfying the property. For example, we can talk about the set of prime numbers: those x that are natural numbers and have exactly two divisors.

It turns out, however, that the unrestricted comprehension schema is highly problematic.

Bertrand Russell discovered the following paradox, now known as Russell’s paradox, in 1901: Take P(x) to be the property that ‘x does not contain itself,’ or ‘x ∉ x’ in symbolic notation. By the unrestricted comprehension schema, there must be a set y consisting of all those sets x that satisfy P(x). That is, y consists of all sets that do not contain themselves.

Does y contain itself? Is it the case that y ∈ y? If so, then P(y) must hold, because y contains only sets satisfying P. But P(y) says that y ∉ y. That’s a contradiction. It must therefore be the case that y ∉ y. But this means that P(y) holds, and y contains every set satisfying P. This gets us that y ∈ y. Yet another contradiction.
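In symbols, the argument compresses to a single line. This is a sketch in standard set-builder notation, with y as in the text:

```latex
\[
  y = \{\, x : x \notin x \,\}
  \quad\Longrightarrow\quad
  \bigl( y \in y \iff y \notin y \bigr)
\]
```

No statement can be equivalent to its own negation, so no such set y can exist.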

Russell’s paradox therefore shows that the unrestricted comprehension schema is inconsistent. If we assume it, then we introduce proper contradictions into our reasoning.

More set-theoretic paradoxes were found around the turn of the 20th century. Another example is Burali-Forti’s paradox: He used the unrestricted comprehension schema to define the set z of all ordinals. That is, he chose P(x) to mean that ‘x is an ordinal’ and then applied comprehension to get the set z of all sets satisfying P(x). It then turns out that z itself satisfies the definition of an ordinal. So z ∈ z, and therefore z < z. But no ordinal is less than itself. A contradiction.

These paradoxes illustrate that naïve set theory doesn’t do the job. It’s contradictory. What we need is a proper axiomatic theory of sets that allows us to settle these paradoxes. But before we can find such axioms, we first need to find out what it is we’re trying to describe with these axioms.

The Cumulative Hierarchy

The cumulative hierarchy is one possible way to understand what set theorists are trying to describe with their axiomatic theory. The idea is this: We start with no sets at all, and then construct the set-theoretic universe in stages. At each stage, a new set is added if all its elements are already available in the previous stage.

We begin with the 0th stage: there are no sets. Then, at stage 1, the empty set ∅ gets added. Why? Because all its elements are already available in the previous stage. Stage 2 then contains two sets: the empty set ∅, and the set containing the empty set {∅}. Just continue like this. Stage 3 contains the four sets that you can build from the two sets in stage 2.
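The finite stages can be computed directly. The following is a minimal Python sketch, assuming that each stage is obtained as the set of all subsets of the previous stage; `power_set` and `stage` are hypothetical helper names chosen for illustration:

```python
from itertools import chain, combinations


def power_set(s):
    """Return the set of all subsets of s, each represented as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def stage(n):
    """Return stage n: stage 0 has no sets, stage n+1 is the power set of stage n."""
    current = frozenset()
    for _ in range(n):
        current = power_set(current)
    return current


sizes = [len(stage(n)) for n in range(5)]
print(sizes)  # [0, 1, 2, 4, 16]
```

The sizes grow very quickly: the next stage already has 2^16 = 65536 sets.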

This hierarchical construction of sets never stops. It continues into the infinite. The first infinite level, stage ω (omega), contains all the sets constructed in the finite stages. Then, the stage ω + 1 is just obtained like any other stage: by taking those sets of which all members are contained in stage ω. And so on, continue forever.

This cumulative hierarchy is also called the iterative conception of sets. The resulting universe of sets is the von Neumann universe. This concept is named after John von Neumann but was first published by Ernst Zermelo in 1930.

While I will use this understanding of sets to explain and motivate the ZFC axioms below, it was introduced historically only after most of the axioms had been introduced.

Philosophically, the von Neumann universe can be understood in different ways. The formalist understanding sees the hierarchical construction as a consequence of the ZFC axioms (which I have yet to introduce). On the other hand, the realist interpretation thinks of the von Neumann universe as describing a factual reality that the ZFC axioms are trying to capture.

Moreover, philosophers have advocated for a potentialist understanding of this hierarchy in recent years. In this view, the set-theoretic universe is never considered complete but always has the potential to grow even further.

The Axioms of Set Theory

Roughly speaking, the purpose of the axioms of set theory is to give explicit rules about which sets exist and what their properties are. ZFC wasn’t defined in one go: Zermelo proposed a first axiomatisation in 1908, and this was later extended with axioms due to Fraenkel, Skolem and von Neumann.

Zermelo’s Set Theory Z

Let’s begin with a modern presentation of Zermelo’s original set theory — also called Z.

The first of Zermelo’s axioms is the axiom of extensionality stating that two sets are equal if and only if they have the same elements:
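One standard first-order way to write this is sketched below; the variable names are an assumption, not necessarily the author’s own notation:

```latex
\[
  \forall x\, \forall y\, \bigl( \forall z\, (z \in x \leftrightarrow z \in y) \rightarrow x = y \bigr)
\]
```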

One can understand this axiom as giving a condition for when two sets are equal — namely, when they have the same elements. If two sets are equal, then they must have the same elements. The axiom of extensionality can be seen as providing the converse to this implication: If two sets have the same elements, then they are equal.

Why ‘extensionality’? The name of the axiom refers to a set’s extension. The extension of a set consists of all those elements that are members of the set. The axiom therefore states that two sets are equal if they have the same extension.

Why does this talk about extension matter? Because sets may have the same extension but different intensions. It is a complicated philosophical puzzle to give a precise definition of what an intension is. For our purposes, it is sufficient to think of the intension of a set as its definition. Consider the following two sets A and B:
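In set-builder notation, reconstructed from the description that follows rather than taken from the author’s own formula, the two sets can be written as:

```latex
\[
  A = \{\, x \in \mathbb{R} : 0 \le x \le 1 \,\},
  \qquad
  B = \{\, |\cos(x)| : x \in \mathbb{R} \,\}
\]
```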

The set A contains all real numbers between 0 and 1, endpoints included. The set B contains all real numbers that are obtained from a real number x by taking the absolute value |cos(x)| of its cosine. As you will quickly see, these sets contain the same elements even though their intensions (think: definitions) are different.

The axiom of extensionality tells us that the only thing that matters for the equality of any two sets is whether they have the same elements.

The first set that we created in the cumulative hierarchy was the empty set. To make sure that this set exists, the axiom of empty set is introduced. It provides the starting point for the cumulative hierarchy:
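A common formal rendering, with variable names chosen here as an assumption, is:

```latex
\[
  \exists x\, \forall y\, (y \notin x)
\]
```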

This axiom states that a set without any elements, the so-called empty set, exists.

The axiom of pairs is also quite simple:
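Written out in first-order logic, in one standard form whose variable names are an assumption, it reads:

```latex
\[
  \forall x\, \forall y\, \exists z\, \forall w\, \bigl( w \in z \leftrightarrow (w = x \lor w = y) \bigr)
\]
```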

Whenever you have two sets x and y, the axiom of pairs allows you to construct a new set that contains exactly those two sets.

This axiom also makes sense in view of the cumulative hierarchy. If you have two sets x and y at a certain stage, then the set {x,y} consisting of exactly x and y will be added at the next stage.

The axiom schema of separation is this:
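As a sketch in standard notation: for every formula P(y) in which the variable z does not occur freely, the schema contributes one axiom. The choice of variable names is an assumption:

```latex
\[
  \forall x\, \exists z\, \forall y\, \bigl( y \in z \leftrightarrow (y \in x \land P(y)) \bigr)
\]
```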

This axiom states that, given a set x and a property P, you can form the set of those elements y of x that satisfy P. Does this remind you of something?

You might be reminded of the paradoxical comprehension schema. There is, however, a big difference between the two schemas: comprehension allows you to form a set consisting of all sets satisfying a certain property. The axiom schema of separation, on the other hand, can only select those elements from a given set that satisfy the property P.

The comprehension schema is unrestricted: given a property P, it allows you to form the set of all sets satisfying that property. The separation schema, on the other hand, is restricted: given some set x, it only guarantees the existence of the subset consisting of those elements of x that satisfy P.
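For a finite illustration, separation behaves like filtering a set that you already have. In this Python sketch, `separate` and `is_prime` are hypothetical helper names, and the naturals below 20 stand in for the given set x:

```python
def separate(x, prop):
    """Return the subset of x whose elements satisfy prop (separation on a given set)."""
    return frozenset(y for y in x if prop(y))


def is_prime(n):
    """A natural number is prime if it is at least 2 and has no divisor strictly between 1 and n."""
    return n >= 2 and all(n % d != 0 for d in range(2, n))


naturals_below_20 = frozenset(range(20))
primes = separate(naturals_below_20, is_prime)
print(sorted(primes))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Note that `separate` always needs a set to start from. There is no analogous way to ask for ‘everything satisfying a property’ without one, which is exactly the restriction that separates this schema from unrestricted comprehension.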

The next axiom on Zermelo’s list is the axiom of power set. It states that for every set x, there exists a set containing all subsets of x:
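One standard way to state it is sketched below, where z ⊆ x abbreviates ∀w (w ∈ z → w ∈ x); the variable names are an assumption:

```latex
\[
  \forall x\, \exists y\, \forall z\, ( z \in y \leftrightarrow z \subseteq x )
\]
```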

The power set axiom is crucial for defining the cumulative hierarchy: given a stage in the hierarchy, its power set forms the next stage.

The axiom of union is crucial as well:
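In first-order notation, with variable names chosen here as an assumption, a common formulation is:

```latex
\[
  \forall x\, \exists y\, \forall z\, \bigl( z \in y \leftrightarrow \exists w\, (w \in x \land z \in w) \bigr)
\]
```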

The axiom says this: Given a set x, there is a set y that is the union of all elements of x. The usual set-theoretic union ‘x ∪ y’ can be obtained by applying the axiom of union to the set {x,y}. The latter, of course, can be obtained by the axiom of pairing.

In fact, consider the axiom of binary unions, which informally reads as, ‘If x and y are sets, then their union x ∪ y is a set.’ The argument of the previous paragraph shows that the axioms of union and pairing prove the axiom of binary unions.
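The same argument can be traced on finite sets. In this Python sketch, `pair`, `big_union` and `binary_union` are hypothetical helper names that mimic the axiom of pairing, the axiom of union, and the derived binary union:

```python
def pair(x, y):
    """Mimic the axiom of pairing: the set containing exactly x and y."""
    return frozenset({x, y})


def big_union(x):
    """Mimic the axiom of union: the set of all members of members of x."""
    return frozenset(z for w in x for z in w)


def binary_union(x, y):
    """Derive x ∪ y by applying the union axiom to the pair {x, y}."""
    return big_union(pair(x, y))


a = frozenset({1, 2})
b = frozenset({2, 3})
print(sorted(binary_union(a, b)))  # [1, 2, 3]
```

Here `binary_union(a, b)` agrees with Python’s built-in `a | b`, and `pair(a, a)` collapses to the one-element set containing `a`.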

The infamous axiom of choice was also on Zermelo’s list:
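One of several equivalent formulations is sketched below. Here x is the family of sets, and ‘f is a function with domain x’ is left in words as an abbreviation; both choices are assumptions:

```latex
\[
  \forall x\, \Bigl( \emptyset \notin x \rightarrow
    \exists f\, \bigl( f \text{ is a function with domain } x
      \land \forall y\, ( y \in x \rightarrow f(y) \in y ) \bigr) \Bigr)
\]
```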

Given a family of non-empty sets, the axiom of choice guarantees the existence of a choice function that picks an element of each set in the family.

Why is this axiom added? Properly understanding its necessity and implications can fill a researcher’s career. For now, let’s just go with this rough rule of thumb: Whenever you’re making infinitely many arbitrary choices at once, you’re using the axiom of choice.

The last axiom on Zermelo’s list was the axiom of infinity. It states that there is an infinite set:
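A standard way to write it is sketched below, using ∅ and y ∪ {y} as abbreviations; the variable names are an assumption:

```latex
\[
  \exists x\, \bigl( \emptyset \in x \land \forall y\, ( y \in x \rightarrow y \cup \{y\} \in x ) \bigr)
\]
```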

At first glance, this axiom might seem rather cryptic. But what it does is ensure that there is an infinite set. How so? By requiring the existence of a set x that contains the empty set and, for every set y in x, also contains the set y ∪ {y}. Starting from ∅, this puts ∅, {∅}, {∅, {∅}}, and so on into x. By the axiom of extensionality, these sets are pairwise different because they have different elements. But then x must be infinite.
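The first few sets that such an x must contain can be built explicitly. This is a small Python sketch, where `successor` is a hypothetical helper name for the operation y ∪ {y}:

```python
def successor(y):
    """Return y ∪ {y}."""
    return y | frozenset({y})


# Start from the empty set and apply the successor operation four times.
ordinals = [frozenset()]
for _ in range(4):
    ordinals.append(successor(ordinals[-1]))

sizes = [len(o) for o in ordinals]
print(sizes)  # [0, 1, 2, 3, 4]
```

Each step yields a set with one more element than the one before, so the process never repeats a set. That is why a set closed under this operation cannot be finite.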

Why do we need an axiom of infinity? Every mathematician knows infinite sets: For example, the set of natural numbers, or the set of real numbers. But even a function f from the naturals to the naturals is an infinite set consisting of all the pairs (n,f(n)).

Without the axiom of infinity, talking about such infinite sets would be impossible. In fact, there is a model of set theory in which every set is finite but all axioms of ZFC, except for the axiom of infinity, hold. This illustrates that we cannot do without the axiom of infinity!

That’s Zermelo’s list of axioms: extensionality, empty set, pairs, separation, power set, union, choice, and infinity. But two more axioms are missing to get full ZFC.

ZFC

First, we need to add the axiom schema of replacement:
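As a sketch in standard notation: for every formula φ(u, v) in which y does not occur freely, the schema contributes one axiom. It says that if φ behaves like a function on x, then its image is a set. The symbols φ, u and v are assumptions introduced for illustration:

```latex
\[
  \forall x\, \Bigl( \forall u\, \bigl( u \in x \rightarrow \exists!\, v\; \varphi(u, v) \bigr)
    \rightarrow \exists y\, \forall v\, \bigl( v \in y \leftrightarrow \exists u\, ( u \in x \land \varphi(u, v) ) \bigr) \Bigr)
\]
```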

Informally speaking, the replacement schema ensures that whenever you have a set x and a definable function f, the image of x under f is a set as well.

The gap in Zermelo’s theory that replacement fills was discovered independently by Fraenkel and Skolem.

Finally, von Neumann introduced the axiom of foundation in 1925. Sometimes it’s also called the axiom of regularity:
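One common formulation is sketched below, using ∅ and ∩ as abbreviations; the variable names are an assumption. It says that every non-empty set x has an element y that shares no elements with x:

```latex
\[
  \forall x\, \bigl( x \neq \emptyset \rightarrow \exists y\, ( y \in x \land y \cap x = \emptyset ) \bigr)
\]
```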

Together with the axioms of pairing and extensionality, this axiom proves that no set can be an element of itself: Take some set x, and use the axiom of pairing to form the set {x, x}. By extensionality, the latter is just {x}. Now, applying the axiom of foundation to {x}, we get that there must be some y ∈ {x} such that y ∩ {x} = ∅. But there’s only one set in {x}, so y = x, and we get that x ∩ {x} = ∅. This means that x ∉ x.

This and similar consequences make the axiom of foundation attractive: it fits our intuition about the cumulative hierarchy extremely well. Imagine there was a set containing itself. How should that happen? According to the cumulative hierarchy, a new set is added if all its elements are already available in the previous stage. So if x contains itself, then x must have already been added before it was added. Sounds paradoxical. Luckily, the axiom of foundation saves us.

That’s it! Zermelo-Fraenkel set theory with the axiom of choice, ZFC, consists of the 10 axioms we just learned about: extensionality, empty set, pairs, separation, power set, union, choice, infinity, replacement, and foundation.

The name ‘Zermelo-Fraenkel set theory’ was apparently first used by von Neumann in 1928.

Resolving the Paradoxes

I promised above that ZFC would allow us to solve the paradoxes and problems of naïve set theory — but how does ZFC do this?

Think again of Russell’s paradox concerning the set of all sets that do not contain themselves. Using the axiom of foundation, we have seen that no set contains itself. Therefore, the collection of all sets that don’t contain themselves consists of all the sets there are. This collection cannot be a set: if it were, it would be one of the sets there are and hence contain itself, which we have just ruled out. Instead, it is a proper class, that is, a collection that is too big to be a set. The question whether this class contains itself doesn’t make sense because it contains only sets.

Axiomatic set theory resolves paradoxes by demystifying them. The Zermelo-Fraenkel axioms of set theory give us a better understanding of sets, according to which we can then settle the paradoxes.

There are many ways to continue from here: large cardinals, alternatives to the axiom of choice, set theories based on non-classical logics, and more. Let me know what you’re curious about — and have a look at my other stories on the continuum hypothesis, junk theorems, and the law of excluded middle.