The final installment in a series of posts on the size of the infinite, as described in mathematical set theory. The first post can be read here, the second here, and the third here.
We have taken a long, strange journey into the properties of infinity. Over the course of three posts, we have seen that we can characterize the different “sizes” of infinity, though not in the way one might think. We have found, in fact, that there are an infinity of infinities! The smallest one we looked at was the infinite set of counting numbers (labeled $\aleph_0$); the next largest we found was the continuum (labeled $\mathfrak{c}$): the set of real numbers between 0 and 1. We then found that, for any size infinity, we can construct a larger one.
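This leads to an intriguing notion: if we arrange the different size infinities we have found in order, we might have a set of the form

$$\{\aleph_0,\ \mathfrak{c},\ 2^{\mathfrak{c}},\ 2^{2^{\mathfrak{c}}},\ \ldots\}$$

where $2^{\mathfrak{c}}$ is the size of the set of all subsets of the continuum, and so on up the chain.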
This would seem to suggest a really elegant possibility: if these are all the infinities, then we could imagine that the set of all infinities forms a countable infinity itself, of size $\aleph_0$, and then we could build up the larger infinities again from this, continuing an endless cycle! For instance, the set of all subsets of the set of all infinities would then be of size $\mathfrak{c}$, and so on.
For this to be true, however, we need to know whether there are any other infinities between those we have been able to derive so far. We have shown that there are an infinite number of infinities, but we have not shown that these are the only infinities. To condense this into the simplest problem, we can ask:
Are there infinite sets of an intermediate size between $\aleph_0$ and the continuum $\mathfrak{c}$?
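For readers who like symbols, the question can be written compactly. Here $\aleph_0$ is the size of the counting numbers and $\mathfrak{c} = 2^{\aleph_0}$ is the size of the continuum; the symbol $\aleph_1$, for the smallest infinity larger than $\aleph_0$, is standard notation that has not been used in this series, and the second form assumes the axiom of choice.

```latex
\text{Continuum hypothesis:}\quad
\neg\,\exists\, S \ \text{ such that }\ \aleph_0 < |S| < \mathfrak{c}
\qquad\Longleftrightarrow\qquad
2^{\aleph_0} = \aleph_1 .
```

The continuum problem asks whether this statement is true or false.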
This is what is known as the continuum problem, and it has vexed mathematicians for well over a hundred years, ever since Georg Cantor first formulated set theory in the 1870s.
But here is where we arrive at what may be the oddest part of the story of infinity! If we look at the history of the continuum problem, the answer to the question has changed over the years:
- We don’t know the answer (c. 1870s)
- We can’t know the answer (c. 1960s)
- The answer is whatever we prefer it to be (today)
Huh? Okay, this is going to take a bit of explanation…
In order to make sense of this, we need to “look under the hood” of mathematics, so to speak, and take a glimpse at some of the machinery that makes it work. We have already noted that the study of different sizes of infinity — so-called transfinite numbers — originated with Georg Cantor’s work back in the 1870s. It was Cantor himself who posed the continuum problem, and he hypothesized that there is no infinity between the countables and the continuum.
Cantor’s work formed the basis of what is now known in mathematics as set theory. So what is “set theory?” Speaking loosely, it is the mathematical theory of collections of objects, known as sets. Set theory is now considered a fundamental piece of the foundation of mathematics — most math involves collections of something or other.
If one reads Cantor’s original description of sets, the language is very intuitive and non-formal. In Cantor’s 1895 memoirs, for instance, he introduces the idea of a set (called an aggregate) in the following way:
By an “aggregate” we are to understand any collection into a whole M of definite and separate objects m of our intuition or our thought.
It is a very short and elegant definition, but also one that can lead into trouble. By the beginning of the 20th century it was discovered that Cantor’s set theory (what is now called a “naive” set theory) produced insurmountable paradoxes. The most famous example of this is Russell’s paradox, discovered by mathematician Bertrand Russell in 1901. We summarize it as follows:
According to naive set theory, any imagined collection of objects constitutes a set. We may therefore imagine the set of all sets that are not members of themselves, which we will call S. This set is inherently problematic, though. S itself is either a member of itself, i.e. S is included in S, or it is not. If S is not a member of itself, then it satisfies the very condition for membership in S, so it must be a member of itself. If S is a member of itself, then it cannot be included in S, which includes only sets that are not members of themselves.
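The same argument fits on a single line. This is only a compact restatement of the paragraph above, using the usual symbols $\in$ ("is a member of") and $\notin$ ("is not a member of"):

```latex
S = \{\, x : x \notin x \,\}
\quad\Longrightarrow\quad
\bigl(\, S \in S \iff S \notin S \,\bigr)
```

Asking whether S belongs to itself simply substitutes S for x in its own defining condition, and the result is a statement that is true exactly when it is false.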
We have reached a logical paradox! If this explanation seems a bit abstract, it is often phrased in the form of the “barber paradox,” which Russell also introduced, though it was evidently suggested to him by a friend.
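Imagine a town with a single barber, who is male. In town, every man keeps himself clean-shaven, and he does this by doing exactly one of two things: he either shaves himself or goes to the barber. But who shaves the barber? If he shaves himself, then he is being shaved by the barber, which is supposed to be the other option; if he goes to the barber, then he is shaving himself. Either way he ends up doing both things at once, which the rule forbids.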
In the barber paradox, the problem is a matter of human-created rules, but in the case of the general set paradox there is no way around it with the naive description of sets as “collections of anything.”
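To see that the barber's rule really cannot be satisfied, one can simply try every possibility for a small town. The sketch below is an illustration only: the function name `consistent_towns`, the numbering of the men, and the choice of man 0 as the barber are all assumptions made for this example. It states the rule in its usual form (the barber shaves exactly those men who do not shave themselves) and counts the arrangements that obey it.

```python
from itertools import product


def consistent_towns(n_men, barber=0):
    """List every shaving arrangement that satisfies the barber rule.

    Men are numbered 0..n_men-1. An arrangement is a tuple of booleans,
    where entry m says whether man m shaves himself. If `barber` is None,
    the barber is taken to be someone outside this group of men.
    """
    solutions = []
    for shaves_self in product([False, True], repeat=n_men):
        # Rule: the barber shaves exactly those men who do not shave themselves.
        barber_shaves = [not s for s in shaves_self]
        if barber is None:
            # An outside barber is not bound by the rule himself.
            ok = True
        else:
            # For the barber, "shaves himself" and "is shaved by the barber"
            # are the same fact, so the two values would have to agree.
            ok = barber_shaves[barber] == shaves_self[barber]
        if ok:
            solutions.append(shaves_self)
    return solutions


print(len(consistent_towns(4)))               # 0
print(len(consistent_towns(4, barber=None)))  # 16
```

With the barber living in town, no arrangement survives; move him outside the town and every arrangement works. That is the sense in which the barber version can be escaped by changing the rules, while the set version cannot.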
The problem, it seems, is that we have spoken too loosely about sets: our intuitive description of what a set can or cannot be is flawed. So what to do? The solution, initiated by Ernst Zermelo in 1908, was to “rebuild” set theory from a small set of axioms, creating axiomatic set theory.
An axiom is, in short, a logical definition, rule, or condition. A collection of such axioms is used to form the foundation of a system of mathematics. If you’ve taken a geometry class in high school, you’re probably familiar with this approach: the axiomatic method dates back to ancient Greece and the development of geometry by Euclid around 300 B.C.E. Euclid’s axioms, in modern language, may be written as (via the University of New Mexico):
- A-1 Every two points lie on exactly one line.
- A-2 Any line segment with given endpoints may be continued in either direction.
- A-3 It is possible to construct a circle with any point as its center and with a radius of any length. (This implies that there is neither an upper nor lower limit to distance. In other words, any distance, no matter how large, can always be increased, and any distance, no matter how small, can always be divided.)
- A-4 If two lines cross such that a pair of adjacent angles are congruent, then each of these angles is also congruent to any other angle formed in the same way.
- A-5 (Parallel Axiom): Given a line l and a point not on l, there is one and only one line which contains the point, and is parallel to l.
These axioms, together with some “common sense” rules of logic and a lot of definitions (“what is a line?” for instance), allow one to prove all sorts of interesting things about geometric objects. Similarly, work performed by Zermelo and others led to the formation of a set of axioms for set theory that removed the obvious paradoxes such as Russell’s paradox. The resulting axiomatic system, now known as ZFC (“Zermelo-Fraenkel set theory with the axiom of Choice”), is still accepted and standard to this day.
With the axioms of set theory established, it seemed that the proof or disproof of the continuum hypothesis would only be a matter of time. But another question was raised in the development of axiomatic mathematics: is the system consistent? We note that the axioms of mathematics are a set of rules that are accepted without proof as reasonable: because they are not proven, there is always the possibility that two or more seemingly compatible axioms might be fundamentally in conflict and lead to a contradiction in some subtle way. This was considered a major worry: in 1900, the master mathematician David Hilbert posed a list of 23 important unsolved problems in mathematics, and the consistency of the axioms of arithmetic was included in this list (along with the continuum hypothesis).
The question was eventually answered, but in a completely unexpected way. In 1931, the Austrian mathematician Kurt Gödel proved the unbelievable: for any consistent axiomatic system rich enough to describe arithmetic, there are some statements that can be neither proven nor disproven in it. His results are now known as the Gödel incompleteness theorems. In his first incompleteness theorem, he demonstrated that no such axiomatic system is capable of proving all things that are, in fact, true about the natural numbers. In his second incompleteness theorem, he showed that such a system also cannot prove its own consistency: the consistency of the axioms of arithmetic cannot be established from within arithmetic itself!
There is a lot of subtlety in the statement of Gödel’s incompleteness theorems, and a detailed discussion will be deferred for another post. What is important, however, is that it demonstrated that any axiomatic system of mathematics, no matter how carefully constructed, will have “holes” in it: things that may very well be true but cannot be proven.
This revelation had a huge impact on mathematics in general, and set theory in particular. In 1940, Kurt Gödel demonstrated that the continuum hypothesis cannot be disproved from the axioms of ZFC set theory; much later, in 1963, Paul Cohen demonstrated that it cannot be proved within this axiomatic system, either. This leads to the interesting and bizarre conclusion: there may be infinities between the countable and the continuum, but we cannot ever discover or study them. They exist outside of our ability to know them, a truly odd thing indeed. In a sense, it seems that we could pretend that my original hypothesis — that the infinities are countable — is true, and we would never find any evidence to contradict it.
There is another option, however. We could add an axiom to ZFC set theory that allows us to answer the continuum question, one way or another. This may seem like cheating at first glance. However, with the knowledge that every system has undecidable propositions, the goal of an axiomatic system changes: instead of trying to find a system that can be used to prove everything, we need to find a system that proves as much as possible with as few axioms as possible. If we can find an axiom that seems consistent with the existing ones and gives needed answers to questions, so be it.
This option leads to even more weirdness, though. As noted several months ago in Quanta Magazine, modern mathematicians have settled on two candidate new axioms for set theory. The catch, however, is that the two options give different answers to the continuum problem! The first option, a family known as forcing axioms, allows one to prove that the continuum hypothesis is false: there then exists at least one size of infinity between the countables and the continuum. The second option, known as the inner-model axiom “V = ultimate L”, allows one to prove that the continuum hypothesis is true.
This is perhaps the most bizarre part of our discussion of infinity yet: mathematicians are in the process of deciding which answer they like more! Apparently an axiom can be chosen to make the continuum hypothesis true or false, depending on… taste?
It is natural to wonder at this point if the question of the continuum problem has any meaning at all. There is a very important precedent for such axiomatic shenanigans, however, that goes all the way back to the geometric axioms of ancient Greece.
Let’s take a look at Euclid’s fifth axiom again, the so-called “parallel axiom”:
Given a line l and a point not on l, there is one and only one line which contains the point, and is parallel to l.
Even in Euclid’s time, geometers found this axiom to be problematic. Unlike the others, it did not seem self-evident, or obvious. There were many attempts to prove the parallel “axiom” by use of Euclid’s other four axioms, without success.
The reason that the parallel axiom seemed non-obvious to mathematicians is that it is not obvious at all! It can be replaced by a different axiom that will result in an entirely new type of geometry, known as a non-Euclidean geometry. The original parallel axiom is appropriate when doing geometry on a flat surface: then there is only one line passing through a point that is parallel to a given line. However, if we do geometry on a curved surface, we find that the parallel axiom can be changed.
In an elliptical geometry, there is no line that passes through a point and is parallel to (does not intersect) the given line.
In a hyperbolic geometry, there are at least two lines that pass through a point and are parallel to the given line.
These possibilities are shown below.
A simple example of an elliptical geometry is the surface of a sphere. A “line” — the shortest path between two points on the sphere — is a great circle, and any two different great circles intersect at a pair of points on the sphere.
All of geometry can be formulated in a (seemingly) consistent way with a modified parallel postulate; of course, the properties of geometrical objects will be different in these modified geometries. For instance, when drawing polygons on a flat sheet of paper, a 3-sided figure — a triangle — is the simplest one possible. On a sphere, however, a 2-sided polygon — a digon — can be formed by the intersection of two great circles, as can be seen in the illustration.
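The claim that two different great circles always meet at a pair of opposite points can be checked with a little vector arithmetic. The sketch below assumes a unit sphere centered at the origin and describes each great circle by the normal vector of the plane that cuts it out; the function names and the example circles (an "equator" and a "meridian") are choices made for this illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def great_circle_intersections(n1, n2):
    """Return the two points where two great circles of the unit sphere meet.

    Each great circle is the set of unit vectors perpendicular to its
    normal (n1 or n2). A point on both circles is perpendicular to both
    normals, so it lies along their cross product.
    """
    d = cross(n1, n2)
    length = math.sqrt(sum(x * x for x in d))
    if length < 1e-12:
        raise ValueError("the normals are parallel: this is one circle, not two")
    p = tuple(x / length for x in d)
    q = tuple(-x for x in p)  # the antipodal point
    return p, q


equator = (0.0, 0.0, 1.0)   # circle lying in the plane z = 0
meridian = (0.0, 1.0, 0.0)  # circle lying in the plane y = 0
p, q = great_circle_intersections(equator, meridian)
print(p, q)
```

The only way the calculation can fail is when the two normals are parallel, which means the two "lines" are the same great circle. Every other pair intersects, so on a sphere there is no such thing as a parallel line, and the two arcs running between the meeting points bound a digon.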
These curved space geometries were first developed in the early 1800s, but they became extremely important in the early 20th century. Einstein’s general theory of relativity, introduced in 1916, models gravity as a curvature of the very geometry of space and time. Where non-Euclidean geometry started as an abstract mathematical concept, it evolved into a very important physical principle.
This brings us back to the competing axioms for set theory, and their relationship to the continuum problem! The case of non-Euclidean geometry illustrates that it is possible to have different axiomatic systems that describe different things, but that are equally useful. We use “flat space” geometry in our day-to-day work (building a house, for instance), and use curved geometry when studying massive stellar objects. In a similar vein, adding different axioms to set theory results in systems that presumably represent different “realities.” In the case of infinite sets, however, it is unlikely that we will ever physically “observe” them, so how do we decide which axiom to add? For many mathematicians, it comes down to a very practical question: which axiom is more useful for proving other things? In this sense, it sounds like forcing axioms have an advantage in allowing solutions to other mathematical problems. However, “V = ultimate L” is, to many mathematicians, more elegant.
The aforementioned article by Natalie Wolchover is an excellent summary of the current debate, and concludes that mathematicians will eventually choose one axiom over the other to “fill out” set theory. However, the lesson of non-Euclidean geometry suggests that both approaches may have some value down the road, in solving different types of problems.
With this, I end my series of posts on infinity, and just in time! We have stepped partly out of mathematics and into the realm of pure philosophy, an area that I am not comfortable discussing at the moment. There are, of course, many more things to be explored: the incompleteness theorems of Gödel and the so-called surreal numbers are just two of many other possible avenues.
It is fair to say that there are an infinite number of additional things to learn about infinity, and things infinitely weirder!
There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
– Hamlet, Act I, Scene V
possible future directions:
a) sets/types (a la russell, quine etc)
b) sets/classes (von neumann etc)
c) really big stuff (large cardinals/transfinite ordinals)
and yah the chances that folks end up declaring one or another axiomatic augmentation to be THE RIGHT ONE are epsilonishly small. the practice nowadays at talks is simply for the speaker to say up front what they’re using, and then discuss their proofs. the lessons of saccheri might be forgotten in detail, but their gist is practically understood by today’s folks.
Thanks for the thoughts! I suspected that the article’s declaration of “THE RIGHT ONE” was suspect.
really big stuff, please.
Great series, thank you!
Thanks for this wonderful article. It is a great explanation of the modern view of axiomatic systems.