Reviewing Terence Tao's Analysis

Disclosure: I received a review copy of this book from the AMS.

More than a decade ago, I was taking my second course on real analysis, taught by a Soviet-trained mathematician who studied with Fomin. He was teaching from his own notes and they were great, but I wanted some additional reading. I told him that I found the textbooks I had seen on some subjects unsatisfactory and asked his opinion. He said that I’d enjoy reading Terence Tao’s (then-)new series on analysis. The professor was easily one of the smartest people I had met (he still is!) and Tao is, well, Tao. So I assumed that Tao’s books would be too difficult and I did not even check the book out from the library. Big mistake.

Before that course, my analysis training included an introductory course on analysis (closely following “baby Rudin”) and a topology course (taught from Munkres). I think that would have been the perfect time for me to read Tao had I known better, which is why I am writing this review. I’ll say more about the ideal audience for the book at the end, after going briefly through each chapter, but if I had to summarize my feelings about it, I’d say:

It is an excellent text and I would recommend it to everyone. It doesn’t assume any rigorous background, so a high school student can read it. However, I think the high school student should first read Rudin.

Chapter 1

Tao opens the book with a series of apparent paradoxes where naive applications of some “rules” and “theorems” appear to give nonsensical results. Of course, in each case, we are doing something that the utilized rule doesn’t actually cover or in some cases, we are abusing / misusing the rule in other ways. Many of them have similar flavors so I’ll just give one example.

Consider the series $S = 1 - 1 + 1 - 1 + 1 - \ldots$. We can write $S = 1 - (1 - 1 + 1 - 1 + \ldots) = 1 - S$, which gives us $S = \frac{1}{2}$. Of course, this is nonsense; the partial sums just bounce between 0 and 1 forever. But the algebra looks perfectly normal. The issue is that we treated $S$ as a number and applied rules that are really only justified for finite sums (or convergent infinite sums, once we know what that means). This particular series is actually known as Grandi’s series and it has a surprisingly rich history.¹ Tao gives several other examples in a similar spirit: Interchanging limits and integrals, rearranging conditionally convergent series to get different sums, and a few others. They all have the same flavor; something that looks like a routine calculation produces a result that is clearly wrong.
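A few lines of Python make the "bouncing" concrete. This is only an illustrative sketch; the function name `grandi_partial_sums` is made up for this example and does not come from the book.

```python
from itertools import accumulate, cycle, islice


def grandi_partial_sums(n):
    """Return the first n partial sums of 1 - 1 + 1 - 1 + ..."""
    terms = islice(cycle([1, -1]), n)  # 1, -1, 1, -1, ...
    return list(accumulate(terms))     # running totals of the terms


print(grandi_partial_sums(8))
```

The partial sums alternate between 1 and 0 and never settle, so there is no limit that the symbol $S$ could stand for in the first place.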

I think it is a nice way to open a book on analysis because most students come into their first analysis course thinking they already know calculus and it is not always obvious to them why they need to redo everything more carefully. Baby Rudin, for instance, opens with the construction of the reals without any such motivation and I remember finding it a bit insulting in my first course; I know my reals, sir, thank you very much! Of course, Rudin’s approach has its own virtues and the kind of student who picks up baby Rudin is probably not the kind of student who needs to be motivated. But Tao is writing for a broader audience and I think his choice is the right one here.

These examples are then used to motivate further study of the foundations of these rules and theorems.

Chapter 2

This chapter builds the system of natural numbers from the Peano axioms.

This was a bit of a coincidence for me as I had just read Peano’s original work as well as several papers that build on it (e.g., Burali-Forti’s paradox paper and Russell’s paper on the Theory of Types). Peano tries to do everything in an axiomatic way; he sets up his axioms, introduces his notation and then just proves a sequence of theorems that provide the familiar knowledge of natural numbers that we (thought we) already knew. The problem is that his notation reads as if it were designed to make his arguments harder to follow. For instance, instead of using parentheses, he uses ‘.’, ‘:’, ‘:.’, etc. to bind and separate arguments in statements. It might sound like a cool idea but it gets old pretty quickly; especially if you take a break and come back to it a few weeks later, you’d think it might as well be an alien script based on a misunderstanding of our own scripts. (I thought it was a pain to read Frege’s Begriffsschrift before reading Peano…)

Anyway, I am telling you about how bad it is to read Peano because in Tao’s writing, Peano’s work turns into a pleasure. The Peano axioms essentially assume a starting point (“0 is a natural number”) and an increment operator (“if $n$ is a natural number, $n++$ is also a natural number”), which gives us a way to find successors. We then have two axioms that help with identifying and distinguishing different natural numbers: We assert that 0 is not the successor of any natural number and that different natural numbers have different successors. Finally, we also need an axiom that gives us the principle of mathematical induction. Once these are in place, we can then define addition as repeated incrementation, and multiplication as repeated addition. It is a beautiful system.
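To see how little is needed, here is a minimal Python sketch of this construction. The class `Nat` and the helpers `succ`, `add`, `mul`, `from_int`, and `to_int` are hypothetical names chosen for this illustration; the recursive definitions follow the usual "repeated incrementation" and "repeated addition" pattern described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A natural number: either zero (pred is None) or the successor of pred."""
    pred: Optional["Nat"] = None


ZERO = Nat()


def succ(n):
    """The increment operator n++."""
    return Nat(n)


def add(m, n):
    """Addition as repeated incrementation: 0 + n = n, (m++) + n = (m + n)++."""
    if m.pred is None:
        return n
    return succ(add(m.pred, n))


def mul(m, n):
    """Multiplication as repeated addition: 0 * n = 0, (m++) * n = (m * n) + n."""
    if m.pred is None:
        return ZERO
    return add(mul(m.pred, n), n)


def from_int(k):
    """Build the Nat for a non-negative Python int (for convenience only)."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Count how many successors were applied to zero."""
    count = 0
    while n.pred is not None:
        count += 1
        n = n.pred
    return count


print(to_int(add(from_int(2), from_int(3))))
print(to_int(mul(from_int(2), from_int(3))))
```

The conversions to and from Python integers are only there to make the results readable; the arithmetic itself never touches built-in numbers.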

Given the level of the exposition and the target audience, I think Tao does a great job of motivating the axioms and the results, and signaling the next steps from there. If I were to write a similar book myself, I might opt for a modified version of Dedekind’s approach, which preceded Peano’s. My feeling is that not enough people know and appreciate Dedekind. However, I believe that Tao’s choice is the right one, pedagogically speaking. It is much more succinct and direct.

Chapter 3

At this point, we take a quick detour and review the foundations of set theory. We are introduced to the Zermelo-Fraenkel axioms and what they buy us in terms of our mathematical capabilities. This is something that every mathematician should certainly learn at some point in their education. As far as I can tell, this is generally not a part of the core curriculum for math majors at North American institutions unless the students take it as an elective. It would be ideal if the student came with knowledge of this field so that it didn’t take up precious space in this book² but given the situation, I understand why Tao chose to cover it.

The coverage is pretty standard except for the fact that it is split across two chapters. The first one (chapter 3) covers the ZF set theory. The second one (chapter 8) comes after building the tools that are helpful in understanding the uncountable sets better, and it introduces the C in the ZFC set theory, the Axiom of Choice, together with the issues related to the cardinality of infinite sets. Given the low level of prerequisites, the split makes sense to me but I think some students might need a bit more meta commentary on this decision.

For the interested reader, I’d also highly recommend reading the original contributions to the development of set theory by Dedekind, Cantor, Frege, Russell, Zermelo, and Fraenkel. If you are short on time, just reading Zermelo’s 1908 paper Investigations in the Foundations of Set Theory should give you a nice introduction to his concerns. (Of course, it makes more sense if you are also familiar with his earlier 1904 paper on the well-ordering theorem as well as his other 1908 paper, which provides a second proof of his well-ordering theorem. In short, the Investigations provided a stronger foundation on which he could prove the well-ordering theorem.)

Chapter 4

In this chapter, we first extend the natural numbers to integers by “closing” the set of natural numbers under subtraction. Then, we further extend the integers to rationals by once again considering the closure; this time the closure of the integers with respect to division. Once we recover both integers and rationals this way, we are still left with some gaps: for instance, we finally get the proof that $\sqrt{2}$ is not a rational number. This is usually something that is covered in the first lecture of a first real analysis course, if at all. However, by postponing this work by a few chapters, Tao provides a stronger foundation, as he explains in his preface.
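To make the two "closures" concrete, here is a sketch of the usual constructions by pairs. The pair notation and the symbol $\sim$ are assumptions of this sketch and not necessarily the notation the book uses.

```latex
% Integers: pairs of naturals (a, b), read as "a minus b"
(a, b) \sim (c, d) \iff a + d = c + b,
  \qquad a, b, c, d \in \mathbb{N}

% Rationals: pairs of integers (a, b) with b \neq 0, read as "a over b"
(a, b) \sim (c, d) \iff a d = c b,
  \qquad a, c \in \mathbb{Z},\ b, d \in \mathbb{Z} \setminus \{0\}
```

For example, $(2, 5) \sim (0, 3)$ because $2 + 3 = 0 + 5$, so both pairs stand for the integer $-3$. Note that each condition is phrased using only the operation that is already available (addition of naturals, multiplication of integers), which is what lets the new numbers be defined without circularity.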

Chapter 5

We now formalize the idea of gaps using Cauchy sequences. By using the limits of Cauchy sequences, we are finally able to “complete” the rationals. We also get to order the resulting real numbers and finally start using the least upper bound property.

I have mixed feelings about the way Cauchy sequences are introduced. I can see why it fits Tao’s purpose better to introduce Cauchy sequences before even defining the convergence of sequences (we’ll get them in Chapter 6). This way, we can finally have our real numbers and do stuff with them. However, it also feels wrong both historically and pedagogically to introduce them in this order. I think what bothers me about this sequence of events is that we are so far restricted to reals in a Euclidean metric space. So, the fact that Cauchy sequences are convergent seems trivial. However, it would be a problem if the student carried this intuition to general metric spaces.

Another nonstandard aspect of this chapter is that Cauchy sequences are defined in three steps. We first get “$\epsilon$-steadiness”, which is used to define “eventual $\epsilon$-steadiness”, which in turn becomes a building block for the definition of a Cauchy sequence; I am sure you can guess their definitions if you are familiar with Cauchy sequences. The approach to convergence, continuity, etc. that appears later in the book is also similar. I think it is a useful way to simplify the definition for newcomers. However, I am not sure if I would want to use it as the actual definition. To me, it seems more prudent to provide the regular definition, and then show that it is equivalent to eventual $\epsilon$-steadiness for every $\epsilon$. The approach taken is the opposite. It decreases the usefulness of the book as a reference significantly but it is probably useful for someone who doesn’t have the “mathematical maturity”, which is the main prerequisite mentioned in the prefaces of most mathematics textbooks. This book doesn’t have that prerequisite and appropriately, the approach is adjusted. I’d just caution the readers to also become familiar with a more standard textbook. (I won’t repeat this in each chapter for other definitions but as I mentioned, many definitions are broken into parts in this way.)
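For readers who would rather not guess, the three-step definition can be sketched as follows. I am reconstructing it from the standard definition, so the details (the non-strict inequality, the starting index, and which values of $\epsilon$ are quantified over) are assumptions and may differ from the book.

```latex
% Step 1: every pair of terms is within \epsilon
(a_n)_{n \ge 0} \text{ is } \epsilon\text{-steady}
  \iff |a_j - a_k| \le \epsilon \text{ for all } j, k \ge 0

% Step 2: some tail of the sequence is \epsilon-steady
(a_n)_{n \ge 0} \text{ is eventually } \epsilon\text{-steady}
  \iff \exists N \ge 0 : (a_n)_{n \ge N} \text{ is } \epsilon\text{-steady}

% Step 3: this holds for every tolerance
(a_n)_{n \ge 0} \text{ is Cauchy}
  \iff (a_n)_{n \ge 0} \text{ is eventually } \epsilon\text{-steady for every } \epsilon > 0

% Unfolding the three steps gives the usual one-line definition
\forall \epsilon > 0 \ \exists N \ge 0 \ \forall j, k \ge N : |a_j - a_k| \le \epsilon
```

Each step adds exactly one quantifier, which is presumably the point: the newcomer never has to parse three nested quantifiers at once.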

Chapter 6

At this point, we are almost done with the construction of reals (and familiar operations on them). There are a few loose ends that get tied up in this chapter (e.g., exponentiation with real powers is introduced here) but otherwise, we are in more familiar territory now. As a result, the coverage and the approach are also pretty similar to more standard introductory textbooks on real analysis; we are introduced to the ideas of convergence and limits, extended real numbers, suprema/infima, limsup/liminf, limit points, and subsequences.

In Example 6.3.4, the idea of countable sets is used but this notion is not defined until chapter 8. Countability is not really essential to the example so it is clearly a slip. Otherwise, I think the efficiency of this chapter proves that Tao’s approach is worth considering, especially when teaching real analysis to non-math majors.

Chapter 7

This chapter deals with series. This is one of those subjects that you’d hope that the calculus courses cover well but since you can’t count on it, you have to cover it here as it will be used later in the book. Both the coverage and the approach in this chapter are pretty standard so I won’t say more about it.

Chapter 8

Here, Tao goes back to set theory, introduces countability and Cantor’s theorem ($|2^X| > |X|$), and plays with some uncountable sets. As I noted above, this is also where we get the Axiom of Choice and discuss some of its consequences, especially Zorn’s Lemma.
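The diagonal argument behind Cantor’s theorem is short enough to check by hand on a small example. The sketch below uses a hypothetical helper `diagonal_set` and an arbitrary map `f` from a three-element set to its subsets; neither comes from the book.

```python
def diagonal_set(X, f):
    """Return D = {x in X : x not in f(x)}, a subset of X that f never hits."""
    return frozenset(x for x in X if x not in f(x))


X = {0, 1, 2}
# An arbitrary attempt to map X onto its power set.
f = {0: frozenset({0, 1}), 1: frozenset(), 2: frozenset({0, 2})}

D = diagonal_set(X, f.__getitem__)
print(D)
print(all(f[x] != D for x in X))
```

Whatever map `f` you pick, `D` differs from `f(x)` on the element `x` itself, so no map from $X$ to $2^X$ can be surjective. The finite case is of course obvious by counting; the point is that the same construction works verbatim for infinite sets.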

Chapter 9

This chapter introduces and studies continuous functions on $\mathbb{R}$. We get the $\epsilon$-$\delta$ definition of continuity (broken into pieces, as usual), the intermediate value theorem, the extreme value theorem, and uniform continuity. I think this is where the earlier investment in constructing the reals from scratch pays off the most; e.g., the intermediate value theorem feels like it follows naturally from everything we built so far, instead of arriving as some trick. The rest of the chapter is pretty standard material and the approach is not too different from what you’d find elsewhere.

Chapter 10

Differentiation. The definition of the derivative, differentiation rules, the mean value theorem, inverse function derivatives, and L’Hopital’s rule. The chain rule proof is done properly, which I appreciated; there is that well-known issue where the naive proof breaks down because $g(x) - g(x_0)$ can be zero even when $x \neq x_0$ and many textbooks handle this awkwardly (including baby Rudin, in my opinion).³ L’Hopital’s rule⁴ getting a rigorous treatment here is satisfying given that it was one of the topics that motivated the chapter 1 paradoxes.
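To spell out the chain rule issue, here is the naive computation together with one common repair. I am not claiming this is the argument the book uses; the auxiliary function $\phi$ and the shorthand $y_0 = g(x_0)$ are introduced only for this sketch.

```latex
% Naive step: divides by g(x) - g(x_0), which may vanish for x \neq x_0
\frac{f(g(x)) - f(g(x_0))}{x - x_0}
  = \frac{f(g(x)) - f(g(x_0))}{g(x) - g(x_0)} \cdot \frac{g(x) - g(x_0)}{x - x_0}
  \qquad \text{(only valid when } g(x) \neq g(x_0)\text{)}

% Repair: with y_0 = g(x_0), define
\phi(y) =
  \begin{cases}
    \dfrac{f(y) - f(y_0)}{y - y_0} & \text{if } y \neq y_0 \\[2ex]
    f'(y_0) & \text{if } y = y_0
  \end{cases}

% \phi is continuous at y_0, and for every x \neq x_0 (including g(x) = y_0)
\frac{f(g(x)) - f(g(x_0))}{x - x_0}
  = \phi(g(x)) \cdot \frac{g(x) - g(x_0)}{x - x_0}
  \longrightarrow f'(g(x_0)) \, g'(x_0) \quad \text{as } x \to x_0
```

When $g(x) = g(x_0)$, both sides of the repaired identity are zero, so no division by zero ever occurs. The limit then follows from the continuity of $g$ at $x_0$ and of $\phi$ at $y_0$.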

Chapter 11

The final chapter covers the Riemann integral. Tao builds it up using piecewise constant functions (which obviously amounts to the same thing as the direct partition-based approaches but fits Tao’s style of starting small and building up). The chapter ends with the fundamental theorems of calculus, which is a fitting conclusion to the whole journey.
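As a rough numerical illustration of the piecewise constant idea, the sketch below squeezes the integral of an increasing function between a piecewise constant function below it and one above it. The function name `piecewise_constant_bounds` is hypothetical, and the code assumes that `f` is increasing on the interval so that the smallest and largest values on each piece sit at its endpoints.

```python
def piecewise_constant_bounds(f, a, b, n):
    """Integrals of the piecewise constant minorant and majorant of an
    increasing function f on a uniform partition of [a, b] into n pieces."""
    width = (b - a) / n
    points = [a + i * width for i in range(n + 1)]
    # f increasing: the minimum on each piece is at its left endpoint,
    # the maximum at its right endpoint.
    lower = sum(f(x) * width for x in points[:-1])
    upper = sum(f(x) * width for x in points[1:])
    return lower, upper


lower, upper = piecewise_constant_bounds(lambda x: x * x, 0.0, 1.0, 1000)
print(lower, upper)
```

For $f(x) = x^2$ on $[0, 1]$ the two values bracket $\frac{1}{3}$, and the gap between them is $(f(1) - f(0)) / n$, which shrinks to zero as the partition gets finer.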

Overall

I want to be upfront: I think everyone should read this book. What follows is not an argument against reading it but an argument about when to read it. If you are mildly curious about analysis but you’ll never learn the rigorous version, this is your book. However, if you want to learn it more rigorously, I think Tao is great as a second book. I still recommend it, just not as the first book.

Honestly, I think this book is a bit too boring if this is your first time reading analysis. I know this sounds ungrateful given how clearly and carefully Tao writes but that is actually part of the problem. The book does too much for the reader. Everything is explained before being stated and then explained again later, as the good old Patrick Winston would approve.⁵ You never really have to struggle with a definition or a theorem for an entire day. I think being confused, misunderstanding stuff, failing to write the correct/detailed proof on the first try are non-negotiable parts of learning rigorous math. I know this sounds a bit like Stockholm syndrome but I do believe neuroscience actually backs me up on this.⁶ So, a book about analysis where the most famous mathematician alive breaks everything down for you sounds great but I think you end up remembering less than if you had wandered around on your own for a while first.

Baby Rudin is the opposite in this regard. It gives you essentially nothing to work with and you have to figure out why each definition matters on your own. I used to think this was a flaw; I remember being frustrated by Rudin in my first course and wishing the author would just tell me why we are doing any of this. But now I am not so sure that it was a flaw. I think the confusion is part of how you learn mathematics (or anything, really), at least beyond a certain level. A book that removes too much of the difficulty might actually be doing you a disservice. I realize there are people who disagree with me very strongly on this and I am not sure I would have agreed with myself fifteen years ago either but such is life.⁷

Having said all of that, if you read Tao after working through Rudin or a similar text, the experience is completely different. Things that confused you in Rudin suddenly make a lot more sense. The constructions that felt arbitrary now have context. You can appreciate the choices Tao made because you have already seen what it is like to learn the same material without those choices. So I think of this book as a really good and elementary commentary on real analysis by someone who thought very carefully about how to explain it. And I think that is where it is best, rather than being used as a textbook for a first course.

So my recommendation is: read Rudin first, suffer through it, and write your own proofs. Then come back and read Tao. I wish I had done it this way from the start, although in a sense I did, by accident, since I am reading Tao more than a decade after my Rudin-based courses. Of course, I had to make it harder for myself because I also came to Tao after reading Dedekind, Cantor, and most of what is in the anthology From Frege to Gödel, which actually has the original writings from Frege, Cantor, Peano, Russell, Gödel, Brouwer, and many others.

So, read it. I loved it. Just maybe read Rudin first.

¹ Many of the 18th century mathematicians wrote about this series. Leibniz, Bernoulli, Laplace, Lagrange, Euler, etc. were all incredibly smart people and their attempts are very original. To me, seemingly simple questions like this show the value of having the right concepts and definitions. Without them, even the smartest people in their time couldn’t solve the problem.
² It deserves its own books and indeed, there are several excellent texts on set theory, including Enderton, Jech, and Fraenkel, Bar-Hillel, and Levy.
³ I wouldn’t say other treatments, including Rudin, are sloppy. It is just that most of the students come with a sloppy background so unless you make the precision crystal clear, they can understand it in the way they are used to.
⁴ I should note that L’Hopital himself had essentially nothing to do with this rule. He was a French nobleman who was interested in the new calculus but was not really a mathematician in any serious sense of the word; didn’t have the brains. However, he was rich. So he started paying one of the Bernoullis to send him his latest mathematical discoveries exclusively in exchange for a good amount of money. He then published these results in a textbook, which was also the first textbook on differential calculus and became enormously influential. So, the rule we call “L’Hopital’s rule” was communicated to him by Bernoulli in a letter. I also have a vague memory of reading something about L’Hopital’s role in the Newton-Leibniz priority dispute in Antognazza’s Leibniz: An Intellectual Biography but I can’t find the passage right now.
⁵ Patrick Winston gave a famous lecture called “How to Speak” at MIT almost every year for decades. Luckily, it was recorded so you can watch it on YouTube like I did. One of his key points was the old adage: “Tell them what you’re going to tell them, tell them, then tell them what you told them.”
⁶ There is research suggesting that the act of failing first and then succeeding leads to deeper learning than succeeding on the first try. I want to write a separate post about this at some point so I am not going into the details here.
⁷ I want to be clear that terseness and sloppiness are orthogonal. Rudin is terse. Cantor is sloppy. These have nothing to do with each other. Rudin is terse because he trusts you to fill in the details; the details are there to be filled in and the structure supports it (although it doesn’t hurt to have a professor…). Cantor is sloppy because he was careless; he uses concepts before defining them, mixes personal grievances with mathematics, and makes it genuinely hard to tell if his arguments are circular. I find Rudin’s terseness productive and Cantor’s sloppiness infuriating. Similarly, Hoffman and Kunze is one of the best linear algebra books precisely because it is terse and rigorous without being sloppy. What I am arguing for is terse and clear writing. I also like reading the “classical style”, as some say, and I think terse and clear is almost the mathematical equivalent of that.
