Praise for How Not to Be Wrong
“Brilliantly engaging . . . Ellenberg’s
talent for finding real-life situations that enshrine mathematical
principles would be the envy of any math teacher. He presents these in
fluid succession, like courses in a fine restaurant, taking care to make
each insight shine through, unencumbered by jargon or notation. Part of
the sheer intellectual joy of the book is watching the author leap
nimbly from topic to topic, comparing slime molds to the Bush-Gore
Florida vote, criminology to Beethoven’s Ninth Symphony. The final
effect is of one enormous mosaic unified by mathematics.”
—Manil Suri, The Washington Post
“Easy to follow, humorously presented . . . This
book will help you to avoid the pitfalls that result from not
having the right tools. It will help you realize that mathematical
reasoning permeates our lives—that it can be, as Mr. Ellenberg
writes, a kind of ‘X-ray specs that reveal hidden structures
underneath the messy and chaotic surface of the world.’”
—Mario Livio, The Wall Street Journal
“Witty, compelling, and just plain fun to read . . . How Not to
Be Wrong can help you explore your mathematical superpowers.”
—Evelyn Lamb, Scientific American
“Mathematicians from Charles Lutwidge Dodgson to Steven Strogatz
have celebrated the power of mathematics in life and the imagination. In
this hugely enjoyable exploration of everyday maths as ‘an
atomic-powered prosthesis that you attach to your common sense,’
Jordan Ellenberg joins their ranks. Ellenberg, an academic and
Slate’s ‘Do the Math’ columnist, explains key
principles with erudite gusto—whether poking holes in predictions
of a U.S. ‘obesity apocalypse,’ or unpicking an attempt by
psychologist B. F. Skinner to prove statistically that Shakespeare was a
dud at alliteration.”
“The book is filled to the rim with anecdotes and
‘good-to-know’ facts. And Ellenberg does not shy away
from delving deeply into most topics, both in terms of the underlying
mathematical concepts and the background material, which he has
researched meticulously. . . . Whereas the book may be
aimed at a general audience, who wonder how the mathematics
they learned at school might ever be useful, there is much on offer
for those who have chosen a professional career in the sciences even
when the fundamental ideas discussed are not new. It’s a bit like
walking through a well-curated exhibition of a favored painter. Many
works you know inside out, but the context and the logic of the
presentation may offer refreshing new perspectives and insights.”
“Refreshingly lucid while still remaining conceptually rigorous,
this book lends insight into how mathematicians think—and shows us
how we can start to think like mathematicians as well.”
—The New York Times Book Review
“A poet-mathematician offers an empowering and entertaining primer
for the age of Big Data. . . . A rewarding popular math
book for just about anyone.”
—Laura Miller, Salon
“A fresh application of complex mathematical thinking to
commonplace events . . . How Not to Be
Wrong is beautifully written, holding the reader’s
attention throughout with well-chosen material, illuminating exposition,
wit, and helpful examples. I am reminded of the great writer of
recreational mathematics, Martin Gardner: Ellenberg shares
Gardner’s remarkable ability to write clearly and entertainingly,
bringing in deep mathematical ideas without the reader registering their
—Times Higher Education (London)
“Ellenberg tells engaging, even exciting stories about how
‘the problems we think about every day—problems of politics,
of medicine, of commerce, of theology—are shot through with
—The Washington Post (blog)
“A collection of fascinating examples of math and its surprising
applications . . . How Not to Be
Wrong is full of interesting and weird mathematical tools and
“Wry, accessible, and
entertaining . . . Ellenberg finds the commonsense
math at work in the everyday world, and his vivid examples and clear
descriptions show how ‘math is woven into the way we
—Publishers Weekly (starred review)
“Witty and expansive, Ellenberg’s math will leave readers
informed, intrigued, and armed with plenty of impressive conversation
“Readers will indeed marvel at how often mathematics sheds
unexpected light on economics (assessing the performance of investment
advisors), public health (predicting the likely prevalence of obesity in
thirty years), and politics (explaining why wealthy individuals vote
Republican but affluent states go for Democrats). Relying on remarkably
few technical formulas, Ellenberg writes with humor and verve as he
repeatedly demonstrates that mathematics simply extends common
“How Not to Be Wrong is a cheery manifesto for the
utility of mathematical thinking. Ellenberg’s prose is a
delight—informal and robust, irreverent yet serious. Maths is
‘an atomic-powered prosthesis that you attach to your common
sense, vastly multiplying its reach and strength,’ he writes.
Doing maths ‘is to be, at once, touched by fire and bound by
reason. Logic forms a narrow channel through which intuition flows with
vastly augmented force.’”
—The Guardian (London)
“The title of this wonderful book explains what it adds to the
honorable genre of popular writing on mathematics. Like Lewis Carroll,
George Gamow, and Martin Gardner before him, Jordan Ellenberg shows how
mathematics can delight and stimulate the mind. But he also shows that
mathematical thinking should be in the toolkit of every thoughtful
person—of everyone who wants to avoid fallacies,
superstitions, and other ways of being wrong.”
—Steven Pinker, Johnstone Family Professor of Psychology, Harvard
University, and author of How the Mind Works
“Brilliant and fascinating! Ellenberg shows his readers how to
magnify common sense using the tools usually only accessible to those
who have studied higher mathematics. I highly recommend it to anyone
interested in expanding their worldly savviness—and math
—Danica McKellar, actress and bestselling author of Math
Doesn’t Suck and Kiss My Math
“Jordan Ellenberg promises to share ways of thinking that are both
simple to grasp and profound in their implications, and he delivers in
spades. These beautifully readable pages delight and enlighten in equal
parts. Those who already love math will eat it up, and those who
don’t yet know how lovable math is are in for a most pleasurable
—Rebecca Newberger Goldstein, author of Plato at the
“With math as with anything else, there’s smart, and then
there’s street smart. This book will help you be both. Fans of
Freakonomics and The Signal and the Noise will love
Ellenberg’s surprising stories, snappy writing, and brilliant
lessons in numerical savvy. How Not to Be Wrong is sharp, funny,
—Steven Strogatz, Jacob Gould Schurman Professor of Applied
Mathematics, Cornell University, and author of The Joy of x
“Every page is a stand-alone, positive, and ontological
examination of the beauty and surprise of mathematical discovery.”
—Cathy O’Neil, Mathbabe.com
HOW NOT TO BE WRONG
Jordan Ellenberg is the Vilas Distinguished Achievement Professor of
Mathematics at the University of Wisconsin-Madison. His writing has
appeared in Slate, The Wall Street Journal, The New
York Times, The Washington Post, The Boston Globe, and
WHEN AM I GOING TO USE THIS?
Right now, in a classroom somewhere in the world, a student is mouthing
off to her math teacher. The teacher has just asked her to spend a
substantial portion of her weekend computing a list of thirty definite
integrals.
There are other things the student would rather do. There is, in fact,
hardly anything she would not rather do. She knows this quite
clearly, because she spent a substantial portion of the previous weekend
computing a different—but not very different—list of
thirty definite integrals. She doesn’t see the point, and she
tells her teacher so. And at some point in this conversation, the
student is going to ask the question the teacher fears most:
“When am I going to use this?”
Now the math teacher is probably going to say something like:
“I know this seems dull to you, but remember, you don’t know
what career you’ll choose—you may not see the relevance now,
but you might go into a field where it’ll be really important that
you know how to compute definite integrals quickly and correctly by
hand.”
This answer is seldom satisfying to the student. That’s because
it’s a lie. And the teacher and the student both know it’s a
lie. The number of adults who will ever make use of the integral of
$(1 - 3x + 4x^2)^{-2}\,dx$, or the formula for the cosine of 3θ,
or synthetic division of polynomials, can be counted on a few thousand
The lie is not very satisfying to the teacher, either. I should know: in
my many years as a math professor I’ve asked many hundreds of
college students to compute lists of definite integrals.
Fortunately, there’s a better answer. It goes something like this:
“Mathematics is not just a sequence of computations to be carried
out by rote until your patience or stamina runs out—although it
might seem that way from what you’ve been taught in courses called
mathematics. Those integrals are to mathematics as weight
training and calisthenics are to soccer. If you want to play
soccer—I mean, really play, at a competitive
level—you’ve got to do a lot of boring, repetitive,
apparently pointless drills. Do professional players ever use
those drills? Well, you won’t see anybody on the field curling a
weight or zigzagging between traffic cones. But you do see players using
the strength, speed, insight, and flexibility they built up by doing
those drills, week after tedious week. Learning those drills is part of
“If you want to play soccer for a living, or even make the varsity
team, you’re going to be spending lots of boring weekends on the
practice field. There’s no other way. But now here’s the
good news. If the drills are too much for you to take, you can still
play for fun, with friends. You can enjoy the thrill of making a slick
pass between defenders or scoring from distance just as much as a pro
athlete does. You’ll be healthier and happier than you would be if
you sat home watching the professionals on TV.
“Mathematics is pretty much the same. You may not be aiming for a
mathematically oriented career. That’s fine—most people
aren’t. But you can still do math. You probably already are
doing math, even if you don’t call it that. Math is woven into the
way we reason. And math makes you better at things. Knowing mathematics
is like wearing a pair of X-ray specs that reveal hidden structures
underneath the messy and chaotic surface of the world. Math is a science
of not being wrong about things, its techniques and habits hammered out
by centuries of hard work and argument. With the tools of mathematics in
hand, you can understand the world in a deeper, sounder, and more
meaningful way. All you need is a coach, or even just a book, to teach
you the rules and some basic tactics. I will be your coach. I will show
For reasons of time, this is seldom what I actually say in the
classroom. But in a book, there’s room to stretch out a little
more. I hope to back up the grand claims I just made by showing you that
the problems we think about every day—problems of politics, of
medicine, of commerce, of theology—are shot through with
mathematics. Understanding this gives you access to insights accessible
by no other means.
Even if I did give my student the full inspirational speech, she
might—if she is really sharp—remain unconvinced.
“That sounds good, Professor,” she’ll say. “But
it’s pretty abstract. You say that with mathematics at your
disposal you can get things right you’d otherwise get wrong. But
what kinds of things? Give me an actual example.”
And at that point I would tell her the story of Abraham Wald and the
missing bullet holes.
ABRAHAM WALD AND THE MISSING BULLET HOLES
This story, like many World War II stories, starts with the Nazis
hounding a Jew out of Europe and ends with the Nazis regretting it.
Abraham Wald was born in 1902 in what was then the city of Klausenburg
in what was then the Austro-Hungarian Empire. By the time Wald was a
teenager, one world war was in the books and his hometown had become
Cluj, Romania. He was the grandson of a rabbi and the son of a kosher
baker, but the younger Wald was a mathematician almost from the start.
His talent for the subject was quickly recognized, and he was admitted
to study mathematics at the University of Vienna, where he was drawn to
subjects abstract and recondite even by the standards of pure
mathematics: set theory and metric spaces.
But when Wald’s studies were completed, it was the mid-1930s,
Austria was deep in economic distress, and there was no possibility that
a foreigner could be hired as a professor in Vienna. Wald was rescued by
a job offer from Oskar Morgenstern. Morgenstern would later immigrate to
the United States and help invent game theory, but in 1933 he was the
director of the Austrian Institute for Economic Research, and he hired
Wald at a small salary to do mathematical odd jobs. That turned out to
be a good move for Wald: his experience in economics got him a
fellowship offer at the Cowles Commission, an economic institute then
located in Colorado Springs. Despite the ever-worsening political
situation, Wald was reluctant to take a step that would lead him away
from pure mathematics for good. But then the Nazis conquered Austria,
making Wald’s decision substantially easier. After just a few
months in Colorado, he was offered a professorship of statistics at
Columbia; he packed up once again and moved to New York.
And that was where he fought the war.
The Statistical Research Group (SRG), where Wald spent much of World War
II, was a classified program that yoked the assembled might of American
statisticians to the war effort—something like the Manhattan
Project, except the weapons being developed were equations, not
explosives. And the SRG was actually in Manhattan, at 401 West
118th Street in Morningside Heights, just a block away from Columbia
University. The building now houses Columbia faculty apartments and some
doctor’s offices, but in 1943 it was the buzzing, sparking nerve
center of wartime math. At the Applied Mathematics Group–Columbia,
dozens of young women bent over Marchant desktop calculators were
calculating formulas for the optimal curve a fighter should trace out
through the air in order to keep an enemy plane in its gunsights. In
another apartment, a team of researchers from Princeton was developing
protocols for strategic bombing. And Columbia’s wing of the atom
bomb project was right next door.
But the SRG was the most high-powered, and ultimately the most
influential, of any of these groups. The atmosphere combined the
intellectual openness and intensity of an academic department with the
shared sense of purpose that comes only with high stakes. “When we
made recommendations,” W. Allen Wallis, the director, wrote,
“frequently things happened. Fighter planes entered combat with
their machine guns loaded according to Jack Wolfowitz’s
recommendations about mixing types of ammunition, and maybe the pilots
came back or maybe they didn’t. Navy planes launched rockets whose
propellants had been accepted by Abe Girshick’s
sampling-inspection plans, and maybe the rockets exploded and destroyed
our own planes and pilots or maybe they destroyed the target.”
The mathematical talent at hand was equal to the gravity of the task. In
Wallis’s words, the SRG was “the most extraordinary group of
statisticians ever organized, taking into account both number and
quality.” Frederick Mosteller, who would later found
Harvard’s statistics department, was there. So was Leonard Jimmie
Savage, the pioneer of decision theory and great advocate of the field
that came to be called Bayesian statistics. Norbert Wiener, the MIT
mathematician and the creator of cybernetics, dropped by from time to
time. This was a group where Milton Friedman, the future Nobelist in
economics, was often the fourth-smartest person in the room.
The smartest person in the room was usually Abraham Wald. Wald
had been Allen Wallis’s teacher at Columbia, and functioned as a
kind of mathematical eminence to the group. Still an “enemy
alien,” he was not technically allowed to see the classified
reports he was producing; the joke around SRG was that the secretaries
were required to pull each sheet of notepaper out of his hands as soon
as he was finished writing on it. Wald was, in some ways, an unlikely
participant. His inclination, as it always had been, was toward
abstraction, and away from direct applications. But his motivation to
use his talents against the Axis was obvious. And when you needed to
turn a vague idea into solid mathematics, Wald was the person you wanted
at your side.
So here’s the question. You don’t want your planes to get
shot down by enemy fighters, so you armor them. But armor makes the
plane heavier, and heavier planes are less maneuverable and use more
fuel. Armoring the planes too much is a problem; armoring the planes too
little is a problem. Somewhere in between there’s an optimum. The
reason you have a team of mathematicians socked away in an apartment in
New York City is to figure out where that optimum is.
The military came to the SRG with some data they thought might be
useful. When American planes came back from engagements over Europe,
they were covered in bullet holes. But the damage wasn’t uniformly
distributed across the aircraft. There were more bullet holes in the
fuselage, not so many in the engines.
The officers saw an opportunity for efficiency; you can get the same
protection with less armor if you concentrate the armor on the places
with the greatest need, where the planes are getting hit the most. But
exactly how much more armor belonged on those parts of the plane? That
was the answer they came to Wald for. It wasn’t the answer they
The armor, said Wald, doesn’t go where the bullet holes are. It
goes where the bullet holes aren’t: on the engines.
Wald’s insight was simply to ask: where are the missing holes? The
ones that would have been all over the engine casing, if the damage had
been spread equally all over the plane? Wald was pretty sure he knew.
The missing bullet holes were on the missing planes. The reason planes
were coming back with fewer hits to the engine is that planes that got
hit in the engine weren’t coming back. Whereas the large number of
planes returning to base with a thoroughly Swiss-cheesed fuselage is
pretty strong evidence that hits to the fuselage can (and therefore
should) be tolerated. If you go to the recovery room at the hospital,
you’ll see a lot more people with bullet holes in their legs than
people with bullet holes in their chests. But that’s not because
people don’t get shot in the chest; it’s because the people
who get shot in the chest don’t recover.
Here’s an old mathematician’s trick that makes the picture
perfectly clear: set some variables to zero. In this case, the
variable to tweak is the probability that a plane that takes a hit to
the engine manages to stay in the air. Setting that probability to zero
means a single shot to the engine is guaranteed to bring the plane down.
What would the data look like then? You’d have planes coming back
with bullet holes all over the wings, the fuselage, the nose—but
none at all on the engine. The military analyst has two options for
explaining this: either the German bullets just happen to hit every part
of the plane but one, or the engine is a point of total vulnerability.
Both stories explain the data, but the latter makes a lot more sense.
The armor goes where the bullet holes aren’t.
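The "set a variable to zero" trick can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not in Wald's report: every plane takes exactly one hit, the hit lands on a section in proportion to that section's share of the plane's surface, and the area and survival numbers are invented for the example.

```python
def observed_hole_shares(area_share, survival_prob):
    """Share of bullet holes per section, counted only on planes that return.

    Assumes each plane takes a single hit, landing on a section in
    proportion to that section's share of the plane's surface.
    """
    # Fraction of all planes that are hit in a section AND make it home.
    returning = {s: area_share[s] * survival_prob[s] for s in area_share}
    total_returning = sum(returning.values())
    return {s: returning[s] / total_returning for s in returning}


# Hypothetical numbers, chosen only for illustration.
area_share = {"engine": 0.25, "fuselage": 0.40, "wings": 0.25, "nose": 0.10}
# The variable set to zero: a hit to the engine always brings the plane down.
survival = {"engine": 0.0, "fuselage": 0.9, "wings": 0.9, "nose": 0.9}

shares = observed_hole_shares(area_share, survival)
for section, share in shares.items():
    print(f"{section:9s} true hit share {area_share[section]:.2f}"
          f" -> share seen on returning planes {share:.2f}")
```

In this toy model a quarter of all hits land on the engine, yet the returning planes show none there, and the fuselage looks more heavily hit than it really is.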
Wald’s recommendations were quickly put into effect, and were
still being used by the navy and the air force through the wars in Korea
and Vietnam. I can’t tell you exactly how many American planes
they saved, though the data-slinging descendants of the SRG inside
today’s military no doubt have a pretty good idea. One thing the
American defense establishment has traditionally understood very well is
that countries don’t win wars just by being braver than the other
side, or freer, or slightly preferred by God. The winners are usually
the guys who get 5% fewer of their planes shot down, or use 5% less
fuel, or get 5% more nutrition into their infantry at 95% of the cost.
That’s not the stuff war movies are made of, but it’s the
stuff wars are made of. And there’s math every step of the way.
Why did Wald see what the officers, who had vastly more knowledge and
understanding of aerial combat, couldn’t? It comes back to his
math-trained habits of thought. A mathematician is always asking,
“What assumptions are you making? And are they justified?”
This can be annoying. But it can also be very productive. In this case,
the officers were making an assumption unwittingly: that the planes that
came back were a random sample of all the planes. If that were true, you
could draw conclusions about the distribution of bullet holes on all the
planes by examining the distribution of bullet holes on only the
surviving planes. Once you recognize that you’ve been making that
hypothesis, it takes only a moment to realize it’s dead wrong;
there’s no reason at all to expect the planes to have an equal
likelihood of survival no matter where they get hit. In a piece of
mathematical lingo we’ll come back to in chapter 15, the rate of
survival and the location of the bullet holes are correlated.
Wald’s other advantage was his tendency toward abstraction.
Wolfowitz, who had studied under Wald at Columbia, wrote that the
problems he favored were “all of the most abstract sort,”
and that he was “always ready to talk about mathematics, but
uninterested in popularization and special applications.”
Wald’s personality made it hard for him to focus his attention on
applied problems, it’s true. The details of planes and guns were,
to his eye, so much upholstery—he peered right through to the
mathematical struts and nails holding the story together. Sometimes that
approach can lead you to ignore features of the problem that really
matter. But it also lets you see the common skeleton shared by problems
that look very different on the surface. Thus you have meaningful
experience even in areas where you appear to have none.
To a mathematician, the structure underlying the bullet hole problem is
a phenomenon called survivorship bias. It arises again and again,
in all kinds of contexts. And once you’re familiar with it, as
Wald was, you’re primed to notice it wherever it’s hiding.
Like mutual funds. Judging the performance of funds is an area where you
don’t want to be wrong, even by a little bit. A shift of 1% in
annual growth might be the difference between a valuable financial asset
and a dog. The funds in Morningstar’s Large Blend category, whose
mutual funds invest in big companies that roughly represent the S&P
500, look like the former kind. The funds in this class grew an average
of 178.4% between 1995 and 2004: a healthy 10.8% per year. Sounds like
you’d do well, if you had cash on hand, to invest in those funds,
Well, no. A 2006 study by Savant Capital shone a somewhat colder light
on those numbers. Think again about how Morningstar generates its
number. It’s 2004, you take all the funds classified as Large
Blend, and you see how much they grew over the last ten years.
But something’s missing: the funds that aren’t there.
Mutual funds don’t live forever. Some flourish, some die. The ones
that die are, by and large, the ones that don’t make money. So
judging a decade’s worth of mutual funds by the ones that still
exist at the end of the ten years is like judging our pilots’
evasive maneuvers by counting the bullet holes in the planes that come
back. What would it mean if we never found more than one bullet hole per
plane? Not that our pilots are brilliant at dodging enemy fire, but that
the planes that got hit twice went down in flames.
The Savant study found that if you included the performance of the dead
funds together with the surviving ones, the rate of return dropped down
to 134.5%, a much more ordinary 8.9% per year. More recent research
backed that up: a comprehensive 2011 study in the Review of
Finance covering nearly 5,000 funds found that the excess return
rate of the 2,641 survivors is about 20% higher than the same figure
recomputed to include the funds that didn’t make it. The size of
the survivorship effect might have surprised investors, but it probably
wouldn’t have surprised Abraham Wald.
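The step from a ten-year total to a per-year figure is ordinary compound growth, and it can be checked directly. The sketch below assumes returns compound once a year over ten years; the function name `annualized_rate` is made up for the example.

```python
def annualized_rate(total_growth_pct, years):
    """Constant yearly rate (in percent) that compounds to the given
    cumulative growth (in percent) over `years` years."""
    growth_factor = 1 + total_growth_pct / 100
    return (growth_factor ** (1 / years) - 1) * 100


# Cumulative ten-year figures quoted in the text.
survivors_only = annualized_rate(178.4, 10)
with_dead_funds = annualized_rate(134.5, 10)

print(f"survivors only:  {survivors_only:.1f}% per year")
print(f"with dead funds: {with_dead_funds:.1f}% per year")
```

Under that assumption the two totals come out at roughly 10.8% and 8.9% per year, matching the figures above.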
MATHEMATICS IS THE EXTENSION OF COMMON SENSE BY OTHER MEANS
At this point my teenage interlocutor is going to stop me and ask, quite
reasonably: Where’s the math? Wald was a mathematician,
that’s true, and it can’t be denied that his solution to the
problem of the bullet holes was ingenious, but what’s mathematical
about it? There was no trig identity to be seen, no integral or
inequality or formula.
First of all: Wald did use formulas. I told the story without them,
because this is just the introduction. When you write a book explaining
human reproduction to preteens, the introduction stops short of the
really hydraulic stuff about how babies get inside Mommy’s tummy.
Instead, you start with something more like “Everything in nature
changes; trees lose their leaves in winter only to bloom again in
spring; the humble caterpillar enters its chrysalis and emerges as a
magnificent butterfly. You are part of nature too,
and . . .”
That’s the part of the book we’re in now.
But we’re all adults here. Turning off the soft focus for a
second, here’s what a sample page of Wald’s actual report
I hope that wasn’t too shocking.
Still, the real idea behind Wald’s insight doesn’t
require any of the formalism above. We’ve already explained it,
using no mathematical notation of any kind. So my student’s
question stands. What makes that math? Isn’t it just common sense?
Yes. Mathematics is common sense. On some basic level, this is
clear. How can you explain to someone why adding seven things to five
things yields the same result as adding five things to seven? You
can’t: that fact is baked into our way of thinking about combining
things together. Mathematicians like to give names to the phenomena our
common sense describes: instead of saying, “This thing
added to that thing is the same thing as that thing
added to this thing,” we say, “Addition is
commutative.” Or, because we like our symbols, we write:
For any choice of a and b, a + b = b + a.
Despite the official-looking formula, we are talking about a fact
instinctively understood by every child.
Multiplication is a slightly different story. The formula looks pretty
For any choice of a and b, a × b = b × a.
The mind, presented with this statement, does not say “no
duh” quite as instantly as it does for addition. Is it
“common sense” that two sets of six things amount to the
same as six sets of two?
Maybe not; but it can become common sense. Here’s my
earliest mathematical memory. I’m lying on the floor in my
parents’ house, my cheek pressed against the shag rug, looking at
the stereo. Very probably I am listening to side two of the
Beatles’ Blue Album. Maybe I’m six. This is the seventies,
and therefore the stereo is encased in a pressed wood panel, which has a
rectangular array of airholes punched into the side. Eight holes across,
six holes up and down. So I’m lying there, looking at the
airholes. The six rows of holes. The eight columns of holes. By focusing
my gaze in and out I could make my mind flip back and forth between
seeing the rows and seeing the columns. Six rows with eight holes each.
Eight columns with six holes each.
And then I had it—eight groups of six were the same as six groups
of eight. Not because it was a rule I’d been told, but because it
could not be any other way. The number of holes in the panel was the
number of holes in the panel, no matter which way you counted them.
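The same observation can be written as a small Python sketch: build the panel once, then count it two ways. The variable names are invented for the illustration.

```python
ROWS, COLS = 6, 8  # six rows of eight airholes, as on the stereo panel

# Represent each hole by its (row, column) position.
holes = [(r, c) for r in range(ROWS) for c in range(COLS)]

# Count row by row: six groups of eight.
by_rows = sum(len([h for h in holes if h[0] == r]) for r in range(ROWS))
# Count column by column: eight groups of six.
by_cols = sum(len([h for h in holes if h[1] == c]) for c in range(COLS))

# Both tallies walk over the same set of holes, so they must agree.
print(by_rows, by_cols, len(holes))
```

Nothing in the code "applies" the commutative law. The two counts agree because they are counts of the same panel.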
We tend to teach mathematics as a long list of rules. You learn them in
order and you have to obey them, because if you don’t obey them
you get a C-. This is not mathematics. Mathematics is the study
of things that come out a certain way because there is no other way they
could possibly be.
Now let’s be fair: not everything in mathematics can be made as
perfectly transparent to our intuition as addition and multiplication.
You can’t do calculus by common sense. But calculus is still
derived from our common sense—Newton took our physical
intuition about objects moving in straight lines, formalized it, and
then built on top of that formal structure a universal mathematical
description of motion. Once you have Newton’s theory in hand, you
can apply it to problems that would make your head spin if you had no
equations to help you. In the same way, we have built-in mental systems
for assessing the likelihood of an uncertain outcome. But those systems
are pretty weak and unreliable, especially when it comes to events of
extreme rarity. That’s when we shore up our intuition with a few
sturdy, well-placed theorems and techniques, and make out of it a
mathematical theory of probability.
The specialized language in which mathematicians converse with one
another is a magnificent tool for conveying complex ideas precisely and
swiftly. But its foreignness can create among outsiders the impression
of a sphere of thought totally alien to ordinary thinking. That’s
Math is like an atomic-powered prosthesis that you attach to your common
sense, vastly multiplying its reach and strength. Despite the power of
mathematics, and despite its sometimes forbidding notation and
abstraction, the actual mental work involved is little different from
the way we think about more down-to-earth problems. I find it helpful to
keep in mind an image of Iron Man punching a hole through a brick wall.
On the one hand, the actual wall-breaking force is being supplied, not
by Tony Stark’s muscles, but by a series of exquisitely
synchronized servomechanisms powered by a compact beta particle
generator. On the other hand, from Tony Stark’s point of view,
what he is doing is punching a wall, exactly as he would without the
armor. Only much, much harder.
To paraphrase Clausewitz: Mathematics is the extension of common sense
by other means.
Without the rigorous structure that math provides, common sense can lead
you astray. That’s what happened to the officers who wanted to
armor the parts of the planes that were already strong enough. But
formal mathematics without common sense—without the constant
interplay between abstract reasoning and our intuitions about quantity,
time, space, motion, behavior, and uncertainty—would just be a
sterile exercise in rule-following and bookkeeping. In other words, math
would actually be what the peevish calculus student believes it to be.
That’s a real danger. John von Neumann, in his 1947 essay
“The Mathematician,” warned:
As a mathematical discipline travels far from its empirical source, or
still more, if it is a second and third generation only indirectly
inspired by ideas coming from “reality” it is beset with
very grave dangers. It becomes more and more purely aestheticizing, more
and more purely l’art pour l’art. This need not be
bad, if the field is surrounded by correlated subjects, which still have
closer empirical connections, or if the discipline is under the
influence of men with an exceptionally well-developed taste. But there
is a grave danger that the subject will develop along the line of least
resistance, that the stream, so far from its source, will separate into
a multitude of insignificant branches, and that the discipline will
become a disorganized mass of details and complexities. In other words,
at a great distance from its empirical source, or after much
“abstract” inbreeding, a mathematical subject is in danger
of degeneration.
WHAT KINDS OF MATHEMATICS WILL APPEAR IN THIS BOOK?
If your acquaintance with mathematics comes entirely from school, you
have been told a story that is very limited, and in some important ways
false. School mathematics is largely made up of a sequence of facts and
rules, facts which are certain, rules which come from a higher authority
and cannot be questioned. It treats mathematical matters as completely
Mathematics is not settled. Even concerning the basic objects of study,
like numbers and geometric figures, our ignorance is much greater than
our knowledge. And the things we do know were arrived at only after
massive effort, contention, and confusion. All this sweat and tumult is
carefully screened off in your textbook.
There are facts and there are facts, of course. There has never been
much controversy about whether 1 + 2 = 3. The question of how
and whether we can truly prove that 1 + 2 = 3, which wobbles
uneasily between mathematics and philosophy, is another story—we
return to that at the end of the book. But that the computation is
correct is a plain truth. The tumult lies elsewhere. We’ll come
within sight of it several times.
Mathematical facts can be simple or complicated, and they can be shallow
or profound. This divides the mathematical universe into four quadrants:
Basic arithmetic facts, like 1 + 2 = 3, are simple and shallow. So
are basic identities like sin(2x) = 2 sin x cos x or the quadratic
formula: they might be slightly harder to convince yourself of than 1 +
2 = 3, but in the end they don’t have much conceptual heft.
Moving over to complicated/shallow, you have the problem of multiplying
two ten-digit numbers, or the computation of an intricate definite
integral, or, given a couple of years of graduate school, the trace of
Frobenius on a modular form of conductor 2377. It’s conceivable
you might, for some reason, need to know the answer to such a problem,
and it’s undeniable that it would be somewhere between annoying
and impossible to work it out by hand; or, as in the case of the modular
form, it might take some serious schooling even to understand
what’s being asked for. But knowing those answers doesn’t
really enrich your knowledge about the world.
The complicated/profound quadrant is where professional mathematicians
like me try to spend most of our time. That’s where the celebrity
theorems and conjectures live: the Riemann Hypothesis, Fermat’s
Last Theorem,* the Poincaré Conjecture, P vs. NP,
Gödel’s Theorem . . . Each one of these
theorems involves ideas of deep meaning, fundamental importance,
mind-blowing beauty, and brutal technicality, and each of them is the
protagonist of books of its own.
But not this book. This book is going to hang out in the upper left
quadrant: simple and profound. The mathematical ideas we want to address
are ones that can be engaged with directly and profitably, whether your
mathematical training stops at pre-algebra or extends much further. And
they are not “mere facts,” like a simple statement of
arithmetic—they are principles, whose application extends far
beyond the things you’re used to thinking of as mathematical. They
are the go-to tools on the utility belt, and used properly they will
help you not be wrong.
Pure mathematics can be a kind of convent, a quiet place safely cut off
from the pernicious influences of the world’s messiness and
inconsistency. I grew up inside those walls. Other math kids I knew were
tempted by applications to physics, or genomics, or the black art of
hedge fund management, but I wanted no such rumspringa.* As a
graduate student, I dedicated myself to number theory, what Gauss called
“the queen of mathematics,” the purest of the pure
subjects, the sealed garden at the center of the convent, where we
contemplated the same questions about numbers and equations that
troubled the Greeks and have gotten hardly less vexing in the
twenty-five hundred years since.
At first I worked on number theory with a classical flavor, proving
facts about sums of fourth powers of whole numbers that I could, if
pressed, explain to my family at Thanksgiving, even if I couldn’t
explain how I proved what I proved. But before long I got enticed into
even more abstract realms, investigating problems where the basic
actors—“residually modular Galois representations,”
“cohomology of moduli schemes,” “dynamical systems on
homogeneous spaces,” things like that—were impossible to
talk about outside the archipelago of seminar halls and faculty lounges
that stretches from Oxford to Princeton to Kyoto to Paris to Madison,
Wisconsin, where I’m a professor now. When I tell you this stuff
is thrilling, and meaningful, and beautiful, and that I’ll never
get tired of thinking about it, you may just have to believe me, because
it takes a long education just to get to the point where the objects of
study rear into view.
But something funny happened. The more abstract and distant from lived
experience my research got, the more I started to notice how much math
was going on in the world outside the walls. Not Galois representations
or cohomology, but ideas that were simpler, older, and just as
deep—the northwest quadrant of the conceptual foursquare. I
started writing articles for magazines and newspapers about the way the
world looked through a mathematical lens, and I found, to my surprise,
that even people who said they hated math were willing to read them. It
was a kind of math teaching, but very different from what we do in a
What it has in common with the classroom is that the reader gets asked
to do some work. Back to von Neumann on “The Mathematician”:
“It is harder to understand the mechanism of an airplane, and the
theories of the forces which lift and which propel it, than merely to
ride in it, to be elevated and transported by it—or even to steer
it. It is exceptional that one should be able to acquire the
understanding of a process without having previously acquired a deep
familiarity with running it, with using it, before one has assimilated
it in an instinctive and empirical way.”
In other words: it is pretty hard to understand mathematics
without doing some mathematics. There’s no royal road to
geometry, as Euclid told Ptolemy, or maybe, depending on your source, as
Menaechmus told Alexander the Great. (Let’s face it, famous old
maxims attributed to ancient scientists are probably made up, but
they’re no less instructive for that.)
This will not be the kind of book where I make grand, vague gestures at
great monuments of mathematics, and instruct you in the proper manner of
admiring them from a great distance. We are here to get our hands a
little dirty. We’ll compute some things. There will be a few
formulas and equations, when I need them to make a point. No formal math
beyond arithmetic will be required, though lots of math way beyond
arithmetic will be explained. I’ll draw some crude graphs and
charts. We’ll encounter some topics from school math, outside
their usual habitat; we’ll see how trigonometric functions
describe the extent to which two variables are related to each other,
what calculus has to say about the relationship between linear and
nonlinear phenomena, and how the quadratic formula serves as a cognitive
model for scientific inquiry. And we’ll also run into some of the
mathematics that usually gets put off to college or beyond, like the
crisis in set theory, which appears here as a kind of metaphor for
Supreme Court jurisprudence and baseball umpiring; recent developments
in analytic number theory, which demonstrate the interplay between
structure and randomness; and information theory and combinatorial
designs, which help explain how a group of MIT undergrads won millions
of dollars by understanding the guts of the Massachusetts state lottery.
There will be occasional gossip about mathematicians of note, and a
certain amount of philosophical speculation. There will even be a proof
or two. But there will be no homework, and there will be no test.
Includes: the Laffer curve, calculus explained in one page, the Law
of Large Numbers, assorted terrorism analogies, “Everyone in
America will be overweight by 2048,” why South Dakota has more
brain cancer than North Dakota, the ghosts of departed quantities, the
habit of definition
LESS LIKE SWEDEN
A few years ago, in the heat of the battle over the Affordable Care Act,
Daniel J. Mitchell of the libertarian Cato Institute posted a blog entry
with the provocative title “Why Is Obama Trying to Make America
More Like Sweden when Swedes Are Trying to Be Less Like Sweden?”
Good question! When you put it that way, it does seem pretty perverse.
Why, Mr. President, are we swimming against the current of history,
while social welfare states around the world—even rich little
Sweden!—are cutting back on expensive benefits and high taxes?
“If Swedes have learned from their mistakes and are now trying to
reduce the size and scope of government,” Mitchell writes,
“why are American politicians determined to repeat those
Answering this question will require an extremely scientific chart.
Here’s what the world looks like to the Cato Institute:
The x-axis represents Swedishness,* and the y-axis is some measure of
prosperity. Don’t worry about exactly how we’re quantifying
these things. The point is just this: according to the chart, the more
Swedish you are, the worse off your country is. The Swedes, no fools,
have figured this out and are launching their northwestward climb toward
free-market prosperity. But Obama’s sliding in the wrong
At the top of the facing page I’ve drawn the same picture from the
point of view of people whose economic views are closer to President
Obama’s than to those of the Cato Institute.
This picture gives very different advice about how Swedish we should be.
Where do we find peak prosperity? At a point more Swedish than America,
but less Swedish than Sweden. If this picture is right, it makes perfect
sense for Obama to beef up our welfare state while the Swedes trim
The difference between the two pictures is the difference between
linearity and nonlinearity, one of the central distinctions in
mathematics. The Cato curve is a line;* the non-Cato curve, the one with
the hump in the middle, is not. A line is one kind of curve, but not the
only kind, and lines enjoy all kinds of special properties that curves
in general may not. The highest point on a line segment—the
maximum prosperity, in this example—has to be on one end or the
other. That’s just how lines are. If lowering taxes is good for
prosperity, then lowering taxes even more is even better. And if Sweden
wants to de-Swede, so should we. Of course, an anti-Cato think tank
might posit that the line slopes in the other direction, going southwest
to northeast. And if that’s what the line looks like, then no
amount of social spending is too much. The optimal policy is Maximum
Usually, when someone announces they’re a “nonlinear
thinker” they’re about to apologize for losing something you
lent them. But nonlinearity is a real thing! And in this context,
thinking nonlinearly is crucial, because not all curves are lines. A
moment of reflection will tell you that the real curves of economics
look like the second picture, not the first. They’re nonlinear.
Mitchell’s reasoning is an example of false
linearity—he’s assuming, without coming right out and
saying so, that the course of prosperity is described by the line
segment in the first picture, in which case Sweden stripping down its
social infrastructure means we should do the same.
But as long as you believe there’s such a thing as too much
welfare state and such a thing as too little, you know the linear
picture is wrong. Some principle more complicated than “More
government bad, less government good” is in effect. The generals
who consulted Abraham Wald faced the same kind of situation: too little
armor meant planes got shot down, too much meant the planes
couldn’t fly. It’s not a question of whether adding more
armor is good or bad; it could be either, depending on how heavily
armored the planes are to start with. If there’s an optimal
answer, it’s somewhere in the middle, and deviating from it in
either direction is bad news.
Nonlinear thinking means which way you should go depends on where you
This insight isn’t new. Already in Roman times we find
Horace’s famous remark “Est modus in rebus, sunt certi
denique fines, quos ultra citraque nequit consistere rectum”
(“There is a proper measure in things. There are, finally, certain
boundaries short of and beyond which what is right cannot exist”).
And further back still, in the Nicomachean Ethics, Aristotle
observes that eating either too much or too little is troubling to the
constitution. The optimum is somewhere in between, because the relation
between eating and health isn’t linear, but curved, with bad
outcomes on both ends.
The irony is that economic conservatives like the folks at Cato used to
understand this better than anybody. That second picture I drew up
there? The extremely scientific one with the hump in the middle? I am
not the first person to draw it. It’s called the Laffer
curve, and it’s played a central role in Republican economics
for almost forty years. By the middle of the Reagan administration, the
curve had become such a commonplace of economic discourse that Ben Stein
ad-libbed it into his famous soul-killing lecture in Ferris
Bueller’s Day Off:
Anyone know what this is? Class? Anyone? . . . Anyone?
Anyone seen this before? The Laffer curve. Anyone know what this says?
It says that at this point on the revenue curve, you will get exactly
the same amount of revenue as at this point. This is very controversial.
Does anyone know what Vice President Bush called this in 1980? Anyone?
Something-doo economics. “Voodoo” economics.
The legend of the Laffer curve goes like this: Arthur Laffer, then an
economics professor at the University of Chicago, had dinner one night
in 1974 with Dick Cheney, Donald Rumsfeld, and Wall Street
Journal editor Jude Wanniski at an upscale hotel restaurant in
Washington, DC. They were tussling over President Ford’s tax plan,
and eventually, as intellectuals do when the tussling gets heavy, Laffer
commandeered a napkin* and drew a picture. The picture looked like this:
The horizontal axis here is level of taxation, and the vertical axis
represents the amount of revenue the government takes in from taxpayers.
On the left edge of the graph, the tax rate is 0%; in that case, by
definition, the government gets no tax revenue. On the right, the tax
rate is 100%; whatever income you have, whether from a business you run
or a salary you’re paid, goes straight into Uncle Sam’s bag.
Which is empty. Because if the government vacuums up every cent of the
wage you’re paid to show up and teach school, or sell hardware, or
middle-manage, why bother doing it? Over on the right edge of the graph,
people don’t work at all. Or, if they work, they do so in informal
economic niches where the tax collector’s hand can’t reach.
The government’s revenue is zero once again.
In the intermediate range in the middle of the curve, where the
government charges us somewhere between none of our income and all of
it—in other words, in the real world—the government does
take in some amount of revenue.
That means the curve recording the relationship between tax rate and
government revenue cannot be a straight line. If it were, revenue would
be maximized at either the left or right edge of the graph; but
it’s zero in both places. If the current income tax is really
close to zero, so that you’re on the left-hand side of the graph,
then raising taxes increases the amount of money the government has
available to fund services and programs, just as you might intuitively
expect. But if the rate is close to 100%, raising taxes actually
decreases government revenue. If you’re to the right of the
Laffer peak, and you want to decrease the deficit without cutting
spending, there’s a simple and politically peachy solution: lower
the tax rate, and thereby increase the amount of taxes you take in.
Which way you should go depends on where you are.
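To make the argument concrete, here is a minimal sketch of a toy revenue curve in Python. The shape is purely an assumption for illustration: taxable activity is taken to shrink in a straight line as the rate rises, which is not something Laffer or the text specifies. The function names `revenue` and `effect_of_raising` and the `base` figure are hypothetical. The only features that matter are the ones the argument relies on: zero revenue at 0%, zero at 100%, and something positive in between.

```python
def revenue(rate, base=100.0):
    """Toy model: revenue = rate * taxable activity.

    Assumption (illustrative only): activity falls linearly from `base`
    at a 0% rate to nothing at a 100% rate.
    """
    activity = base * (1.0 - rate)
    return rate * activity


def effect_of_raising(rate, step=0.01):
    """Change in revenue when the rate goes up by `step`."""
    return revenue(rate + step) - revenue(rate)


# Revenue is zero at both edges and positive in the middle.
for rate in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"rate {rate:.0%}: revenue {revenue(rate):.2f}")

# The same policy move has opposite effects depending on the starting point.
print("raise from 10%:", effect_of_raising(0.10))
print("raise from 90%:", effect_of_raising(0.90))
```

In this toy model the peak happens to sit at 50%, but that is an artifact of the assumed shape and says nothing about where a real economy's peak might be.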
So where are we? That’s where things get sticky. In 1974, the top
income tax rate was 70%, and the idea that America was on the right-hand
downslope of the Laffer curve held a certain appeal—especially for
the few people lucky enough to pay tax at that rate, which only applied
to income beyond the first $200,000.* And the Laffer curve had a potent
advocate in Wanniski, who brought his theory into the public
consciousness in a 1978 book rather self-assuredly titled The Way the
World Works.* Wanniski was a true believer, with the right mix of
zeal and political canniness to get people to listen to an idea
considered fringy even by tax-cut advocates. He was untroubled by being
called a nut. “Now, what does ‘nut’ mean?” he
asked an interviewer. “Thomas Edison was a nut, Leibniz was a nut,
Galileo was a nut, so forth and so on. Everybody who comes with a new
idea to the conventional wisdom, comes with an idea that’s so far
outside the mainstream, that’s considered nutty.”
(Aside: it’s important to point out here that people with
out-of-the-mainstream ideas who compare themselves to Edison and Galileo
are never actually right. I get letters with this kind of
language at least once a month, usually from people who have
“proofs” of mathematical statements that have been known for
hundreds of years to be false. I can guarantee you Einstein did not go
around telling people, “Look, I know this theory of general
relativity sounds wacky, but that’s what they said about
The Laffer curve, with its compact visual representation and its
agreeably counterintuitive sting, turned out to be an easy sell for
politicians with a preexisting hunger for tax cuts. As economist Hal
Varian put it, “You can explain it to a Congressman in six minutes
and he can talk about it for six months.” Wanniski became an
adviser first to Jack Kemp, then to Ronald Reagan, whose experiences as
a wealthy movie star in the 1940s formed the template for his view of
the economy four decades later. His budget director, David Stockman,
“I came into the Big Money making pictures during World War
II,” [Reagan] would always say. At that time the wartime income
surtax hit 90 percent. “You could only make four pictures and then
you were in the top bracket,” he would continue. “So we all
quit working after about four pictures and went off to the
country.” High tax rates caused less work. Low tax rates caused
more. His experience proved it.
These days it’s hard to find a reputable economist who thinks
we’re on the downslope of the Laffer curve. Maybe that’s not
surprising, considering top incomes are currently taxed at just 35%, a
rate that would have seemed absurdly low for most of the twentieth
century. But even in Reagan’s day, we were probably on the
left-hand side of the curve. Greg Mankiw, an economist at Harvard and a
Republican who chaired the Council of Economic Advisers under the second
President Bush, writes in his microeconomics textbook:
Subsequent history failed to confirm Laffer’s conjecture that
lower tax rates would raise tax revenue. When Reagan cut taxes after he
was elected, the result was less tax revenue, not more. Revenue from
personal income taxes (per person, adjusted for inflation) fell by 9
percent from 1980 to 1984, even though average income (per person,
adjusted for inflation) grew by 4 percent over this period. Yet once the
policy was in place, it was hard to reverse.
Some sympathy for the supply-siders is now in order. First of all,
maximizing government revenue needn’t be the goal of tax policy.
Milton Friedman, whom we last met during World War II doing classified
military work for the Statistical Research Group, went on to become a
Nobel-winning economist and adviser to presidents, and a powerful
advocate for low taxes and libertarian philosophy. Friedman’s
famous slogan on taxation is “I am in favor of cutting taxes under
any circumstances and for any excuse, for any reason, whenever
it’s possible.” He didn’t think we should be aiming
for the top of the Laffer curve, where government tax revenue is as high
as it can be. For Friedman, money obtained by the government would
eventually be money spent by the government, and that money, he felt,
was more often spent badly than well.
More moderate supply-side thinkers, like Mankiw, argue that lower taxes
can increase the motivation to work hard and launch businesses, leading
eventually to a bigger, stronger economy, even if the immediate effect
of the tax cut is decreased government revenue and bigger deficits. An
economist with more redistributionist sympathies would observe that this
cuts both ways; maybe the government’s diminished ability to spend
means it constructs less infrastructure, regulates fraud less
stringently, and generally does less of the work that enables free
enterprise to thrive.
Mankiw also points out that the very richest people—the ones
who’d been paying 70% on the top tranche of their
income—did contribute more tax revenue after Reagan’s
tax cuts.* That leads to the somewhat vexing possibility that the way to
maximize government revenue is to jack up taxes on the middle class, who
have no choice but to keep on working, while slashing rates on the rich;
those guys have enough stockpiled wealth to make credible threats to
withhold or offshore their economic activity, should their government
charge them a rate they deem too high. If that story’s right, a
lot of liberals will uncomfortably climb in the boat with Milton
Friedman: maybe maximizing tax revenue isn’t so great after all.
Mankiw’s final assessment is a rather polite “Laffer’s
argument is not completely without merit.” I would give Laffer
more credit than that! His drawing made the fundamental and
incontrovertible mathematical point that the relationship between
taxation and revenue is necessarily nonlinear. It doesn’t, of
course, have to be a single smooth hill like the one Laffer sketched; it
could look like a trapezoid
or a Bactrian camel’s back
or a wildly oscillating free-for-all*
but if it slopes upward in one place, it has to slope downward somewhere
else. There is such a thing as being too Swedish. That’s a
statement no economist would disagree with. It’s also, as Laffer
himself pointed out, something that was understood by many social
scientists before him. But to most people, it’s not at all
obvious—at least, not until you see the picture on the napkin.
Laffer understood perfectly well that his curve didn’t have the
power to tell you whether or not any given economy at any given time was
overtaxed or not. That’s why he didn’t draw any numbers on
the picture. Questioned during congressional testimony about the precise
location of the optimal tax rate, he conceded, “I cannot measure
it frankly, but I can tell you what the characteristics of it are; yes,
sir.” All the Laffer curve says is that lower taxes could, under
some circumstances, increase tax revenue; but figuring out what those
circumstances are requires deep, difficult, empirical work, the kind of
work that doesn’t fit on a napkin.
There’s nothing wrong with the Laffer curve—only with the
uses people put it to. Wanniski and the politicians who followed his
panpipe fell prey to the oldest false syllogism in the book:
It could be the case that lowering taxes will increase government
I want it to be the case that lowering taxes will increase
Therefore, it is the case that lowering taxes will increase
STRAIGHT LOCALLY, CURVED GLOBALLY
You might not have thought you needed a professional mathematician to
tell you that not all curves are straight lines. But linear reasoning is
everywhere. You’re doing it every time you say that if something
is good to have, having more of it is even better. Political shouters
rely on it: “You support military action against Iran? I guess
you’d like to launch a ground invasion of every country
that looks at us funny!” Or, on the other hand,
“Engagement with Iran? You probably also think Adolf
Hitler was just misunderstood.”
Why is this kind of reasoning so popular, when a moment’s thought
reveals its wrongness? Why would anyone think, even for a second, that
all curves are straight lines, when they’re obviously not?
One reason is that, in a sense, they are. That story starts with
What’s the area of the following circle?
In the modern world, that’s a problem so standard you could put it
on the SAT. The area of a circle is πr², and in this case the
radius r is 1, so the area is π. But two thousand years ago this
was a vexing open question, important enough to draw the attention of
Why was it so hard? For one thing, the Greeks didn’t really think
of π as a number, as we do. The numbers they understood were whole
numbers, numbers that counted things: 1, 2, 3, 4 . . .
But the first great success of Greek geometry—the Pythagorean
Theorem*—turned out to be the ruin of their number system.
Here’s a picture:
The Pythagorean Theorem tells you that the square of the
hypotenuse—the side drawn diagonally here, the one that
doesn’t touch the right angle—is the sum of the squares of
the other two sides, or legs. In this picture, that says the
square of the hypotenuse is 1² + 1² = 1 + 1 = 2. In
particular, the hypotenuse is longer than 1 and shorter than 2 (as you
can check with your eyeballs, no theorem required). That the length is
not a whole number was not, in itself, a problem for the Greeks. Maybe
we just measured everything in the wrong units. If we choose our unit of
length to make the legs 5 units long, you can check with a ruler that
the hypotenuse is just about 7 units long. Just about—but a bit
too long. For the square of the hypotenuse is
5² + 5² = 25 + 25 = 50
and if the hypotenuse were 7, its square would be 7 × 7 = 49.
Or if you make the legs 12 units long, the hypotenuse is almost exactly
17 units, but is tantalizingly too short, because 12² + 12² is 288, a
smidgen less than 17², which is 289.
And at some point around the fifth century BCE, a member of the
Pythagorean school made a shocking discovery: there was no way to
measure the isosceles right triangle so that the length of each side was
a whole number. Modern people would say “the square root of 2 is
irrational”—that is, it is not the ratio of any two whole
numbers. But the Pythagoreans would not have said that. How could they?
Their notion of quantity was built on the idea of proportions between
whole numbers. To them, the length of that hypotenuse had been revealed
to be not a number at all.
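The near misses above are easy to hunt for by machine. The following is a small Python sketch, with hypothetical helper names, that tries every whole-number leg up to a limit and reports the cases where twice the leg's square lands within 1 of a perfect square. A finite search like this is not a proof of irrationality; it only shows that the exact hit never turns up in the range searched, while the off-by-one cases keep coming.

```python
from math import isqrt


def hypotenuse_squared(leg):
    """Square of the hypotenuse of an isosceles right triangle with this leg."""
    return leg * leg + leg * leg


def near_misses(max_leg):
    """Return (leg, candidate, gap) with gap = candidate**2 - hypotenuse**2, |gap| <= 1."""
    hits = []
    for leg in range(1, max_leg + 1):
        target = hypotenuse_squared(leg)
        root = isqrt(target)  # largest whole number whose square is <= target
        for candidate in (root, root + 1):
            gap = candidate * candidate - target
            if abs(gap) <= 1:
                hits.append((leg, candidate, gap))
    return hits


for leg, candidate, gap in near_misses(100):
    print(f"legs {leg}, hypotenuse nearly {candidate}: off by {gap:+d} in the square")
```

A gap of -1 means the candidate is slightly too short to be the hypotenuse (as with 7 against legs of 5), and +1 means it is slightly too long (as with 17 against legs of 12). A gap of 0 would be an exact whole-number triangle, and none appears.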
This caused a fuss. The Pythagoreans, you have to remember, were
extremely weird. Their philosophy was a chunky stew of things we’d
now call mathematics, things we’d now call religion, and things
we’d now call mental illness. They believed that odd numbers were
good and even numbers evil; that a planet identical to our own, the
Antichthon, lay on the other side of the sun; and that it was wrong to
eat beans, by some accounts because they were the repository of dead
people’s souls. Pythagoras himself was said to have had the
ability to talk to cattle (he told them not to eat beans) and to have
been one of the very few ancient Greeks to wear pants.
The mathematics of the Pythagoreans was inseparably bound up with their
ideology. The story (probably not really true, but it gives the right
impression of the Pythagorean style) is that the Pythagorean who
discovered the irrationality of the square root of 2 was a man named
Hippasus, whose reward for proving such a nauseating theorem was to be
tossed into the sea by his colleagues, to his death.
But you can’t drown a theorem. The Pythagoreans’ successors,
like Euclid and Archimedes, understood that you had to roll up your
sleeves and measure things, even if this brought you outside the
pleasant walled garden of the whole numbers. No one knew whether the
area of a circle could be expressed using whole numbers alone.* But
wheels must be built and silos filled;* so the measurement must be done.
The original idea comes from Eudoxus of Cnidus; Euclid included it as
Book 12 of the Elements. But it was Archimedes who really brought the
project to its full fruition. Today we call his approach the method
of exhaustion. And it starts like this.
The square in the picture is called the inscribed square; each of
its corners just touches the circle, but it doesn’t extend beyond
the circle’s boundary. Why do this? Because circles are mysterious
and intimidating, and squares are easy. If you have before you a square
whose side has length X, its area is X times X—indeed,
that’s why we call the operation of multiplying a number by itself
squaring! A basic rule of mathematical life: if the universe hands you a
hard problem, try to solve an easier one instead, and hope the simple
version is close enough to the original problem that the universe
The inscribed square breaks up into four triangles, each of which is
none other than the isosceles triangle we just drew.* So the
square’s area is four times the area of the triangle. That
triangle, in turn, is what you get when you take a 1 × 1 square and cut
it diagonally in half like a tuna fish sandwich.
The area of the tuna fish sandwich is 1 × 1 = 1, so the
area of each triangular half-sandwich is 1/2, and the area of the
inscribed square is 4 × 1/2, or 2.
By the way, suppose you don’t know the Pythagorean Theorem.
Guess what—you do now! Or at least you know what it has to say
about this particular right triangle. Because the right triangle that
makes up the lower half of the tuna fish sandwich is exactly the same as
the one that is the northwest quarter of the inscribed square. And its
hypotenuse is the inscribed square’s side. So when you square the
hypotenuse, you get the area of the inscribed square, which is 2. That
is, the hypotenuse is that number which, when squared, yields 2; or, in
the usual more concise lingo, the square root of 2.
The inscribed square is entirely contained within the circle. If its
area is 2, the area of the circle must be at least 2.
Now we draw another square.
This one is called the circumscribed square; it, too, touches the
circle at just four points. But this square contains the circle. Its
sides have length 2, so its area is 4; and so we know the area of the
circle is at most 4.
To have shown that π is between 2 and 4 is perhaps not so impressive.
But Archimedes is just getting started. Take the four corners of your
inscribed square and mark new points on the circle halfway between each
adjacent pair of corners. Now you’ve got eight equally spaced
points, and when you connect those, you get an inscribed octagon, or, in
technical language, a “stop sign”:
Computing the area of the inscribed octagon is a bit harder, and
I’ll spare you the trigonometry. The important thing is that
it’s about straight lines and angles, not curves, and so it was
doable with the methods available to Archimedes. And the area is twice
the square root of 2, which is about 2.83.
You can play the same game with the circumscribed octagon
whose area is 8(√2 − 1), a little over 3.31.
So the area of the circle is trapped in between 2.83 and 3.31.
Why stop there? You can stick points in between the corners of the
octagon (whether inscribed or circumscribed) to make a 16-gon; after
some more trigonometric figuring, that tells you that the area of the
circle is in between 3.06 and 3.18. Do it again, to make a 32-gon; and
again, and again, and pretty soon you have something that looks like
Wait, isn’t that just the circle? Of course not! It’s a
regular polygon with 65,536 sides. Couldn’t you tell?
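The squeeze can be checked numerically with a short Python sketch. One caveat, stated loudly: this version cheats by using modern sine and tangent functions, which already know π, so it confirms the numbers quoted above rather than reenacting what Archimedes did with straightedge-and-angle reasoning. The function names are hypothetical, and the area formulas are the standard ones for regular polygons inside and around a circle of radius 1.

```python
import math


def inscribed_area(sides, radius=1.0):
    """Area of a regular polygon whose corners lie on the circle."""
    return 0.5 * sides * radius * radius * math.sin(2 * math.pi / sides)


def circumscribed_area(sides, radius=1.0):
    """Area of a regular polygon whose sides just touch the circle."""
    return sides * radius * radius * math.tan(math.pi / sides)


# Start from the square and keep doubling the number of sides.
sides = 4
while sides <= 65536:
    low = inscribed_area(sides)
    high = circumscribed_area(sides)
    print(f"{sides:>6}-gon: {low:.6f} < area of circle < {high:.6f}")
    sides *= 2
```

Each doubling shrinks the gap between the lower and upper bound, which is the fringe being "exhausted."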
The great insight of Eudoxus and Archimedes was that it doesn’t
matter whether it’s a circle or a polygon with very many very
short sides. The two areas will be close enough for any purpose you
might have in mind. The area of the little fringe between the circle and
the polygon has been “exhausted” by our relentless
iteration. The circle has a curve to it, that’s true. But every
tiny little piece of it can be well approximated by a perfectly straight
line, just as the tiny little patch of the earth’s surface we
stand on is well approximated by a perfectly flat plane.*
The slogan to keep in mind: straight locally, curved globally.
Or think of it like this. You are streaking downward toward the circle
as from a great height. At first you can see the whole thing:
Then just one segment of arc:
And a still smaller segment:
Until, zooming in, and zooming in, what you see is pretty much
indistinguishable from a line. An ant on the circle, aware only of his
own tiny immediate surroundings, would think he was on a straight line,
just as a person on the surface of the earth (unless she is clever
enough to watch objects crest the horizon as they approach from afar)
feels like she’s standing on a plane.
THE PAGE WHERE I TEACH YOU CALCULUS
I will now teach you calculus. Ready? The idea, for which we have Isaac
Newton to thank, is that there’s nothing special about a perfect
circle. Every smooth curve, when you zoom in enough, looks just
like a line. Doesn’t matter how winding or snarled it
is—just that it doesn’t have any sharp corners.
When you fire a missile*, its path looks like this:
The missile goes up, then down, in a parabolic arc. Gravity makes all
motion curve toward the earth; that’s among the fundamental facts
of our physical life. But if we zoom in on a very short segment, the
curve starts to look like this:
And then like this:
Just like the circle, the missile’s path looks to the naked eye
like a straight line, progressing upward at an angle. The deviation from
straightness caused by gravity is too small to see—but it’s
still there, of course. Zooming in to an even smaller region of the
curve makes the curve even more like a straight line. Closer and
straighter, closer and straighter . . .
Now here’s the conceptual leap. Newton said, look, let’s go
all the way. Reduce your field of view until it’s
infinitesimal—so small that it’s smaller than any
size you can name, but not zero. You’re studying the
missile’s arc, not over a very short time interval, but at a
single moment. What was almost a line becomes exactly a
line. And the slope of this line is what Newton called the
fluxion, and what we’d now call the derivative.
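To watch the “closer and straighter” business happen in numbers, here is a rough sketch. I’m assuming a simplified missile fired straight up, so that its height over time is a parabola; the launch speed of 50 meters per second and the names `height` and `secant_slope` are made up for the example. The code measures the slope of the straight line joining two nearby points on the curve, then shrinks the gap between them.

```python
G = 9.8    # gravitational acceleration in m/s^2 (assumed for illustration)
V0 = 50.0  # launch speed in m/s, straight up (hypothetical value)

def height(t):
    """Height of the missile t seconds after launch, ignoring air resistance."""
    return V0 * t - 0.5 * G * t * t

def secant_slope(f, t, dt):
    """Slope of the straight line through the curve at times t and t + dt."""
    return (f(t + dt) - f(t)) / dt

# Zoom in on the moment t = 1 second with ever shorter time intervals.
for dt in [1.0, 0.1, 0.01, 0.001, 0.0001]:
    print(f"interval {dt:<7} slope {secant_slope(height, 1.0, dt):.5f}")
```

As the gap shrinks, the slopes settle toward 40.2, which is the launch speed minus 9.8 times one second. The code never reaches an infinitesimal gap; it only gets a smaller and smaller finite one.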
That’s a kind of jump Archimedes wasn’t willing to make. He
understood that polygons with shorter sides got closer and closer to the
circle—but he would never have said that the circle actually
was a polygon with infinitely many infinitely short sides.
Some of Newton’s contemporaries, too, were reluctant to go along
for the ride. The most famous objector was George Berkeley, who
denounced Newton’s infinitesimals in a tone of high mockery sadly
absent from current mathematical literature: “And what are these
fluxions? The velocities of evanescent increments. And what are these
same evanescent increments? They are neither finite quantities, nor
quantities infinitely small, nor yet nothing. May we not call them the
ghosts of departed quantities?”
And yet calculus works. If you swing a rock in a loop around your
head and suddenly release it, it’ll shoot off along a linear
trajectory at constant speed,* exactly in the direction that calculus
says the rock is moving at the precise moment you let go. That’s
yet another Newtonian insight; objects in motion tend to proceed in a
straight-line path, unless some other force intercedes to nudge the
object one way or the other. That’s one reason linear thinking
comes so naturally to us: our intuition about time and motion is formed
by the phenomena we observe in the world. Even before Newton codified
his laws, something in us knew that things like to move in straight
lines, unless given a reason to do otherwise.
EVANESCENT INCREMENTS AND UNNECESSARY PERPLEXITIES
Newton’s critics had a point; his construction of the derivative
didn’t amount to what we’d call rigorous mathematics
nowadays. The problem is the notion of the infinitely small, which was a
slightly embarrassing sticking point for mathematicians for thousands of
years. The trouble started with Zeno, a fifth-century-BCE Greek
philosopher of the Eleatic school who specialized in asking
innocent-seeming questions about the physical world that inevitably
blossomed into huge philosophical brouhahas.
His most famous paradox goes like this. I decide to walk to the ice
cream store. Now certainly I can’t get to the ice cream store
until I’ve gone halfway there. And once I’ve gone halfway, I
can’t get to the store until I’ve gone half the distance
that remains. Having done so, I still have to cover half the remaining
distance. And so on, and so on. I may get closer and closer to the ice
cream store—but no matter how many steps of this process I
undergo, I never actually reach the ice cream store. I am always
some tiny but nonzero distance away from my two scoops with jimmies.
Thus, Zeno concludes, to walk to the ice cream store is impossible. The
argument works just as well for any destination: it’s equally
impossible to walk across the street, or to take a single step, or to
wave your hand. All motion is ruled out.
Diogenes the Cynic was said to have refuted Zeno’s argument by
standing up and walking across the room. Which is a pretty good argument
that motion is actually possible; so something must be wrong with
Zeno’s argument. But where’s the mistake?
Break down the trip to the store numerically. First you go halfway. Then
you go half of the remaining distance, which is 1/4 of the total
distance, and you’ve got 1/4 left to go. So half of what’s
left is 1/8, then 1/16, then 1/32. Your progress toward the store looks like this:
1/2 + 1/4 + 1/8 + 1/16 + 1/32 + . . .
If you add up ten terms of this sequence you get about 0.999. If you add
up twenty terms it’s more like 0.999999. In other words, you are
getting really, really, really close to the store. But no matter how
many terms you add, you never get to 1.
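You can check these numbers yourself. The following sketch uses Python’s exact `Fraction` type, so no rounding muddies the picture; the function name `zeno_partial_sum` is just my label.

```python
from fractions import Fraction

def zeno_partial_sum(n):
    """Exact sum of the first n terms of 1/2 + 1/4 + 1/8 + ..."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

for n in (1, 2, 3, 10, 20):
    s = zeno_partial_sum(n)
    # 1 - s is the distance still separating you from the store.
    print(f"{n:>2} terms: {float(s):.8f}   still to go: {1 - s}")
```

After n terms the gap to the store is exactly 1 divided by 2 to the nth power: tiny, but never zero.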
Zeno’s paradox is much like another conundrum: is the repeating
decimal 0.99999 . . . equal to 1?
I have seen people come nearly to blows over this question.* It’s
hotly disputed on websites ranging from World of Warcraft fan pages to
Ayn Rand forums. Our natural feeling about Zeno is “of course you
eventually get your ice cream.” But in this case, intuition points
the other way. Most people, if you press them, say
0.9999 . . . doesn’t equal 1. It doesn’t
look like 1, that’s for sure. It looks smaller. But not
much smaller! Like Zeno’s hungry ice cream lover, it gets closer
and closer to its goal, but never, it seems, quite makes it there.
And yet, math teachers everywhere, myself included, will tell them,
“No, it’s 1.”
How do I convince someone to come over to my side? One good trick is to
argue as follows. Everyone knows that
0.33333 . . . = 1/3.
Multiply both sides by 3 and you’ll see
0.99999 . . . = 3/3 = 1.
If that doesn’t sway you, try multiplying
0.99999 . . . by 10, which is just a matter of moving the
decimal point one spot to the right.
10 × (0.99999 . . .)
= 9.99999 . . .
Now subtract the vexing decimal from both sides:
10 × (0.99999 . . .) − 1 × (0.99999 . . .) = 9.99999 . . . − 0.99999 . . .
The left-hand side of the equation is just 9
× (0.99999 . . .), because 10 times something
minus that something is 9 times the aforementioned thing. And over on
the right-hand side, we have managed to cancel out the terrible infinite
decimal, and are left with a simple 9. So we end up with
9 × (0.99999 . . .) = 9.
If 9 times something is 9, that something just has to be 1.
These arguments are often enough to win people over. But let’s be
honest: they lack something. They don’t really address the anxious
uncertainty induced by the claim 0.99999 . . . = 1;
instead, they represent a kind of algebraic intimidation. “You
believe that 1/3 is 0.3 repeating, don’t you? Don’t you?”
Or worse: maybe you bought my argument based on multiplication by 10.
But how about this one? What is
1 + 2 + 4 + 8 + 16 + . . . ?
Here the “. . .” means “carry on the
sum forever, adding twice as much each time.” Surely such a sum
must be infinite! But an argument much like the apparently correct one
concerning 0.9999 . . . seems to suggest otherwise.
Multiply the sum above by 2 and you get
2 × (1 + 2 + 4 + 8 + 16 + . . .) = 2 + 4 + 8 + 16 + . . .
which looks a lot like the original sum; indeed, it is just the original
sum (1 + 2 + 4 + 8 + 16 + . . .) with the 1 lopped off
the beginning, which means that 2 × (1 + 2 + 4 + 8 + 16
+ . . .) is 1 less than (1 + 2 + 4 + 8 +
16 + . . .). In other words,
2 × (1 + 2 + 4 + 8 + 16 + . . .) − 1 × (1 + 2 + 4 + 8 + 16 + . . .) = −1.
But the left-hand side simplifies to the very sum we started with, and
we’re left with
1 + 2 + 4 + 8 + 16 + . . . = −1.
Is that what you want to believe?* That adding bigger and bigger
numbers, ad infinitum, flops you over into negativeland?
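Before swallowing that, it’s worth trying the same trick on a sum that actually stops. The sketch below (my own illustration, with a made-up function name) adds up the first n terms and compares twice the sum with “the sum with the 1 lopped off.”

```python
def doubling_partial_sum(n):
    """Sum of the first n terms of 1 + 2 + 4 + 8 + ..."""
    return sum(2 ** k for k in range(n))

for n in (1, 2, 5, 10, 20):
    s = doubling_partial_sum(n)
    # The slick argument treats 2*s as if it were s - 1.
    # For a finite sum, the two differ by this leftover amount.
    leftover = 2 * s - (s - 1)
    print(f"{n:>2} terms: sum = {s}, leftover = {leftover}")
```

For a sum of n terms, doubling and lopping don’t match: the difference is 2 to the nth power, the next term in line, which the infinite version of the argument quietly sweeps off the end of the page.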
More craziness: What is the value of the infinite sum
1 − 1 + 1 − 1 + 1 − 1 + . . .
One might first observe that the sum is
(1 − 1) + (1 − 1) + (1 − 1) + . . .
= 0 + 0 + 0 + . . .
and argue that the sum of a bunch of zeroes, even infinitely many, has
to be 0. On the other hand, 1 − 1 + 1 is the same thing as 1
− (1 − 1), because the negative of a negative is a positive;
applying this fact again and again, we can rewrite the sum as
1 − (1 − 1) − (1 − 1) − (1 − 1) . . .
= 1 − 0 − 0 − 0 . . .
which seems to demand, in the same way, that the sum is equal to 1! So
which is it, 0 or 1? Or is it somehow 0 half the time and 1 half the
time? It seems to depend where you stop, but infinite sums never stop!
Don’t decide yet, because it gets worse. Suppose T is the value of
our mystery sum:
T = 1 − 1 + 1 − 1 + 1 − 1 + . . .
Taking the negative of both sides gives you
−T = −1 + 1 − 1 + 1 . . .
But the sum on the right-hand side is precisely what you get if you take
the original sum defining T and lop off that first 1, thus subtracting
1; in other words,
−T = −1 + 1 − 1 + 1 . . .
= T − 1.
So −T = T − 1, an equation concerning T which is
satisfied only when T is equal to 1/2. Can a sum of infinitely many
whole numbers somehow magically become a fraction? If you say no, you
have the right to be at least a little suspicious of slick arguments
like this one. But note that some people said yes, including the Italian
mathematician/priest Guido Grandi, after whom the series 1 − 1 + 1
− 1 + 1 − 1 + . . . is usually named; in a
1703 paper, he argued that the sum of the series is 1/2, and moreover
that this miraculous conclusion represented the creation of the universe
from nothing. (Don’t worry, I don’t follow that last step
either.) Other leading mathematicians of the time, like Leibniz and
Euler, were on board with Grandi’s strange computation, if not his
interpretation.
But in fact, the answer to the 0.999 . . . riddle (and to
Zeno’s paradox, and to Grandi’s series) lies a little
deeper. You don’t have to give in to my algebraic strong-arming.
You might, for instance, insist that 0.999 . . . is not
equal to 1, but rather 1 minus some tiny infinitesimal number. And, for
that matter, you might further insist that 0.333 . . . is
not exactly equal to 1/3, but also falls short by an
infinitesimal quantity. This point of view requires some stamina to push
through to completion, but it can be done. I once had a calculus student
named Brian who, unhappy with the classroom definitions, worked out a
fair chunk of the theory by himself, referring to his infinitesimal
quantities as “Brian numbers.”
Brian was not actually the first to get there. There’s a whole
field of mathematics that specializes in contemplating numbers of this
kind, called nonstandard analysis. The theory, developed by
Abraham Robinson in the mid-twentieth century, finally made sense of the
“evanescent increments” that Berkeley found so ridiculous.
The price you have to pay (or, from another point of view, the reward
you get to reap) is a profusion of novel kinds of numbers; not only
infinitely small ones, but infinitely large ones, a huge spray of them
in all shapes and sizes.*
As it happened, Brian was in luck—I had a colleague at Princeton,
Edward Nelson, who was an expert in nonstandard analysis. I set up a
meeting for the two of them so Brian could learn more about it. The
meeting, Ed told me later, didn’t go well. As soon as Ed made it
clear that infinitesimal quantities were not in fact going to be called
Brian numbers, Brian lost all interest.
(Moral lesson: people who go into mathematics for fame and glory
don’t stay in mathematics for long.)
But we’re no closer to settling our dispute. What is
0.999 . . . , really? Is it 1? Or is it some
number infinitesimally less than 1, a crazy kind of number that
hadn’t even been discovered a hundred years ago?
The right answer is to unask the question. What is
0.999 . . . , really? It appears to refer to a kind of sum:
.9 + .09 + .009 + .0009 + . . .
But what does that mean? That pesky ellipsis is the real problem. There
can be no controversy about what it means to add up two, or three, or a
hundred numbers. This is just mathematical notation for a physical
process we understand very well: take a hundred heaps of stuff, mush
them together, see how much you have. But infinitely many? That’s
a different story. In the real world, you can never have infinitely many
heaps. What’s the numerical value of an infinite sum? It
doesn’t have one—until we give it one. That was the
great innovation of Augustin-Louis Cauchy, who introduced the notion of
limit into calculus in the 1820s.*
The British number theorist G. H. Hardy, in his 1949 book Divergent
Series, explains it best:
It does not occur to a modern mathematician that a collection of
mathematical symbols should have a “meaning” until one has
been assigned to it by definition. It was not a triviality even to the
greatest mathematicians of the eighteenth century. They had not the
habit of definition: it was not natural to them to say, in so many
words, “by X we mean Y.” . . . It is
broadly true to say that mathematicians before Cauchy asked not,
“How shall we define 1 − 1 + 1 − 1
+ . . .” but “What is 1 − 1 + 1
− 1 + . . . ?” and that this habit of
mind led them into unnecessary perplexities and controversies which were
often really verbal.
This is not just loosey-goosey mathematical relativism. Just because we
can assign whatever meaning we like to a string of mathematical
symbols doesn’t mean we should. In math, as in life, there are
good choices and there are bad ones. In the mathematical context, the
good choices are the ones that settle unnecessary perplexities without
creating new ones.
The sum .9 + .09 + .009 + . . . gets
closer and closer to 1 the more terms you add. And it never gets any
farther away. No matter how tight a cordon we draw around the number 1,
the sum will eventually, after some finite number of steps, penetrate
it, and never leave. Under those circumstances, Cauchy said, we should
simply define the value of the infinite sum to be 1. And then he
worked very hard to prove that committing oneself to his definition
didn’t cause horrible contradictions to pop up elsewhere. By the
time this labor was done, he’d constructed a framework that made
Newton’s calculus completely rigorous. When we say a curve looks
locally like a straight line at a certain angle, we now mean more or
less this: as you zoom in tighter and tighter, the curve resembles the
given line more and more closely. In Cauchy’s formulation,
there’s no need to mention infinitely small numbers, or anything
else that would make a skeptic blanch.
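Here is one way to make the cordon concrete, as a sketch rather than Cauchy’s own formulation. Pick any width you like for the cordon; the hypothetical function `steps_to_enter` reports how many terms of .9 + .09 + .009 + . . . it takes to get inside.

```python
from fractions import Fraction

def nines_partial_sum(n):
    """Exact sum of the first n terms of .9 + .09 + .009 + ..."""
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))

def steps_to_enter(cordon):
    """Smallest number of terms whose sum lies strictly within `cordon` of 1."""
    n = 1
    while 1 - nines_partial_sum(n) >= cordon:
        n += 1
    return n

for cordon in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10 ** 9)):
    print(f"cordon of width {cordon}: inside after {steps_to_enter(cordon)} terms")
```

A cordon of width 1/1000 is breached after four terms, and since each further term only shrinks the gap, the sum never wanders back out.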
Of course there is a cost. The reason the 0.999 . . .
problem is difficult is that it brings our intuitions into conflict. We
would like the sum of an infinite series to play nicely with arithmetic
manipulations like the ones we carried out on the previous pages, and
this seems to demand that the sum equal 1. On the other hand, we would
like each number to be represented by a unique string of decimal digits,
which conflicts with the claim that the same number can be called either
1 or 0.999 . . . , as we like. We can’t hold on to
both of these desires at once; one must be discarded. In Cauchy’s
approach, which has amply proved its worth in the two centuries since he
invented it, it’s the uniqueness of the decimal expansion that
goes out the window. We’re untroubled by the fact that the English
language sometimes uses two different strings of letters (i.e., two
words) to refer synonymously to the same thing in the world; in the same
way, it’s not so bad that two different strings of digits can
refer to the same number.
As for Grandi’s 1 − 1 + 1 − 1 + . . . ,
it is one of the series outside the reach of Cauchy’s theory: that
is, one of the divergent series that formed the subject of
Hardy’s book. The Norwegian mathematician Niels Henrik Abel, an
early fan of Cauchy’s approach, wrote in 1828, “Divergent
series are the invention of the devil, and it is shameful to base on
them any demonstration whatsoever.”* Hardy’s view, which is
our view today, is more forgiving; there are some divergent series to
which we ought to assign values and some to which we ought not, and some
to which we ought or ought not depending on the context in which the
series arises. Modern mathematicians would say that if we are to assign
the Grandi series a value, it should be 1/2, because, as it turns out,
all interesting theories of infinite sums either give it the value 1/2
or decline, like Cauchy’s theory, to give it any value at all.*
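One such theory, which the text above doesn’t name, is Cesàro summation: instead of asking where the running totals end up, you ask where their averages end up. The sketch below is my own illustration with made-up function names.

```python
from fractions import Fraction

def grandi_partial_sums(n):
    """First n running totals of 1 - 1 + 1 - 1 + ..."""
    sums, total = [], 0
    for k in range(n):
        total += 1 if k % 2 == 0 else -1
        sums.append(total)
    return sums

def cesaro_mean(n):
    """Average of the first n running totals, as an exact fraction."""
    return Fraction(sum(grandi_partial_sums(n)), n)

print("running totals:", grandi_partial_sums(8))
for n in (1, 2, 3, 11, 101, 1001):
    print(f"average of first {n} totals: {cesaro_mean(n)}")
```

The running totals flip between 1 and 0 forever, but their averages close in on 1/2.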
To write Cauchy’s definitions down precisely takes a bit more
work. This was especially true for Cauchy himself, who had not quite
phrased the ideas in their clean, modern form.* (In mathematics, you
very seldom get the clearest account of an idea from the person who
invented it.) Cauchy was an unwavering conservative and a royalist, but
in his mathematics he was proudly revolutionary and a scourge to
academic authority. Once he understood how to do things without the
dangerous infinitesimals, he unilaterally rewrote his syllabus at the
École Polytechnique to reflect his new ideas. This enraged everyone
around him: his mystified students, who had signed up for freshman
calculus, not a seminar on cutting-edge pure mathematics; his
colleagues, who felt that the engineering students at the École had
no need for Cauchy’s level of rigor; and the administrators, whose
commands to stick to the official course outline he completely ignored.
The École imposed a new curriculum from above that emphasized the
traditional infinitesimal approach to calculus, and placed note takers
in Cauchy’s classroom to make sure he complied. Cauchy did not
comply. Cauchy was not interested in the needs of engineers. Cauchy was
interested in the truth.
It’s hard to defend Cauchy’s stance on pedagogical grounds.
But I’m sympathetic with him anyway. One of the great joys of
mathematics is the incontrovertible feeling that you’ve understood
something the right way, all the way down to the bottom; it’s a
feeling I haven’t experienced in any other sphere of mental life.
And when you know how to do something the right way, it’s
hard—for some stubborn people, impossible—to make yourself
explain it the wrong way.
EVERYONE IS OBESE
The stand-up comic Eugene Mirman tells this joke about statistics. He
says he likes to tell people, “I read that 100% of Americans were Asian.”
“But Eugene,” his confused companion protests,
“you’re not Asian.”
And the punch line, delivered with magnificent self-assurance: “I
read that I was!”
I thought of Mirman’s joke when I encountered a paper in the
journal Obesity whose title posed the discomfiting question:
“Will all Americans become overweight or obese?” As if the
rhetorical question weren’t enough, the article supplies an
answer: “Yes—by 2048.”
In 2048 I’ll be seventy-seven years old, and I hope not to be
overweight. But I read I would be!
The Obesity paper got plenty of press, as you might imagine. ABC
News warned of an “obesity apocalypse.” The Long Beach
Press-Telegram went with the simple headline “We’re
Getting Fatter.” The study’s results resonated with the
latest manifestation of the fevered, ever-shifting anxiety with which
Americans have always contemplated our national moral status. Before I
was born, boys grew long hair and thus we were bound to get whipped by
the Communists. When I was a kid, we played arcade games too much, which
left us doomed to be outcompeted by the industrious Japanese. Now we eat
too much fast food, and we’re all going to die weak and immobile,
surrounded by empty chicken buckets, puddled into the couches from which
we long ago became unable to hoist ourselves. The paper certified this
anxiety as a fact proved by science.
I have some good news. We’re not all going to be overweight in the
year 2048. Why? Because not every curve is a line.
Excerpted from "How Not to Be Wrong: The Power of Mathematical Thinking" by Jordan Ellenberg. Copyright © 2015 by Jordan Ellenberg. Excerpted by permission. All rights reserved.