SIAM: Massively Collaborative Mathematics

Massively Collaborative Mathematics

April 1, 2010

Julie Rehmeyer

Timothy Gowers's blog long ago became one of the most read in mathematics. He's used the blog in a near endless variety of ways: to discuss his mathematical ideas, to give informal descriptions of his papers, to share successful pedagogical strategies, to keep people informed on the progress of his book, to play with mathematical takes on real-world phenomena, and more. When he ruminated on the value of a wiki that would gather descriptions of useful mathematical tricks---a tricki!---a reader offered him the technical assistance that led to realization of the idea (tricki.org).

Gowers, a Cambridge University mathematician and a Fields Medalist (1998), recently took his blog a step further. Rather than using it just to disseminate mathematical ideas, he has made it a forum for mathematical research itself.

His first experiment in this direction succeeded way beyond his expectations. More than two dozen mathematicians came together on the blog and found a direct, combinatorial proof of the density Hales�Jewett theorem over the course of a mere six weeks.

"It felt like the difference between driving a car and pushing it," Gowers says. "I'm absolutely sure from my experience with one problem that this very new method of research has the potential to do great things."

Proof by "Super-mathematician"
The story began a year ago, when Gowers proposed what he called a "polymath project," inviting anyone interested to work with him openly, online, on the density Hales�Jewett (DHJ) theorem. To make the invitation more enticing, he seeded the conversation by offering up a line of attack on the problem. He hadn't yet managed to make the approach work on his own, but he thought that a group working together might be successful. His initial goal was modest: Rather than aiming to prove the full density Hales�Jewett theorem, he set out to determine whether his approach could be made to work for one special case.

He chose the suggested problem carefully. For one thing, he was pretty sure that a proof could be found. Indeed, a proof of DHJ already had been found--but it used complex machinery from ergodic theory. Gowers was after a direct, combinatorial proof, and because such proofs had been found for closely related theorems, it seemed likely that DHJ had one as well.

DHJ also had the advantage of being important and of interest to a wide variety of mathematicians and theoretical computer scientists. DHJ addresses this question: How many squares would you have to remove from a multidimensional tic-tac-toe board of arbitrary size to make it impossible for either player to win? In ordinary tic-tac-toe (a 3 � 3 grid in two dimensions), removing the three squares on a diagonal would suffice to block all possible lines that are three squares long. The density Hales�Jewett theorem says that as the number of dimensions goes to infinity, the proportion of the squares that have to be removed gets closer and closer to 1.

The theorem is actually stronger than this, because it says that the same result is true even if you decide to block only lines of a special kind, called "combinatorial lines." (Moving from one point to the next in an ordinary geometric line, each coordinate goes monotonically upward, monotonically downward, or stays the same; the three points (1,3), (2,2), and (3,1), for instance, form a geometric line. In a combinatorial line, on the other hand, coordinates that aren't constant must all change in the same direction: either all upward or all downward. The above geometric line isn't combinatorial, because the first coordinate for each point goes up and the second goes down.)

The special case that Gowers settled on as an initial goal is the first challenging case of DHJ---a grid of size three in any number of dimensions.

Described as a result about tic-tac-toe, DHJ may sound insignificant, but it has widespread importance in math and theoretical computer science. The result implies Szemer�di's theorem, a well-known foundational result in combinatorics that has been used in many fields. DHJ is part of Ramsey theory and additive combinatorics, which are currently being exploited in computer science. In addition, DHJ itself almost immediately implies a solution to the "parallel repetition problem" in computer science.

There was one more reason to think that a combinatorial proof of DHJ would be important: Szemer�di's theorem itself has a number of different proofs, and the discovery of each one opened up productive new veins of research. The special case of 3 � 3 grids initially proposed by Gowers, however, wasn't enough on its own to imply Szemer�di's theorem. Still, a new approach to the problem was likely to yield useful insights.

Gowers laid out ground rules at the beginning: Comments, he said, should be short, expressing a single idea or just the germ of an idea; in fact, such incomplete thoughts were particularly welcome. Comments were to be expressed as clearly as possible. Technical work was to be delayed until clearly necessary. As a rule of thumb, Gowers suggested avoiding the kind of comment that requires thinking with a piece of paper at hand.

With these rules, Gowers was codifying lessons he had learned about the most productive approach to any mathematical research, whether collaborative or solo; he was also attempting to engineer a particular kind of conversation. "The ideal outcome would be a solution of the problem with no single individual having to think all that hard," Gowers wrote in his blog. "The hard thought would be done by a sort of super-mathematician whose brain is distributed amongst bits of the brains of lots of interlinked people." A comment full of loose ends that others could grab onto was preferable to a solitary polished gem.

Expectations, Met and Unmet
Within a day, comments started coming in; Terence Tao of UCLA, also a Fields Medalist (2006), was among the early contributors. Within a week, about two dozen people were participating and the conversation had spread to Tao's blog and to a wiki. Within six weeks, more than 800 comments had been posted on the blog---and the contributors had produced a simple, beautiful, combinatorial proof.

Even more remarkably, the proof generalized beyond the special case of 3 � 3 grids to cover the entire theorem, and thus to provide a new proof of Szemer�di's theorem in the process. "That was a huge surprise," Gowers says, because in closely related problems, the case of 4 � 4 or larger grids was far more difficult.

As is usual in mathematical research, at the beginning people threw out many more questions than answers. Little by little, some key questions were answered and the project gained momentum. "It was extremely enjoyable, but somewhat addictive," Gowers says. "I had a feeling that it became a spectator sport. Just before going to bed, I might have some ideas and put them on the blog with some questions, and then when I woke up, someone in the US might have come up with some answers."

Ryan O'Donnell, a theoretical computer scientist at Carnegie Mellon University, got hooked as well. "It was very hectic," he says. "I had to think for four or five hours a day to come up with some small idea or question to ask. I thought about it every day. Eventually, after it was over, Gowers said that he spent a lot of time too, so then I felt better."

Other participants also reported working hard, and Gowers admits, "I'm not sure I believe in the original idea of a super-brain." Nevertheless, the exhilarating speed with which they produced a result points to some kind of emergent phenomenon. The speed was partly a result of the number and variety of the ideas that were produced. In addition, the structure of the process encouraged efficiency. With solitary research or small collaborations, "it's quite tempting to go into great detail exploring an idea," Gowers says. "Then you find out that the calculations don't work out, and then afterwards you realize that there was a reason it didn't work. With lots of people, someone may see the reason it's not going to work very quickly and spare you the time."

Furthermore, Gowers says, "I have found that the mere effort of writing a blog post about a particular problem often stimulates me to have new ideas. It's almost as though I could just pretend that I'm going to have a polymath project and write to myself. I wouldn't expect a �monomath' project to be as good as a polymath one, but it could still be better than working entirely in private."

Gowers had originally hoped that dozens of mathematicians would participate, and indeed, over the course of the project, 27 people contributed comments. But once a productive approach had been found, the group narrowed to a core of about six people, many fewer than Gowers had hoped for. Participation required understanding the big picture of the proof, and acquiring that understanding took a sustained effort that only a few were willing to muster. Boris Bukh of Cambridge, for example, made important contributions early on, but after a few days, he stopped. "It was taking quite a bit of my time," he says, "and I felt I was duplicating other people's efforts."

Open-source Software as Model
Gowers believes that it may be possible to structure a future polymath in a way that lessens this requirement. He points, for example, to open-source software, which is organized so that people can contribute to specific pieces of a project without grasping its entirety. Mathematical proofs often have subgoals that could be tackled separately, but the linear structure of a blog makes it hard for newcomers to identify those manageable pieces.

Gowers believes that it might be possible to organize a wiki-like document, a "proof-discovery tree," that would investigate all reasonable approaches to the main problem and arrange them in a natural hierarchical way. The main page would discuss the very general questions and strategy of the proof, and then subpages would address questions raised by the main page, sub-subpages would address questions raised by these, and so forth. A branch of the tree of webpages would end when a question was definitively answered. Gowers thinks it might be possible for many more people to get involved by jumping in on one of the subpages.

Furthermore, such an approach might make Gowers's original image of a "super-mathematician" a more practical possibility. "My ultimate fantasy," he wrote in his blog, "is that it might be possible to solve a problem without anybody taking a global view: Everybody would concentrate on local questions, and at some point the proof-discovery tree would become a successful one. Success would be a purely formal property that one could verify automatically."

A big advantage of open online collaboration---whether done with a proof-discovery tree or not---is that discoveries that turn out not to be essential to the main thrust of the proof are still recorded. "In order to publish a decent paper, you need a bigger insight," Bukh says. "In this case, even small insights are preserved." And even though an online conversation is inevitably far looser than a published paper would be, Bukh believes that if mathematicians are interested in the question a polymath is addressing, then "absolutely, they'll slog through blog posts. There are so many papers that are written far worse even than these postings. It's still much easier to read what other people have done than to do it on your own."

One other aspect of the experience may have fallen short of expectation: Ironically, the project's proof of DHJ (and hence Szemer�di's theorem) is so beautiful and complete that it may not open up new avenues of research, as other new proofs have done. "Sometimes a result is so satisfactory on its own that it kind of finishes a story," Gowers says. The team has almost finished writing up the proof. The paper will be published under the name "D.H.J. Polymath," with a link to the project's wiki page.

Since the DHJ project ended (it was later christened "Polymath 1"), several similar projects have been attempted. Terry Tao hosted a polymath devoted to the sixth problem in the 2009 International Mathematics Olympiad, and six solutions have been found. A couple of polymaths petered out when the participants ran out of new ideas (the fate, of course, of most mathematical projects). And in early January, Gowers started his second polymath, on the Erd�s discrepancy problem. At first, the group focused on running experiments. "We are spotting patterns that are so clearly there that it's really intriguing," Gowers says. "It's clear that something's going to come out of the investigations." They've recently found significant theoretical results as well.

Polymaths are such a new approach that everyone involved is still uncertain what, precisely, they'll turn out to be good for and where their limitations will be. "My initial point was a mixture of skepticism and enthusiasm," says Gil Kalai of the Hebrew University of Jerusalem, a participant in Polymath 1," and I'm still mixed. But we have to take a note that it was successful, and this is quite a substantial piece of evidence. We should be cautious and skeptical but also excited."

Julie Rehmeyer writes about mathematics and science from Berkeley, California. She is the online math columnist for Science News.

See more news from this issue

Donate · Contact Us · Site Map · Join SIAM · My Account

Massively Collaborative Mathematics

In This Section