Popular (Computer) ScienceMarch 18, 2012
9 Algorithms that Changed the Future: The Ingenious Ideas that Drive Today's Computers. By John MacCormick, Princeton University Press, Princeton, New Jersey, 2012, x + 219 pages, $27.95.
Despite the widespread popular interest in computers, there are very few good, popular introductions to the central ideas of computer science. 9 Algorithms that Changed the Future is certainly one of the best that I have seen. MacCormick presents the central ideas behind eight algorithms, or categories of algorithms, one per chapter:
1. Indexing for information retrieval
2. PageRank, the ranking method used by Google
3. Error-correcting codes
4. Public-key cryptography
5. Pattern recognition: nearest-neighbor methods, decision trees, and neural networks
6. Data compression
7. Consistency maintenance in databases
8. Digital signatures
A ninth chapter discusses uncomputable problems (this is the "ninth algorithm" of the title). MacCormick concludes with a chapter on the prospects for future "great algorithms"; I did not find this discussion convincing or even thought-provoking.
The algorithms, MacCormick says, were chosen to satisfy three criteria. First, they must be part of programs used daily by the man on the street. Second, they must be addressed to specific tasks rather than generic problems. Third, they must be theory-
oriented rather than hardware-related. A fourth criterion, one presumes, is that their explanation must be accessible to the audience; the fast Fourier transform, for example, satisfies MacCormick's first three criteria at least as well as many of the algorithms he chose, but could hardly be explained in a general-audience book.
The great challenge in a book of this type is to give explanations that, without oversimplification, are understandable to a general audience. MacCormick succeeds brilliantly at this. He lays out clearly the problems to be solved and the ways in which they arise in programs used by the reader every day; then, patiently, with a wealth of beautifully chosen examples and analogies, he builds up the key ingenious ideas behind each of the algorithms, and makes it clear how they work. A reader new to computer science will certainly have to read slowly and to think, but the effort will be well rewarded. When the technical details become too difficult to explain, MacCormick says so; he never tries to "dumb things down."
The book assumes almost no mathematics---only integer arithmetic and modular arithmetic---and almost no computer science---only experiences, such as running search engines and using ATMs, that are part of everyday life in 2012. In a proof that the halting problem is uncomputable, he introduces the idea of a program running on its own text; he spends four pages formulating this idea for the benefit of a reader who has never consciously run a program on any input at all, and has no idea that a program could double as an input.
The only discussion that I thought could be improved was that of PageRank. MacCormick devotes half of that chapter to the "random surfer" stochastic model, an analogy that I have never considered very useful for understanding PageRank. Its only real value is that it allows PageRank to be connected to the large mathematical literature on computing stationary distributions for Markov processes, which is not relevant to the intended reader of this book. A hydraulic model---the pages are reservoirs, the links are pipes, the total outflow from P is proportional to the amount in P and divided evenly among the pipes, the PageRank is the steady state---is just as accurate and avoids the confusing and irrelevant random process. The central idea of real importance is that PageRank was the first great success of "crowd-sourcing."
The book is impressively free of hype. If anything, I thought it was a little too low-key; it could have reasonably included a little more "gee whiz!" stuff. Given the two chapters on search engines, why not include a few of the numbers that indicate the sheer scale of the things (numbers of web pages indexed, of languages dealt with, of user queries per second, of machine cycles per query, of networked PCs per data center, and so forth)? In the chapter on pattern recognition, why not mention the car that drove itself across the country---not, actually, the most difficult technical accomplishment in computer vision, but one of the most astonishing. In the chapter on PageRank, why not trumpet Google's other great triumph of crowd-sourcing, the Google spelling corrector? Let me also mention an omission of a different kind: The Further Reading section for the chapter on uncomputable problems should certainly include Hofstadter's G�del, Escher, Bach.
In one respect the author and the reader have been ill served by the publisher; the quality of some of the illustrations is very poor. Those on pages 117 and 119 illustrating lossy compression of images are printed so badly that one can hardly tell the low-quality images from the high-quality images, and the whole point is lost. Likewise, the screen shots on pages 180 and 181 of the output when JPEG and executable files are opened in Excel are almost unreadable. They are garbage, of course, but the reader should be able to see that they are garbage. I hope that these flaws can be fixed in future printings, or, if that is not practical, that the images can be put on the web, with a URL given in the text. But these are very small blemishes in what is, overall, an extraordinary achievement in the daunting task of presenting computer science for a popular audience.
Ernest Davis is a professor of computer science at the Courant Institute of Mathematical Sciences, NYU.