A Frequentist Does This, A Bayesian That
March 13, 2004
Book Review
Persi Diaconis
Probability Theory: The Logic of Science. By E.T. Jaynes, Cambridge University Press, Cambridge, UK, 2003, 758 pages, $60.
At last: Ed Jaynes's book on Bayesian statistics has seen the light of day. This is not an ordinary text. It is an unabashed, hard sell of the Bayesian approach to statistics. It is wonderfully down to earth, with hundreds of telling examples. Everyone who is interested in the problems or applications of statistics should have a serious look.
I've been studying mimeographed and electronic versions of Jaynes's book for more than 30 years. As a beginning graduate student, I attended a "foundations" course. I wanted to understand the difference between frequentist and Bayesian statistics. My professor gave me a long philosophical answer that I didn't understand. I wanted "here's a simple problem; a frequentist does this, a Bayesian that." A few weeks into the term, I discovered Jaynes's manuscript. He gives wonderful examples and never fails to point out how silly the frequentist position can be.
Jaynes's book could be used as a textbook for a course on Bayesian statistics at the advanced undergraduate or beginning graduate level. Among many other things, it derives the probability calculus using Richard Cox's appealing axioms, and it treats estimation and hypothesis testing in both introductory and more advanced ways. There is some real data and much sensible analysis of many examples, usually drawn from science and engineering. Two features are a substantial treatment of time series and a chapter on communication theory.
The most interesting part of the book is its constant focus on foundational aspects. We see far too little of this in our teaching and even less in applications. Jaynes doesn't let us get away without thinking. There are sermons on reality versus models, a whole chapter on paradoxes of probability theory, and another on principles and pathologies of orthodox statistics. He is always questioning (and answering his own questions): What does it all mean? Does this make sense?
The book is not mathematically sophisticated. Indeed, Jaynes---trained as a physicist---has little use for mathematical nitpicking. He doesn't do this quietly, but rather is constantly on the attack (Appendix B is an entertaining 15-page dose of this). None of this leads him to foolishness; indeed, I see a strong trend against measure theory in modern statistics departments: I had to fight to keep the measure theory requirement in Stanford's statistics graduate program. The fight was lost at Berkeley.
Jaynes is celebrated for his maximum entropy approach to the selection of prior distributions. A spirited development of this approach here (in Chapter 11) is coupled with an interesting presentation of group invariance and ignorance priors. Jaynes once attended a one-week course of lectures I gave on exchangeability. At the time, he had become a sort of guru to believers in maximum entropy. I drew him out on the issue because I think having "automatic priors" can keep people from usefully quantifying what they actually know. Jaynes agreed: "You should build what you know into the prior and save max ent for filling in at the end," he said. In this and other discussions, I found him open and interesting. Some others do not find him so. Jaynes gets in some final digs in his long-running debate with Dawid-Stone-Zidek in Chapter 15 (and elsewhere). They accused him of using silly improper priors by not knowing his left from his right Haar measure; he doesn't take it lying down. William Feller poked fun at Bayesian arguments in his celebrated book; Jaynes takes many pokes back.
One of my favorite chapters is on the physics of "random experiments." There, Jaynes carefully analyzes basic examples, such as tossing a coin, using an inspired mix of physics, Bayes, and data. I am just finishing a paper on a careful analysis of coin tossing. I am happy to report that Jaynes scooped me on several points.
The book is wonderfully out of date. The wonderful part is that Jaynes discusses and points to dozens of papers from the 1950s through the 1980s that have slipped off the map. Of course, Bayesian statistics has had an enormous growth spurt in the past 20 years. The reader will find no mention of the Gibbs sampler, nor of particle filters or any of the modern tools of Bayesian statistics. Jaynes's book was developed before the computer revolution, when Bayesian programs like Bugs or Mr. Bayes weren't even a dream. This is far from a disaster. First, it is easy to access the new Bayesian methodology. A few good recent books are listed in the references. Using them as a guide, readers can find all they need. Second, Jaynes's focus on basic philosophical issues is timeless and much needed.
We owe much to the editor, Larry Bretthorst, for cobbling together the bulk of Jaynes's manuscript with the many bits and pieces he intended to add. Things end abruptly in a few places (e.g., on page 530), but I didn't find this at all distracting.
The book is long (727 pages) but truly ever engaging. Part of this is due to Jaynes's catty style, making fun, calling names, telling it as he feels it. Nowhere is this more in evidence than in the long annotated bibliography. Jaynes is known as an objective Bayesian, but he has surely written the field's most subjective account.
There are many places in which I want to yell at him. He's so full of himself. That's what makes the book so terrific. It's the real thing---the best introduction to Bayesian statistics that I know. Go take a look for yourself.
Annotated Bibliography
I list below some key references for the modern developments not covered by Jaynes. All of these books include extensive literature surveys.
- J. Bernardo and A. Smith, Bayesian Theory, Wiley, New York, 1994. A clear, solid development of classical Bayesian statistics.
- P. Diaconis and S. Holmes, A Bayesian Peek into Feller, Vol. I., Sankhya A 64, 2002, 820-841. A route between Feller and Jaynes.
- A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice, Springer, New York, 2001. The main topic of this book is particle filters, a computational toolbox that is crucial in engineering applications, for image and motion analysis, and much else. Along with the Gibbs sampler, this is a most important practical development in Bayesian methodology.
- J. Gill, Bayesian Methods: A Social and Behavioral Sciences Approach, Chapman and Hall, Boca Raton, Florida, 2002. A surprisingly thorough review written by a user of Bayesian statistics, with applications drawn from the social sciences.
- J.K. Ghosh and R.V. Ramamoorthi, Bayesian Nonparametrics, Springer, New York, 2003. A clear, self-contained development of this important topic.
- J. Liu, Monte Carlo Strategies in Scientific Computing, Springer, New York, 2001. Quite simply, the best recent book on Markov chain Monte Carlo and other linchpins of modern Bayesian statistics.
- M. Schervish, Theory of Statistics, Springer, New York, 1995. Treats the many mathematical topics that Jaynes glosses over. Perhaps the best single-volume treatment of mathematical statistics.
Persi Diaconis is a professor of mathematics and statistics at Stanford University.