Friday, July 6, 2012

The Probability that the Higgs Boson Has Been Discovered

Surely everyone has heard of the probable discovery of the Higgs boson. But what does it have to do with forensic science or law? It is a reminder that the "prosecutor's fallacy" is not limited to prosecutors or courtrooms. Reports in the popular press by skilled physicists and science writers trying to explain this impressive discovery are replete with a messy form of the transposition fallacy. Here is an example from an otherwise excellent report by physicist Lawrence Krauss in Slate magazine:
One can in fact quantify the likelihood that the observations are mistaken and that the events are actually background noise mimicking a real signal. Each experiment quotes a likelihood of very close to “5 sigma,” meaning the likelihood that the events were produced by chance is less than one in 3.5 million. Yet in spite of this, the only claim that has been made so far is that the new particle is real and “Higgs-like.”
Likewise, Nature announced "just a 0.00006% probability that the result is due to chance." The New York Times reported that "the likelihood that their signal was a result of a chance fluctuation was less than one chance in 3.5 million, 'five sigma,' which is the gold standard in physics for a discovery," attributing the statement to CERN's physicists.

How is this (mis)reporting related to the transposition fallacy? Well, sigma (σ) stands for standard deviation, and 5σ means 5 standard deviations from the value expected if the measurements were just noise. For a normal distribution, results this extreme or more extreme would be seen in pure noise a small fraction of the time. The tiny figures quoted above are estimates of that fraction. The fraction is the statistician's p-value, P(>5σ | noise), and it is on the order of 10⁻⁶. In plain English (and one bit of Greek), the probability of data of more than 5σ given that they are just noise is on the order of one in a million. So the observations would be very surprising if they were just noise.
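For readers who want to check the arithmetic, here is a minimal sketch of where the "one in 3.5 million" figure comes from. It assumes an exactly normal noise distribution and a one-sided tail; the function name is mine, chosen for illustration, and nothing here reproduces the experiments' actual analysis.

```python
import math

def upper_tail_p_value(n_sigma: float) -> float:
    """One-sided tail probability P(Z > n_sigma) for a standard normal Z."""
    # erfc gives the two-sided tail of a normal variable scaled by sqrt(2),
    # so half of it is the one-sided tail.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

p_one_sided = upper_tail_p_value(5.0)
p_two_sided = 2.0 * p_one_sided

print(f"P(>5 sigma | noise), one-sided: {p_one_sided:.3e}")
print(f"equivalently about 1 in {1.0 / p_one_sided:,.0f}")
print(f"two-sided: {p_two_sided:.3e}")
```

The one-sided value is roughly 2.9 × 10⁻⁷, or about one in 3.5 million. Doubling it for a two-sided tail gives roughly 5.7 × 10⁻⁷, which appears to be the source of the 0.00006% figure. Either way, the number is a statement about how often pure noise would produce such data, not about how likely the data are to be noise.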

But the probability that they actually are noise is an inverse probability, P(noise | data). That probability depends on the likelihoods P(5σ | noise) and P(5σ | signal) as well as on the prior probability, P(noise). The p-value itself does not generally "quantify the likelihood that the observations are mistaken and that the events are actually background noise mimicking a real signal." It does not specify the "probability that the result is due to chance." If one wants to quantify the probability that the data are a real signal rather than noise, then, for better or worse, one must turn to Bayes' rule.
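To see how much the inverse probability can differ from the p-value, here is a small sketch of Bayes' rule for two hypotheses, noise and signal. Every number in it is an assumption made for illustration: the likelihood under noise is simply set to the one-sided 5σ tail probability (a simplification), the likelihood under signal is a made-up 0.5, and the priors are hypothetical. None of these values comes from the experiments.

```python
def posterior_noise(p_data_given_noise: float,
                    p_data_given_signal: float,
                    prior_noise: float) -> float:
    """P(noise | data) by Bayes' rule when noise and signal are the only hypotheses."""
    prior_signal = 1.0 - prior_noise
    numerator = p_data_given_noise * prior_noise
    denominator = numerator + p_data_given_signal * prior_signal
    return numerator / denominator

# Illustrative inputs only (assumptions, not measured values).
P_DATA_GIVEN_NOISE = 2.87e-7   # roughly the one-sided 5 sigma tail
P_DATA_GIVEN_SIGNAL = 0.5      # hypothetical

for prior in (0.5, 0.999, 0.999999):
    post = posterior_noise(P_DATA_GIVEN_NOISE, P_DATA_GIVEN_SIGNAL, prior)
    print(f"P(noise) = {prior}: P(noise | data) = {post:.3g}")
```

With even prior odds, the posterior probability of noise under these assumptions is tiny, though still not equal to the p-value. With a prior that puts the odds of a real signal at only one in a million, the same data leave the probability of noise at roughly one in three. The p-value is identical in both cases; only the prior has changed.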

References (for physicists)

- Giulio D'Agostini, Bayesian Reasoning in High Energy Physics, CERN Yellow Report 99-03, July 1999
- Giulio D'Agostini, Probability and Measurement Uncertainty in Physics: A Bayesian Primer (1995)

A couple of other blogs (and one newspaper) making the same point




Professor Dennis Lindley, a major figure in the development of Bayesian methods (and known to some readers of this blog as the author of a classic paper on using them to identify glass fragments), posed a few questions on the Higgs boson experiment via the list server of the International Society for Bayesian Analysis. One well-informed set of answers came from Louis Lyons (organiser of the PHYSTAT series of meetings and a member of the CMS Collaboration at CERN). I posted a slightly edited version on July 11 under the title "More on Statistical Reasoning and the Higgs Boson." The full text of these and various other interesting messages is at
