Wednesday, July 11, 2012

More on Statistical Reasoning and the Higgs Boson

A posting of July 6, "The Probability that the Higgs Boson Has Been Discovered," mentioned the transposition of a p-value in stories in the popular press about the discovery of what is likely to be the Higgs Boson. Professor Dennis Lindley, a major figure in the development of Bayesian methods (and known to some readers of this blog as the author of a classic paper on using them to identify glass fragments), posed a few questions on the experiment via the list server of the International Society for Bayesian Analysis. One highly informed set of answers came from Louis Lyons (organiser of the PHYSTAT series of meetings and a member of the CMS Collaboration at CERN). The following is a slightly edited version of the comments of Lindley (DL) and Lyons (LL). The comments presuppose knowledge of the meaning of a p-value, a likelihood ratio, Bayes' rule, and the divide between frequentists and Bayesians. (The original text as well as many other interesting messages are at http://bayesian.org/forums/news/3648.)

DL:
Specifically, the news referred to a confidence interval with 5-sigma limits.

LL:
The test statistic we use for looking at p-values is basically the likelihood ratio for the two hypotheses (H_0 = Standard Model (S.M.) of Particle Physics, but no Higgs; H_1 = S.M. with Higgs). A small p_0 (and a reasonable p_1) then implies that H_1 is a better description of the data than H_0. This of course does not prove that H_1 is correct; Nature may instead correspond to some H_2 that is more like H_1 than it is like H_0. Indeed, in principle data will never prove a theory is true, but the more experimental tests it survives, the happier we are to use it -- e.g. Newtonian mechanics was fine for centuries till the arrival of Relativity.
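To make the roles of the likelihood ratio, p_0 and p_1 concrete, here is a minimal sketch for a single-bin counting experiment. It is a toy illustration, not the CMS analysis: the Poisson model, the expected background b, the expected signal s and the observed count are all made-up assumptions.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability of observing k events when `mean` are expected."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def log_likelihood_ratio(n, s, b):
    """ln[ L(n | H_1: s + b) / L(n | H_0: b) ] for one Poisson count n."""
    return n * math.log((s + b) / b) - s

def p0(n_obs, b):
    """P(N >= n_obs | H_0): chance that background alone gives this many events."""
    return max(0.0, 1.0 - sum(poisson_pmf(k, b) for k in range(n_obs)))

def p1(n_obs, s, b):
    """P(N <= n_obs | H_1): chance that signal plus background gives this few."""
    return sum(poisson_pmf(k, s + b) for k in range(n_obs + 1))

# Hypothetical numbers chosen only for illustration.
b, s, n_obs = 100.0, 50.0, 150
print("ln(likelihood ratio):", log_likelihood_ratio(n_obs, s, b))
print("p_0:", p0(n_obs, b))
print("p_1:", p1(n_obs, s, b))
```

In this toy model the likelihood ratio increases with the observed count, so a tail probability in the count is the same thing as a tail probability in the likelihood ratio. A small p_0 together with an unremarkable p_1 is the pattern described above.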

In the case of the Higgs, it can decay to different sets of particles, and these rates are defined by the S.M. We measure these ratios, but with large uncertainties with the present data. They are consistent with the S.M. predictions, but the agreement would be much more convincing with more data. Hence the caution about saying we have discovered the Higgs of the S.M.

DL:
Five standard deviations, assuming normality, means a p-value of around 0.0000005. A number of questions spring to mind.
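The arithmetic behind that figure can be checked with a short sketch that uses only the standard normal tail area. The function names are mine, and the choice between a one-sided and a two-sided tail is left open because the text does not say which is meant.

```python
import math

def one_sided_p(z):
    """Upper-tail probability P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def two_sided_p(z):
    """Two-tail probability P(|Z| >= z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2.0))

for z in (3.0, 4.0, 5.0):
    print(f"{z:.0f} sigma: one-sided {one_sided_p(z):.2e}, "
          f"two-sided {two_sided_p(z):.2e}")
```

At five standard deviations the two-sided value is about 5.7e-7, which matches "around 0.0000005", while the one-sided value is about 2.9e-7.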

1.  Why such an extreme evidence requirement? We know from a Bayesian perspective that this only makes sense if (a) the existence of the Higgs boson (or some other particle sharing some of its properties) has extremely small prior probability and/or (b) the consequences of erroneously announcing its discovery are dire in the extreme. Neither seems to be the case, so why 5-sigma?

LL:
This is an unfortunate tradition that is used more readily by journal editors than by particle physicists. The reasons are:
a) Historically we have had 3 and 4 sigma effects that have gone away.

b) The 'Look Elsewhere Effect' (LEE). We are worried about the chance of a statistical fluctuation mimicking our observation, not only at the given mass of 125 GeV but anywhere in the spectrum. The quoted p-values are 'local', i.e. the chance of a fluctuation at the observed mass. Unfortunately the LEE correction factor is not very precisely defined, because of ambiguities about what is meant by 'elsewhere'.

c) The possibility of some systematic effect (characterised by a nuisance parameter) being more important than allowed for in the analysis, or even overlooked - see the recent experiment at CERN which claimed that neutrinos travelled faster than the speed of light.

d) A subconscious use of Bayes' theorem to turn p-values into probabilities about the hypotheses.

All the above vary from experiment to experiment, so we realise that it is a bit unfair to use the same standard for discovery for all analyses. We prefer just to quote the p-values (or whatever).
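Reason (b), the Look Elsewhere Effect, can be given a rough numerical feel. The sketch below assumes a deliberately crude model in which 'elsewhere' consists of a fixed number of independent places where a fluctuation could have shown up. That trials count is a hypothetical input of mine, and as noted above the real correction is not so precisely defined.

```python
import math
from statistics import NormalDist

def global_p(local_p, trials):
    """Chance of at least one fluctuation with p <= local_p among `trials`
    independent places (a crude stand-in for the LEE correction)."""
    return -math.expm1(trials * math.log1p(-local_p))

def sigma_from_p(p):
    """Number of standard deviations with one-sided normal tail area p."""
    return -NormalDist().inv_cdf(p)

local_p = 2.87e-7  # roughly the one-sided tail area at 5 sigma
for trials in (1, 10, 100, 1000):  # hypothetical trials factors
    g = global_p(local_p, trials)
    print(f"trials={trials:5d}  global p={g:.2e}  "
          f"global significance={sigma_from_p(g):.2f} sigma")
```

Under these assumptions a local 5 sigma effect with a trials factor of 100 corresponds to a global significance of about 4 sigma, which is one way to see why a demanding local threshold is attractive.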

DL:
2. Rather than ad hoc justification of a p-value, it is of course better to do a proper Bayesian analysis.  Are the particle physics community completely wedded to frequentist analysis?

LL:
No, we are not anti-Bayesian, and indeed our test statistic is a likelihood ratio. If you like, you can regard our p-values as an attempt to calibrate the meaning of a particular value of the likelihood ratio.

We actually recommend that for parameter determination at the LHC, it is useful to compare Bayesian and Frequentist methods. But for comparing hypotheses (e.g. an experimental distribution is fitted by H_0 = a smooth distribution; or by H_1 = a smooth distribution plus a localised peak), we are worried about what priors to use for the extra parameters that occur in the alternative hypothesis. We would welcome advice.
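The worry about priors on the extra parameters can be illustrated with a toy Gaussian example, which is my own simplification and not anything used at the LHC. Suppose a single estimate x with known standard error se, with H_0 fixing the true value at 0 and H_1 giving it a Normal(0, tau^2) prior. The Bayes factor then depends strongly on the prior width tau, which is an assumed quantity here.

```python
import math

def normal_pdf(x, var):
    """Density at x of a normal distribution with mean 0 and variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def bayes_factor_10(x, se, tau):
    """Bayes factor for H_1 (mean ~ Normal(0, tau^2)) against H_0 (mean = 0).

    Under H_1 the estimate x is marginally Normal(0, se^2 + tau^2).
    """
    return normal_pdf(x, se ** 2 + tau ** 2) / normal_pdf(x, se ** 2)

# The same hypothetical 3-standard-error observation under different priors.
for tau in (1.0, 10.0, 100.0, 1000.0):
    bf = bayes_factor_10(x=3.0, se=1.0, tau=tau)
    print(f"prior width tau={tau:7.1f}  Bayes factor (H_1 vs H_0)={bf:.3f}")
```

In this sketch the identical observation favours H_1 when the prior is narrow and favours H_0 when the prior is very wide, so the conclusion is driven by a choice that the data cannot settle.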

DL:
3. We know that given enough data it is nearly always possible for a significance test to reject the null hypothesis at arbitrarily low p-values, simply because the parameter will never be exactly equal to its null value. And apparently the LHC has accumulated a very large quantity of data. So could even this extreme p-value be illusory?

LL:
We are aware of this. But in fact, although the LHC has accumulated enormous amounts of data, the Higgs search is like looking for a needle in a haystack. The final samples of events that are used to look for the Higgs contain only tens to thousands of events.
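The concern raised in question 3 can be sketched numerically. The example assumes a simple test of a mean with known standard deviation sigma and a fixed, practically negligible true departure delta from the null value. Both numbers are invented for illustration.

```python
import math

def z_score(delta, sigma, n):
    """Expected z statistic for a true offset delta estimated from n observations."""
    return delta * math.sqrt(n) / sigma

def one_sided_p(z):
    """Upper-tail probability P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

delta, sigma = 0.001, 1.0  # hypothetical tiny departure from the null value
for n in (10 ** 2, 10 ** 4, 10 ** 6, 10 ** 8):
    z = z_score(delta, sigma, n)
    print(f"n={n:>9d}  expected z={z:6.2f}  one-sided p={one_sided_p(z):.2e}")
```

The expected z grows like the square root of n, so with a sample of thousands of events a departure this small would remain invisible, which is the substance of the reply above.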

These and related issues are discussed to some extent in my article "Open statistical issues in Particle Physics", Ann. Appl. Stat. Volume 2, Number 3 (2008), 887-915. It is supposed to be statistician-friendly.
