Glen Cowan, SCMA4, 12-15 June, 2006
The small-n problem in High Energy Physics
Glen Cowan
Department of Physics
Royal Holloway, University of [email protected]/~cowan
Statistical Challenges in Modern Astronomy IV
June 12 - 15, 2006
Outline
I. High Energy Physics (HEP) overview
    Theory
    Experiments
    Data
II. The small-n problem, etc.
    Making a discovery
    Setting limits
    Systematic uncertainties
III. Conclusions
The current picture in particle physics
Matter... + force carriers...
photon (γ), W±, Z, gluon (g)
+ relativity + quantum mechanics + symmetries...
= “The Standard Model”
• almost certainly incomplete
• 25 free parameters (masses, coupling strengths,...)
• should include Higgs boson (not yet seen)
• no gravity yet
• agrees with all experimental observations so far
Experiments in High Energy Physics
HEP mainly studies particle collisions in accelerators, e.g.,
Large Electron-Positron (LEP) Collider at CERN, 1989-2000
4 detectors, each collaboration ~400 physicists.
More HEP experiments
LEP tunnel now used for the Large Hadron Collider (LHC)
proton-proton collisions, Ecm=14 TeV, very high luminosity
Two general purpose detectors: ATLAS and CMS
Each detector collaboration has ~2000 physicists
Data taking to start 2007
The ATLAS Detector
HEP data
Basic unit of data: an ‘event’.
Ideally, an event is a list of momentum vectors & particle types.
In practice, particles ‘reconstructed’ as tracks, clusters of energy deposited in calorimeters, etc.
Resolution, angular coverage, particle id, etc. imperfect.
An event from the ALEPH detector at LEP
Data samples
At LEP, event rates typically ~Hz or less
~$10^6$ Z boson events in 5 years for each of 4 experiments
At LHC, ~$10^9$ events/sec(!!!), mostly uninteresting;
do quick sifting, record ~200 events/sec
single event ~ 1 Mbyte
1 ‘year’ ≈ $10^7$ s, $10^{16}$ pp collisions per year,
2 billion / year recorded (~2 Pbyte / year)
For new/rare processes, rates at LHC can be vanishingly small
Higgs bosons detectable per year could be e.g. ~$10^3$
→ ‘needle in a haystack’
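The bookkeeping on this slide can be checked with a few lines of arithmetic. The following is a minimal sketch; the variable names are illustrative and the inputs are just the round numbers quoted above.

```python
# Round numbers quoted on the slide (order-of-magnitude estimates only).
collision_rate_hz = 1e9      # pp events per second at the LHC
recorded_rate_hz = 200       # events per second kept after quick sifting
event_size_bytes = 1e6       # about 1 Mbyte per event
seconds_per_year = 1e7       # one accelerator 'year'
higgs_per_year = 1e3         # example number of detectable Higgs bosons

collisions_per_year = collision_rate_hz * seconds_per_year
recorded_per_year = recorded_rate_hz * seconds_per_year
storage_pbyte = recorded_per_year * event_size_bytes / 1e15

# Fraction of all collisions that would contain a detectable Higgs boson.
higgs_fraction = higgs_per_year / collisions_per_year

print(f"collisions per year: {collisions_per_year:.0e}")
print(f"recorded per year:   {recorded_per_year:.0e}")
print(f"storage per year:    {storage_pbyte:.0f} Pbyte")
print(f"Higgs fraction:      {higgs_fraction:.0e}")
```

With these inputs the signal fraction is of order one in $10^{13}$ collisions, which is the sense of the 'needle in a haystack'.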
HEP game plan
Goals include:
Fill in the gaps in the Standard Model (e.g. find the Higgs)
Find something beyond the Standard Model (New Physics)
Example of an extension to SM: Supersymmetry (SUSY)
For every SM particle → SUSY partner (none yet seen!)
Minimal SUSY has 105 free parameters, constrained models ~5 parameters (plus the 25 from SM)
Provides dark matter candidate (neutralino), unification of gauge couplings, solution to hierarchy problem,...
Lightest SUSY particle can be stable (effectively invisible)
Simulated HEP data
Monte Carlo event generators available for essentially all Standard Model processes, also for many possible extensions to the SM (supersymmetric models, extra dimensions, etc.)
SM predictions rely on a variety of approximations (perturbation theory to limited order, phenomenological modeling of non-perturbative effects, etc.)
Monte Carlo programs also used to simulate detector response.
Simulated event for ATLAS
A simulated event
PYTHIA Monte Carlo
pp → gluino-gluino
The data stream
Experiment records events of different types, with different numbers of particles, kinematic properties, ...
Selecting events
To search for events of a given type (H0: ‘signal’), need discriminating variable(s) distributed as differently as possible relative to unwanted event types (H1: ‘background’)
Count number of events in acceptance region defined by ‘cuts’
Expected number of signal events: $s = \sigma_s \varepsilon_s L$
Expected number of background events: $b = \sigma_b \varepsilon_b L$
$\sigma_s$, $\sigma_b$ = cross section for signal, background
‘Efficiencies’: $\varepsilon_s = P(\text{accept} \mid s)$, $\varepsilon_b = P(\text{accept} \mid b)$
$L$ = integrated luminosity (related to beam intensity, data taking time)
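The expected event counts are simple products of cross section, efficiency and integrated luminosity. A minimal sketch follows; the function name, the choice of units (pb and pb^-1) and all input values are assumptions made only to illustrate the arithmetic.

```python
def expected_events(cross_section_pb, efficiency, luminosity_inv_pb):
    """Expected number of selected events: cross section * efficiency * luminosity."""
    return cross_section_pb * efficiency * luminosity_inv_pb

# Hypothetical inputs, not taken from any real analysis.
lumi = 1.0e4                                  # integrated luminosity in pb^-1
s = expected_events(0.5, 0.30, lumi)          # signal: 0.5 pb, 30% efficiency
b = expected_events(2.0e3, 1.0e-4, lumi)      # background: 2000 pb, 0.01% efficiency

print(f"s = {s:.0f}, b = {b:.0f}")
```

The example shows why the cuts matter: a background cross section thousands of times larger than the signal can still give a comparable event count if the background efficiency is made small enough.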
Poisson data with background
Count n events, e.g., in fixed time or integrated luminosity.
s = expected number of signal events
b = expected number of background events
n ~ Poisson(s+b): $P(n; s, b) = \frac{(s+b)^n}{n!} e^{-(s+b)}$
Sometimes b known, other times it is in some way uncertain.
Goals:
(i) convince people that s ≠ 0 (discovery);
(ii) measure or place limits on s, taking into consideration the uncertainty in b.
Widely discussed in HEP community, see e.g. proceedings of PHYSTAT meetings, Durham, Fermilab, CERN workshops...
Making a discovery
Often compute p-value of the ‘background only’ hypothesis H0 using test variable related to a characteristic of the signal.
p-value = Probability to see data as incompatible with H0, or more so, relative to the data observed.
Requires definition of ‘incompatible with H0’
HEP folklore: claim discovery if p-value equivalent to a 5σ fluctuation of Gaussian variable (one-sided)
Actual p-value at which discovery becomes believable will depend on signal in question (subjective)
Why not do Bayesian analysis?
Usually don’t know how to assign meaningful prior probabilities
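To make the 5σ folklore concrete, the one-sided Gaussian tail probability can be evaluated with the complementary error function. This is a small sketch; the function name is illustrative.

```python
import math

def p_value_from_sigma(z):
    """One-sided tail probability of a standard Gaussian above z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p5 = p_value_from_sigma(5.0)
print(f"p-value for a 5 sigma effect: {p5:.2e}")
```

The result is about 2.9 × 10^-7, which is the p-value threshold implied by the one-sided 5σ convention.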
Computing p-values
For n ~ Poisson (s+b) we compute p-value of H0 : s = 0
Often we don’t simply count events but also measure for each event one or more quantities
number of events observed n replaced by numbers of events ($n_1, ..., n_N$) in a histogram
Goodness-of-fit variable could be e.g. Pearson’s $\chi^2$
Example: search for the Higgs boson at LEP
Several usable signal modes:
Important background from e+e− → ZZ
Mass of jet pair = mass of Higgs boson; b jets contain tracks not from interaction point
b-jet pair of virtual Z can mimic Higgs
A candidate Higgs event
17 ‘Higgs like’ candidates seen but no claim of discovery -- p-value of s=0 (background only) hypothesis ≈ 0.09
Setting limits
Frequentist intervals (limits) for a parameter s can be found by defining a test of the hypothesized value s (do this for all s):
Specify values of the data n that are ‘disfavoured’ by s (critical region) such that P(n in critical region) ≤ $\gamma$ for a prespecified $\gamma$, e.g., 0.05 or 0.1.
(Because of discrete data, need inequality here.)
If n is observed in the critical region, reject the value s.
Now invert the test to define a confidence interval as:
set of s values that would not be rejected in a test of size $\gamma$ (confidence level is $1 - \gamma$).
The interval will cover the true value of s with probability ≥ $1 - \gamma$.
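Setting limits: ‘classical method’
E.g. for upper limit on s, take critical region to be low values of n; the limit $s_{up}$ at confidence level $1 - \beta$ is thus found from
$$\beta = P(n \le n_{obs}; s_{up}, b) = \sum_{n=0}^{n_{obs}} \frac{(s_{up}+b)^n}{n!} e^{-(s_{up}+b)}$$
Similarly for lower limit $s_{lo}$ at confidence level $1 - \alpha$,
$$\alpha = P(n \ge n_{obs}; s_{lo}, b) = \sum_{n=n_{obs}}^{\infty} \frac{(s_{lo}+b)^n}{n!} e^{-(s_{lo}+b)}$$
Sometimes choose $\alpha = \beta = \gamma/2$ → central confidence interval with confidence level $1 - \gamma$.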
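Calculating classical limits
To solve for $s_{lo}$, $s_{up}$ (lower and upper limits at confidence levels $1 - \alpha$ and $1 - \beta$), can exploit relation to $\chi^2$ distribution:
$$s_{lo} = \tfrac{1}{2} F^{-1}_{\chi^2}(\alpha;\, 2 n_{obs}) - b, \qquad s_{up} = \tfrac{1}{2} F^{-1}_{\chi^2}(1 - \beta;\, 2(n_{obs} + 1)) - b$$
where $F^{-1}_{\chi^2}$ is the quantile of the $\chi^2$ distribution for the given number of degrees of freedom.
For low fluctuation of n this can give negative result for $s_{up}$; i.e. confidence interval is empty.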
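As a cross-check that does not need a $\chi^2$ quantile routine, the classical upper limit can also be found by solving the Poisson tail condition directly. The sketch below uses bisection on the total mean $s_{up} + b$; the function names and the search bracket are assumptions, and the bracket is only adequate for tail probabilities of order 0.01 or larger.

```python
import math

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu >= 0."""
    term = math.exp(-mu)      # P(n = 0)
    total = term
    for k in range(1, n_obs + 1):
        term *= mu / k        # P(n = k) obtained from P(n = k - 1)
        total += term
    return total

def classical_upper_limit(n_obs, b, beta=0.05):
    """Solve beta = P(n <= n_obs; s_up + b) for s_up by bisection."""
    lo, hi = 0.0, 10.0 + 5.0 * (n_obs + 1)   # bracket for the total mean s_up + b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > beta:
            lo = mid          # tail probability still too large: raise the mean
        else:
            hi = mid
    return 0.5 * (lo + hi) - b

print(classical_upper_limit(0, 0.0))   # no events, no background
print(classical_upper_limit(0, 5.0))   # no events, b = 5: negative limit
```

The second call illustrates the problem noted above: with n = 0 and b = 5 the solution for $s_{up}$ is negative, so the interval for a non-negative signal is empty.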
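Likelihood ratio limits (Feldman-Cousins)
Define likelihood ratio for hypothesized parameter value s:
$$l(s) = \frac{P(n; s, b)}{P(n; \hat{s}, b)}, \qquad \hat{s} = \max(0,\, n - b)$$
Here $\hat{s}$ is the ML estimator, note $0 \le l(s) \le 1$.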
Critical region defined by low values of likelihood ratio.
Resulting intervals can be one- or two-sided (depending on n).
(Re)discovered for HEP by Feldman and Cousins, Phys. Rev. D 57 (1998) 3873.
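The construction can be sketched numerically: for each hypothesized s, values of n are added to the acceptance region in decreasing order of the likelihood ratio until the region holds at least the required probability, and the interval is the set of s whose acceptance region contains the observed n. This is a simplified sketch on a coarse grid, not the authors' code; the function names, the grid step, the scan range and the cut-off on n are all assumptions.

```python
import math

def pois(n, mu):
    """Poisson probability P(n; mu), with the mu = 0 case handled explicitly."""
    if mu == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.01, n_max=60):
    """Scan s on a grid; keep the values whose acceptance region contains n_obs."""
    accepted = []
    for i in range(int(round(s_max / step)) + 1):
        s = i * step
        probs = [pois(n, s + b) for n in range(n_max + 1)]
        # Likelihood ratio with s_hat = max(0, n - b), i.e. best-fit mean max(n, b).
        ratio = [probs[n] / pois(n, max(float(n), b)) for n in range(n_max + 1)]
        order = sorted(range(n_max + 1), key=lambda n: ratio[n], reverse=True)
        total = 0.0
        region = set()
        for n in order:       # add n values in decreasing order of the ratio
            region.add(n)
            total += probs[n]
            if total >= cl:
                break
        if n_obs in region:
            accepted.append(s)
    return min(accepted), max(accepted)

print(fc_interval(0, 0.0))    # n = 0, b = 0 at 90% CL
```

The scan range `s_max` must be large enough to contain the whole interval for the observed n; the default here is only suitable for small counts.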
Coverage probability of confidence intervals
Because of discreteness of Poisson data, probability for interval to include true value in general > confidence level (‘over-coverage’)
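Over-coverage can be seen directly for the classical upper limit. The sketch below uses the fact that the true s lies below the upper limit for a given n exactly when the Poisson tail probability P(N ≤ n; s + b) is at least the chosen tail size; the function name, the symbol `beta` for that tail size and the cut-off `n_max` are assumptions.

```python
import math

def upper_limit_coverage(s_true, b, beta=0.05, n_max=200):
    """Probability that the classical upper limit at CL = 1 - beta lies above s_true."""
    mu = s_true + b
    term = math.exp(-mu)      # P(n = 0)
    cdf = 0.0
    covered = 0.0
    for n in range(n_max + 1):
        if n > 0:
            term *= mu / n    # P(n) obtained from P(n - 1)
        cdf += term
        # s_true is below the upper limit for this n when P(N <= n) >= beta.
        if cdf >= beta:
            covered += term
    return covered

for s in (1.0, 3.5, 6.0):
    print(f"s = {s}: coverage = {upper_limit_coverage(s, 0.0):.4f}")
```

The coverage jumps as s crosses the values at which another n drops out of the covered set, and it never falls below the nominal confidence level.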
More on intervals from LR test (Feldman-Cousins)
Caveat with coverage: suppose we find n >> b.
Usually one then quotes a measurement:
If, however, n isn’t large enough to claim discovery, one sets a limit on s.
FC pointed out that if this decision is made based on n, then the actual coverage probability of the interval can be less than the stated confidence level (‘flip-flopping’).
FC intervals remove this, providing a smooth transition from 1- to 2-sided intervals, depending on n.
But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL, p-value of s=0 still substantial. Part of upper-limit ‘wasted’?
Properties of upper limits
[Figure panels: upper limit $s_{up}$ vs. n; mean upper limit vs. s]
Example: take b = 5.0, confidence level = 0.95
Upper limit versus b
If n = 0 observed, should upper limit depend on b?
Classical: yes
Bayesian: no
FC: yes
Feldman & Cousins, PRD 57 (1998) 3873
Nuisance parameters and limits
In general we don’t know the background b perfectly.
Suppose we have a measurement of b, e.g., $b_{meas} \sim N(b, \sigma_b)$
So the data are really: n events and the value bmeas.
In principle the confidence interval recipe can be generalized to two measurements and two parameters.
Difficult and not usually attempted, but see e.g. talks by K. Cranmer at PHYSTAT03, G. Punzi at PHYSTAT05.
G. Punzi, PHYSTAT05
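Bayesian limits with uncertainty on b
Uncertainty on b goes into the prior, e.g.,
$$\pi(s, b) = \pi_s(s)\,\pi_b(b)$$
with, for example, $\pi_b(b)$ a Gaussian centred on $b_{meas}$ with standard deviation $\sigma_b$.
Put this into Bayes’ theorem,
$$p(s, b \mid n) \propto P(n \mid s, b)\,\pi(s, b)$$
Marginalize over b, then use p(s|n) to find intervals for s with any desired probability content.
For b = 0, $\sigma_b$ = 0, $\pi_s(s)$ = const. (s > 0), Bayesian upper limit coincides with classical one.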
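The marginalization can be carried out numerically on a grid. The sketch below assumes a flat prior in s for s ≥ 0 and a Gaussian prior in b truncated at zero; these choices, the function names and the grid settings are all assumptions made for illustration.

```python
import math

def pois(n, mu):
    """Poisson probability P(n; mu), with the mu = 0 case handled explicitly."""
    if mu == 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def bayes_upper_limit(n_obs, b_meas, sigma_b, cl=0.95, s_max=30.0, ds=0.01):
    """Upper limit on s from p(s|n): flat prior in s, Gaussian prior in b (b >= 0)."""
    if sigma_b > 0.0:
        lo = max(0.0, b_meas - 5.0 * sigma_b)
        hi = b_meas + 5.0 * sigma_b
        b_grid = [lo + (hi - lo) * j / 100 for j in range(101)]
        weights = [math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2) for b in b_grid]
    else:
        b_grid, weights = [b_meas], [1.0]     # background known exactly

    s_grid = [i * ds for i in range(int(round(s_max / ds)) + 1)]
    # Marginal likelihood: sum over b of P(n; s + b) * prior(b), up to a constant.
    marg = [sum(w * pois(n_obs, s + b) for b, w in zip(b_grid, weights))
            for s in s_grid]

    # Cumulative posterior by the trapezoidal rule, normalised by its total.
    cum = [0.0]
    for i in range(1, len(s_grid)):
        cum.append(cum[-1] + 0.5 * (marg[i - 1] + marg[i]) * ds)
    for s, c in zip(s_grid, cum):
        if c >= cl * cum[-1]:
            return s
    return s_max

print(bayes_upper_limit(0, 0.0, 0.0))   # n = 0, no background
print(bayes_upper_limit(0, 5.0, 1.0))   # n = 0, b = 5 +- 1
```

For n = 0 the likelihood factorizes into a function of s times a function of b, so under these priors the limit does not depend on the background at all; both calls return essentially the same value, close to 3.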
Cousins-Highland method
Regard b as random, characterized by pdf $\pi_b(b)$.
Makes sense in Bayesian approach, but in frequentist model b is constant (although unknown).
A measurement $b_{meas}$ is random but this is not the mean number of background events, rather, b is.
Compute anyway
$$P(n; s) = \int P(n; s, b)\,\pi_b(b)\,db$$
This would be the probability for n if Nature were to generate a new value of b upon repetition of the experiment with $\pi_b(b)$.
Now e.g. use this P(n;s) in the classical recipe for upper limit at CL = $1 - \beta$:
$$\beta = P(n \le n_{obs}; s_{up})$$
Widely used method in HEP.
‘Integrated likelihoods’
Consider again signal s and background b, suppose we have uncertainty in b characterized by a prior pdf $\pi_b(b)$.
Define integrated likelihood as
$$L'(s) = \int L(s, b)\,\pi_b(b)\,db$$
also called modified profile likelihood, in any case not a real likelihood.
Now use this to construct likelihood-ratio test and invert to obtain confidence intervals.
Feldman-Cousins & Cousins-Highland (FHC2), see e.g. J. Conrad et al., Phys. Rev. D67 (2003) 012002 and Conrad/Tegenfeldt PHYSTAT05 talk.
Calculators available (Conrad, Tegenfeldt, Barlow).
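Digression: tangent plane method
Consider least-squares fit with parameter of interest $\theta_0$ and nuisance parameter $\theta_1$, i.e., minimize $\chi^2(\theta_0, \theta_1)$.
Correlation between $\hat{\theta}_0$ and $\hat{\theta}_1$ causes errors to increase.
Standard deviations from tangent lines to contour $\chi^2 = \chi^2_{min} + 1$.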
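The ‘tangent plane’ method is a special case of using the profile likelihood:
$$L'(\theta_0) = \max_{\theta_1} L(\theta_0, \theta_1)$$
The profile likelihood $L'(\theta_0)$ is found by maximizing $L(\theta_0, \theta_1)$ for each $\theta_0$.
Equivalently use
$$\chi^{2\prime}(\theta_0) = \min_{\theta_1} \chi^2(\theta_0, \theta_1)$$
The interval obtained from $\chi^{2\prime}(\theta_0) = \chi^{2\prime}_{min} + 1$ is the same as what is obtained from the tangents to $\chi^2(\theta_0, \theta_1) = \chi^2_{min} + 1$.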
Well known in HEP as the ‘MINOS’ method in MINUIT.
See e.g. talks by Reid, Cranmer, Rolke at PHYSTAT05.
Interval from inverting profile LR test
Suppose we have a measurement bmeas of b.
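Build the likelihood ratio test with profile likelihood:
$$l(s) = \frac{L(s, \hat{\hat{b}}(s))}{L(\hat{s}, \hat{b})}$$
where $\hat{\hat{b}}(s)$ maximizes $L$ for the given $s$, and use this to construct confidence intervals.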
Not widely used in HEP but recommended in e.g. Kendall & Stuart; see also PHYSTAT05 talks by Cranmer, Feldman, Cousins, Reid.
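For the Poisson count with a Gaussian measurement of the background, the profile likelihood ratio has a closed form, since the value of b that maximizes the likelihood at fixed s is the root of a quadratic. The sketch below assumes n > 0, does not constrain s or b to be non-negative, and uses the large-sample threshold −2 ln l(s) ≤ 1 for an approximate 68.3% interval; the function names and the example numbers are hypothetical.

```python
import math

def log_like(n, s, b, b_meas, sigma_b):
    """ln L(s, b) for n ~ Poisson(s + b), b_meas ~ N(b, sigma_b); constants dropped."""
    mu = s + b
    return n * math.log(mu) - mu - 0.5 * ((b_meas - b) / sigma_b) ** 2

def profiled_b(n, s, b_meas, sigma_b):
    """Value of b that maximises ln L for fixed s (positive root of a quadratic)."""
    var = sigma_b ** 2
    B = s + var - b_meas
    C = var * (n - s) + b_meas * s
    return 0.5 * (-B + math.sqrt(B * B + 4.0 * C))

def q_profile(n, s, b_meas, sigma_b):
    """-2 ln l(s), with unconstrained estimators s_hat = n - b_meas, b_hat = b_meas."""
    best = log_like(n, n - b_meas, b_meas, b_meas, sigma_b)
    b_hh = profiled_b(n, s, b_meas, sigma_b)
    return -2.0 * (log_like(n, s, b_hh, b_meas, sigma_b) - best)

# Hypothetical example: n = 100 observed, b_meas = 20 with sigma_b = 2.
n, b_meas, sigma_b = 100, 20.0, 2.0
grid = [60.0 + 0.01 * i for i in range(4001)]          # scan s from 60 to 100
inside = [s for s in grid if q_profile(n, s, b_meas, sigma_b) <= 1.0]
s_lo, s_hi = min(inside), max(inside)
print(f"s_hat = {n - b_meas:.1f}, interval = [{s_lo:.2f}, {s_hi:.2f}]")
```

In this large-n example the half-width of the interval is close to the naive error propagation $\sqrt{n + \sigma_b^2}$, as expected; for small n the interval becomes asymmetric and the threshold of 1 is only approximate.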
Wrapping up,
Frequentist methods have been most widely used but for many questions (particularly related to systematics), Bayesian methods are getting more notice.
Frequentist properties such as coverage probability of confidence intervals seen as very important (overly so?)
Bayesian methods remain problematic in cases where it is difficult to enumerate alternative hypotheses and assign meaningful prior probabilities.
Tools widely applied at LEP; some work needed to extend these to LHC analyses (ongoing).
Finally,
The LEP programme was dominated by limit setting:
Standard Model confirmed, No New Physics
The Tevatron discovered the top quark and Bs mixing (both parts of the SM) and also set many limits (but NNP)
By ~2012 either we’ll have discovered something new and interesting beyond the Standard Model,
or,
we’ll still be setting limits and HEP should think seriously about a new approach!
Extra slides
A recent discovery: Bs oscillations
Recently the D0 experiment (Fermilab) announced the discovery of Bs mixing:
Moriond talk by Brendan Casey, also hep-ex/0603029
Produce a Bq meson at time t=0; there is a time dependent probability for it to decay as an anti-Bq (q = d or s):
|Vts| ≫ |Vtd| and so Bs oscillates quickly compared to decay rate
Sought but not seen at LEP; early on predicted to be visible at Tevatron
Discovery quickly confirmed by the CDF experiment
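Confidence interval from likelihood function
In the large sample limit it can be shown for ML estimators:
$$\hat{\theta} \sim N(\theta, V)$$
(n-dimensional Gaussian, covariance V)
$$Q(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^T V^{-1} (\hat{\theta} - \theta)$$
defines a hyper-ellipsoidal confidence region,
If $\hat{\theta} \sim N(\theta, V)$ then $Q(\hat{\theta}, \theta) \sim \chi^2_n$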
Approximate confidence regions from $L(\theta)$
So the recipe to find the confidence region with CL = $1 - \gamma$ is:
$$\ln L(\theta) = \ln L_{max} - \frac{Q_\gamma}{2}, \qquad Q_\gamma = F^{-1}_{\chi^2_n}(1 - \gamma)$$
For finite samples, these are approximate confidence regions.
Coverage probability not guaranteed to be equal to $1 - \gamma$;
no simple theorem to say by how far off it will be (use MC).
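Upper limit from test of hypothesized $\Delta m_s$
Base test on likelihood ratio (here $\theta = \Delta m_s$):
Observed value is $l_{obs}$, sampling distribution is $g(l; \theta)$ (from MC)
$\theta$ is excluded at CL = $1 - \gamma$ if the p-value of $\theta$, computed from $g(l; \theta)$ and $l_{obs}$, is less than $\gamma$
D0 shows the distribution of ln l for $\Delta m_s$ = 25 ps$^{-1}$
equivalent to 2.1σ effect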
95% CL upper limit
The significance of an observed signal
Suppose b = 0.5, and we observe $n_{obs}$ = 5.
p-value = P(n ≥ 5; b = 0.5, s = 0) = 1.7 × $10^{-4}$
Often, however, b has some uncertainty
this can have significant impact on p-value, e.g. if b = 0.8, p-value = 1.4 × $10^{-3}$
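The sensitivity of the p-value to the assumed background can be reproduced by summing the Poisson tail for the two background values quoted on this slide. This is a minimal sketch; the function name is illustrative.

```python
import math

def poisson_p_value(n_obs, b):
    """P(n >= n_obs) for n ~ Poisson(b), i.e. the p-value of the s = 0 hypothesis."""
    term = math.exp(-b)       # P(n = 0)
    below = 0.0
    for k in range(n_obs):
        below += term         # accumulate P(n = k) for k < n_obs
        term *= b / (k + 1)
    return 1.0 - below

for b in (0.5, 0.8):
    print(f"b = {b}: p-value = {poisson_p_value(5, b):.1e}")
```

Raising b from 0.5 to 0.8 increases the p-value by nearly an order of magnitude, which is why the uncertainty in b cannot be ignored when quoting a significance.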
The significance of a peak
Suppose we measure a value x for each event and find:
Each bin (observed) is a Poisson r.v., means are given by dashed lines.
In the two bins with the peak, 11 entries found with b = 3.2.
We are tempted to compute the p-value for the s = 0 hypothesis as:
$$P(n \ge 11;\ b = 3.2,\ s = 0) = 5.0 \times 10^{-4}$$
The significance of a peak (2)
But... did we know where to look for the peak?
→ give P(n ≥ 11) in any 2 adjacent bins
Is the observed width consistent with the expected x resolution?
→ take x window several times the expected resolution
How many bins × distributions have we looked at?
→ look at a thousand of them, you’ll find a $10^{-3}$ effect
Did we adjust the cuts to ‘enhance’ the peak?
→ freeze cuts, repeat analysis with new data
How about the bins to the sides of the peak... (too low!)
Should we publish????
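The point about looking in many places can be quantified with a simple trials factor. The sketch below assumes the looks are statistically independent, which is an idealization; the function name is illustrative.

```python
def global_p_value(p_local, n_trials):
    """Chance that at least one of n_trials independent looks gives p <= p_local."""
    return 1.0 - (1.0 - p_local) ** n_trials

print(f"{global_p_value(1e-3, 1000):.2f}")
```

With a thousand independent looks, the chance of seeing at least one effect at the $10^{-3}$ level is about 0.63, so such an effect on its own is unremarkable.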
Statistical vs. systematic errors
Statistical errors:
How much would the result fluctuate upon repetition of the measurement?
Implies some set of assumptions to define probability of outcome of the measurement.
Systematic errors:
What is the uncertainty in my result due to uncertainty in my assumptions, e.g.,
model (theoretical) uncertainty;
modeling of measurement apparatus.
The sources of error do not vary upon repetition of the measurement. Often result from uncertain value of, e.g., calibration constants, efficiencies, etc.
Systematic errors and nuisance parameters
Response of measurement apparatus is never modeled perfectly:
[Figure: measured value y vs. true value x, showing model and truth curves]
Model can be made to approximate better the truth by including more free parameters.
systematic uncertainty ↔ nuisance parameters