
Glen Cowan, SCMA4, 12-15 June, 2006


The small-n problem in High Energy Physics

Glen Cowan
Department of Physics
Royal Holloway, University of London
[email protected]/~cowan

Statistical Challenges in Modern Astronomy IV


Outline

I. High Energy Physics (HEP) overview
   Theory
   Experiments
   Data

II. The small-n problem, etc.
   Making a discovery
   Setting limits
   Systematic uncertainties

III. Conclusions


The current picture in particle physics

Matter... + force carriers...

photon ($\gamma$), $W^{\pm}$, $Z$, gluon ($g$)

+ relativity + quantum mechanics + symmetries...

= “The Standard Model”

• almost certainly incomplete
• 25 free parameters (masses, coupling strengths,...)
• should include Higgs boson (not yet seen)
• no gravity yet
• agrees with all experimental observations so far


Experiments in High Energy Physics


HEP mainly studies particle collisions in accelerators, e.g.,

Large Electron-Positron (LEP) Collider at CERN, 1989-2000

4 detectors, each collaboration ~400 physicists.


More HEP experiments


LEP tunnel now used for the Large Hadron Collider (LHC)

proton-proton collisions, Ecm=14 TeV, very high luminosity

Two general purpose detectors: ATLAS and CMS

Each detector collaboration has ~2000 physicists

Data taking to start 2007

The ATLAS Detector

HEP data


Basic unit of data: an ‘event’.

Ideally, an event is a list of momentum vectors & particle types.

In practice, particles ‘reconstructed’ as tracks, clusters of energy deposited in calorimeters, etc.

Resolution, angular coverage, particle id, etc. imperfect.

An event from the ALEPH detector at LEP


Data samples

At LEP, event rates typically ~Hz or less

~$10^6$ Z boson events in 5 years for each of 4 experiments

At LHC, ~$10^9$ events/sec(!!!), mostly uninteresting;

do quick sifting, record ~200 events/sec

single event ~ 1 Mbyte

1 ‘year’ ≈ $10^7$ s, $10^{16}$ pp collisions per year,

2 billion / year recorded (~2 Pbyte / year)

For new/rare processes, rates at LHC can be vanishingly small

Higgs bosons detectable per year could be e.g. ~$10^3$

→ ‘needle in a haystack’
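The rates quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below only multiplies the numbers given above; the variable names are illustrative and not part of the talk.

```python
# Back-of-envelope check of the LHC data volumes quoted on the slide.
SECONDS_PER_YEAR = 1e7      # one accelerator 'year' of running
COLLISION_RATE_HZ = 1e9     # pp collisions (events) per second
RECORDED_RATE_HZ = 200.0    # events kept per second after the quick sifting
EVENT_SIZE_BYTES = 1e6      # ~1 Mbyte per recorded event
HIGGS_PER_YEAR = 1e3        # illustrative number of detectable Higgs bosons

collisions_per_year = COLLISION_RATE_HZ * SECONDS_PER_YEAR
recorded_per_year = RECORDED_RATE_HZ * SECONDS_PER_YEAR
storage_pbyte = recorded_per_year * EVENT_SIZE_BYTES / 1e15
higgs_fraction = HIGGS_PER_YEAR / collisions_per_year

print(f"collisions/year: {collisions_per_year:.1e}")
print(f"recorded/year:   {recorded_per_year:.1e}")
print(f"storage/year:    {storage_pbyte:.1f} Pbyte")
print(f"Higgs fraction:  {higgs_fraction:.1e}")
```

With these inputs only about one collision in $10^{13}$ would contain a detectable Higgs boson, which is the quantitative content of the ‘needle in a haystack’ remark.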


HEP game plan


Goals include:

Fill in the gaps in the Standard Model (e.g. find the Higgs)

Find something beyond the Standard Model (New Physics)

Example of an extension to SM: Supersymmetry (SUSY)

For every SM particle → SUSY partner (none yet seen!)

Minimal SUSY has 105 free parameters, constrained models ~5 parameters (plus the 25 from SM)

Provides dark matter candidate (neutralino), unification of gauge couplings, solution to hierarchy problem,...

Lightest SUSY particle can be stable (effectively invisible)


Simulated HEP data

Monte Carlo event generators available for essentially all Standard Model processes, also for many possible extensions to the SM (supersymmetric models, extra dimensions, etc.)

SM predictions rely on a variety of approximations (perturbation theory to limited order, phenomenological modeling of non-perturbative effects, etc.)

Monte Carlo programs also used to simulate detector response.

Simulated event for ATLAS


A simulated event

PYTHIA Monte Carlo: pp → gluino-gluino


The data stream

Experiment records events of different types, with different numbers of particles, kinematic properties, ...


Selecting events

To search for events of a given type (H0: ‘signal’), need discriminating variable(s) distributed as differently as possible relative to unwanted event types (H1: ‘background’)

Count number of events in acceptance region defined by ‘cuts’

Expected number of signal events: $s = \sigma_s \varepsilon_s L$

Expected number of background events: $b = \sigma_b \varepsilon_b L$

$\sigma_s$, $\sigma_b$ = cross section for signal, background

‘Efficiencies’: $\varepsilon_s = P(\mathrm{accept} \mid s)$, $\varepsilon_b = P(\mathrm{accept} \mid b)$

$L$ = integrated luminosity (related to beam intensity, data taking time)
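As a small illustration of the expected-count formula (cross section times efficiency times integrated luminosity), the sketch below evaluates s and b. The function name, the units (pb and pb^-1) and all numerical inputs are hypothetical and chosen only for the example.

```python
def expected_events(cross_section_pb, efficiency, luminosity_inv_pb):
    """Expected count = cross section x selection efficiency x integrated luminosity."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency is a probability and must lie in [0, 1]")
    return cross_section_pb * efficiency * luminosity_inv_pb

# Hypothetical inputs (not from the talk): a rare signal with decent efficiency
# and a much larger background that the cuts suppress strongly.
LUMINOSITY = 100.0                                # integrated luminosity in pb^-1
s = expected_events(0.5, 0.20, LUMINOSITY)        # signal
b = expected_events(50.0, 0.001, LUMINOSITY)      # background

print(f"s = {s:.1f}, b = {b:.1f}")
```

Even though the background cross section here is 100 times that of the signal, the difference in efficiencies leaves s larger than b in the acceptance region, which is the purpose of the cuts.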



Poisson data with background

Count n events, e.g., in fixed time or integrated luminosity.

s = expected number of signal events

b = expected number of background events

n ~ Poisson(s+b):

$P(n; s, b) = \frac{(s+b)^n}{n!} e^{-(s+b)}$

Sometimes b known, other times it is in some way uncertain.

Goals:
(i) convince people that s ≠ 0 (discovery);
(ii) measure or place limits on s, taking into consideration the uncertainty in b.

Widely discussed in HEP community, see e.g. proceedings of PHYSTAT meetings, Durham, Fermilab, CERN workshops...


Making a discovery


Often compute p-value of the ‘background only’ hypothesis H0 using test variable related to a characteristic of the signal.

p-value = Probability to see data as incompatible with H0, or more so, relative to the data observed.

Requires definition of ‘incompatible with H0’

HEP folklore: claim discovery if p-value equivalent to a $5\sigma$ fluctuation of Gaussian variable (one-sided)

Actual p-value at which discovery becomes believable will depend on signal in question (subjective)

Why not do Bayesian analysis?

Usually don’t know how to assign meaningful prior probabilities
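The 5σ convention mentioned on this slide corresponds to a definite one-sided Gaussian tail probability. The sketch below converts between a significance Z and a p-value using only the Python standard library; the function names are illustrative.

```python
import math
from statistics import NormalDist

def p_from_z(z):
    """One-sided tail probability of a standard Gaussian above z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def z_from_p(p):
    """Significance Z such that the one-sided Gaussian tail above Z equals p."""
    return -NormalDist().inv_cdf(p)

p_five_sigma = p_from_z(5.0)
print(f"p-value for 5 sigma: {p_five_sigma:.3g}")
print(f"Z for p = 0.09:      {z_from_p(0.09):.2f}")
```

The second line of output expresses a p-value of 0.09 as an equivalent Gaussian significance, which makes it easy to compare a result against the 5σ threshold.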


Computing p-values

For n ~ Poisson(s+b) we compute p-value of H0 : s = 0

$p = P(n \ge n_{obs}; s = 0, b) = \sum_{n = n_{obs}}^{\infty} \frac{b^n}{n!} e^{-b}$

Often we don’t simply count events but also measure for each event one or more quantities

number of events observed n replaced by numbers of events ($n_1, \ldots, n_N$) in a histogram

Goodness-of-fit variable could be e.g. Pearson’s $\chi^2$


Example: search for the Higgs boson at LEP


Several usable signal modes:

Important background from $e^+e^- \to ZZ$

Mass of jet pair = mass of Higgs boson; b jets contain tracks not from interaction point

b-jet pair of virtual Z can mimic Higgs


A candidate Higgs event


17 ‘Higgs like’ candidates seen but no claim of discovery -- p-value of s=0 (background only) hypothesis ≈ 0.09


Setting limits

Frequentist intervals (limits) for a parameter s can be found by defining a test of the hypothesized value s (do this for all s):

Specify values of the data n that are ‘disfavoured’ by s (critical region) such that P(n in critical region) ≤ $\gamma$ for a prespecified $\gamma$, e.g., 0.05 or 0.1.

(Because of discrete data, need inequality here.)

If n is observed in the critical region, reject the value s.

Now invert the test to define a confidence interval as:

set of s values that would not be rejected in a test of size $\gamma$ (confidence level is $1 - \gamma$).

The interval will cover the true value of s with probability ≥ $1 - \gamma$.


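Setting limits: ‘classical method’

E.g. for upper limit on s, take critical region to be low values of n, limit $s_{up}$ at confidence level $1 - \beta$ thus found from

$\beta = P(n \le n_{obs}; s_{up}, b) = \sum_{n=0}^{n_{obs}} \frac{(s_{up}+b)^n}{n!} e^{-(s_{up}+b)}$

Similarly for lower limit at confidence level $1 - \alpha$,

$\alpha = P(n \ge n_{obs}; s_{lo}, b)$

Sometimes choose $\alpha = \beta = \gamma/2$ → central confidence interval at confidence level $1 - \gamma$.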


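Calculating classical limits

To solve for $s_{lo}$, $s_{up}$ (lower and upper limits at confidence levels $1 - \alpha$ and $1 - \beta$), can exploit relation to $\chi^2$ distribution:

$s_{lo} = \frac{1}{2} F_{\chi^2}^{-1}(\alpha; 2n) - b$

$s_{up} = \frac{1}{2} F_{\chi^2}^{-1}(1 - \beta; 2(n+1)) - b$

Here $F_{\chi^2}^{-1}$ is the quantile of the $\chi^2$ distribution.

For low fluctuation of n this can give negative result for $s_{up}$; i.e. confidence interval is empty.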
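A minimal sketch of the classical upper limit follows. Instead of the $\chi^2$ quantile it solves the defining condition P(n ≤ n_obs; s_up + b) = β directly by bisection, which needs only the standard library; the function names and the choice of bisection are assumptions made for this example.

```python
import math

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu."""
    if mu <= 0.0:
        return 1.0
    return sum(math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))
               for n in range(n_obs + 1))

def classical_upper_limit(n_obs, b, beta=0.05):
    """Solve P(n <= n_obs; s_up + b) = beta for s_up (may come out negative)."""
    lo, hi = 0.0, 10.0 + 10.0 * (n_obs + 1)   # bracket for mu = s_up + b
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid) > beta:    # cdf decreases as mu grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - b

print(f"n=0, b=0: s_up = {classical_upper_limit(0, 0.0):.3f}")
print(f"n=3, b=0: s_up = {classical_upper_limit(3, 0.0):.3f}")
print(f"n=0, b=5: s_up = {classical_upper_limit(0, 5.0):.3f}")
```

The last call shows the problem noted on the slide: with n = 0 observed and b = 5 the solution for the upper limit is negative, so no non-negative value of s is accepted.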



Likelihood ratio limits (Feldman-Cousins)

Define likelihood ratio for hypothesized parameter value s:

$l(s) = \frac{L(n; s, b)}{L(n; \hat{s}, b)}$

Here $\hat{s}$ is the ML estimator, note $0 \le l(s) \le 1$.

Critical region defined by low values of likelihood ratio.

Resulting intervals can be one- or two-sided (depending on n).

(Re)discovered for HEP by Feldman and Cousins, Phys. Rev. D 57 (1998) 3873.
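The likelihood-ratio ordering can be sketched for the Poisson counting experiment as below. This is a simple grid-scan illustration, not the authors' code: the grid step, the scan range `s_max`, the cutoff `n_max` and the function names are all assumptions, and the ML estimator is taken as max(0, n − b) so that s stays non-negative.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n for mean mu (mu = 0 handled explicitly)."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def fc_acceptance(s, b, cl, n_max):
    """Values of n accepted for hypothesized s, added in order of decreasing likelihood ratio."""
    ranked = []
    for n in range(n_max + 1):
        p = poisson_pmf(n, s + b)
        s_hat = max(0.0, n - b)                   # ML estimator restricted to s >= 0
        ratio = p / poisson_pmf(n, s_hat + b)
        ranked.append((ratio, n, p))
    ranked.sort(reverse=True)
    accepted, total = set(), 0.0
    for _, n, p in ranked:
        accepted.add(n)
        total += p
        if total >= cl:                           # stop once the region holds probability >= CL
            break
    return accepted

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, step=0.005, n_max=60):
    """Scan s on a grid and keep every value whose acceptance region contains n_obs."""
    n_steps = int(round(s_max / step))
    kept = [i * step for i in range(n_steps + 1)
            if n_obs in fc_acceptance(i * step, b, cl, n_max)]
    return min(kept), max(kept)

s_lo, s_hi = fc_interval(0, 0.0)
print(f"n=0, b=0, 90% CL: [{s_lo:.2f}, {s_hi:.2f}]")
```

For n = 0 the interval starts at s = 0, so it is effectively an upper limit; for larger n the same construction yields a two-sided interval without a separate decision by the analyst.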


Coverage probability of confidence intervals

Because of discreteness of Poisson data, probability for interval to include true value in general > confidence level (‘over-coverage’)
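Over-coverage can be checked numerically for the classical upper limit. The sketch below uses the fact that a true value s lies below the upper limit obtained from n exactly when P(n' ≤ n; s + b) ≥ β; the function names and the truncation `n_max` are assumptions of this example.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n for mean mu (mu = 0 handled explicitly)."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def upper_limit_coverage(s_true, b, beta=0.05, n_max=200):
    """Probability that the classical upper limit at CL = 1 - beta is >= s_true."""
    mu = s_true + b
    coverage, cdf = 0.0, 0.0
    for n in range(n_max + 1):
        p = poisson_pmf(n, mu)
        cdf += p
        # s_true <= s_up(n) holds exactly when P(n' <= n; mu) >= beta
        if cdf >= beta:
            coverage += p
    return coverage

for s in (1.0, 3.5, 6.0):
    print(f"s = {s}: coverage = {upper_limit_coverage(s, 0.0):.4f}")
```

The coverage jumps as s crosses the upper limits for n = 0, 1, 2, ... but never drops below 1 − β, which is the discreteness effect described on this slide.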


More on intervals from LR test (Feldman-Cousins)

Caveat with coverage: suppose we find n >> b. Usually one then quotes a measurement of s (an estimate with its uncertainty).

If, however, n isn’t large enough to claim discovery, one sets a limit on s.

FC pointed out that if this decision is made based on n, then the actual coverage probability of the interval can be less than the stated confidence level (‘flip-flopping’).

FC intervals remove this, providing a smooth transition from 1- to 2-sided intervals, depending on n.

But, suppose FC gives e.g. 0.1 < s < 5 at 90% CL, p-value of s=0 still substantial. Part of upper-limit ‘wasted’?


Properties of upper limits

Example: take b = 5.0, $1 - \beta$ = 0.95

[Figure panels: upper limit $s_{up}$ vs. n; mean upper limit vs. s]


Upper limit versus b


If n = 0 observed, should upper limit depend on b?

Classical: yes
Bayesian: no
FC: yes

Feldman & Cousins, PRD 57 (1998) 3873



Nuisance parameters and limits

In general we don’t know the background b perfectly.

Suppose we have a measurement of b, e.g., $b_{meas} \sim N(b, \sigma_b)$

So the data are really: n events and the value $b_{meas}$.

In principle the confidence interval recipe can be generalized to two measurements and two parameters.

Difficult and not usually attempted, but see e.g. talks by K. Cranmer at PHYSTAT03, G. Punzi at PHYSTAT05.

G. Punzi, PHYSTAT05


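Bayesian limits with uncertainty on b

Uncertainty on b goes into the prior, e.g.,

$\pi(s, b) = \pi_s(s)\, \pi_b(b)$, with $\pi_b(b)$ for example a Gaussian centred on $b_{meas}$ with standard deviation $\sigma_b$.

Put this into Bayes’ theorem,

$p(s, b \mid n) \propto P(n \mid s, b)\, \pi(s, b)$

Marginalize over b, then use $p(s \mid n) = \int p(s, b \mid n)\, db$ to find intervals for s with any desired probability content.

For $b = 0$, $\sigma_b = 0$, $\pi_s(s)$ = const. ($s > 0$), Bayesian upper limit coincides with classical one.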
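The marginalization can be sketched numerically on a grid. The example assumes a flat prior for s ≥ 0 and a Gaussian prior for b truncated to b ≥ 0 (or a fixed b when σ_b = 0); the grid sizes, the cutoff `s_max` and the function names are choices made for this illustration only.

```python
import math

def poisson_pmf(n, mu):
    """Poisson probability of n for mean mu (mu = 0 handled explicitly)."""
    if mu <= 0.0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))

def bayes_upper_limit(n_obs, b_meas, sigma_b, cl=0.95,
                      s_max=50.0, n_s=5000, n_b=101):
    """Upper limit on s from the marginal posterior p(s|n), flat prior on s >= 0."""
    if sigma_b > 0.0:
        # Assumed prior on b: Gaussian about b_meas, truncated to b >= 0
        b_lo = max(0.0, b_meas - 5.0 * sigma_b)
        b_hi = b_meas + 5.0 * sigma_b
        b_grid = [b_lo + (b_hi - b_lo) * j / (n_b - 1) for j in range(n_b)]
        weights = [math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2) for b in b_grid]
    else:
        b_grid, weights = [b_meas], [1.0]
    ds = s_max / n_s
    # Unnormalized marginal posterior on the s grid
    post = [sum(w * poisson_pmf(n_obs, i * ds + b) for b, w in zip(b_grid, weights))
            for i in range(n_s + 1)]
    # Cumulative integral by the trapezoid rule
    cum = [0.0]
    for i in range(1, n_s + 1):
        cum.append(cum[-1] + 0.5 * (post[i - 1] + post[i]) * ds)
    target = cl * cum[-1]
    for i, c in enumerate(cum):
        if c >= target:
            return i * ds
    return s_max

print(f"n=0, b=0: {bayes_upper_limit(0, 0.0, 0.0):.2f}")
print(f"n=0, b=3: {bayes_upper_limit(0, 3.0, 0.0):.2f}")
```

With n = 0 the two calls give the same limit, since the factor exp(−b) cancels in the normalized posterior, and for b = 0 the value agrees with the classical 95% limit of about 3.0.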


Cousins-Highland method

Regard b as random, characterized by pdf $\pi_b(b)$.

Makes sense in Bayesian approach, but in frequentist model b is constant (although unknown).

A measurement $b_{meas}$ is random but this is not the mean number of background events, rather, b is.

Compute anyway

$P(n; s) = \int P(n; s, b)\, \pi_b(b)\, db$

This would be the probability for n if Nature were to generate a new value of b upon repetition of the experiment with $\pi_b(b)$.

Now e.g. use this P(n;s) in the classical recipe for upper limit at CL = $1 - \beta$:

$\beta = P(n \le n_{obs}; s_{up}) = \sum_{n=0}^{n_{obs}} P(n; s_{up})$

Widely used method in HEP.
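A minimal sketch of this recipe is given below: the Poisson probability is averaged over an assumed pdf for b, and the averaged probability is then used in the classical upper-limit condition. The truncated Gaussian for b, the grid integration, the bisection solver and the function names are all assumptions of the example.

```python
import math

def poisson_cdf(n_obs, mu):
    """P(n <= n_obs) for a Poisson variable with mean mu."""
    if mu <= 0.0:
        return 1.0
    return sum(math.exp(n * math.log(mu) - mu - math.lgamma(n + 1))
               for n in range(n_obs + 1))

def smeared_cdf(n_obs, s, b_grid, weights):
    """P(n <= n_obs; s) after averaging over the assumed pdf of b."""
    total = sum(weights)
    return sum(w * poisson_cdf(n_obs, s + b)
               for b, w in zip(b_grid, weights)) / total

def cousins_highland_upper_limit(n_obs, b_meas, sigma_b, beta=0.05, n_b=101):
    """Upper limit at CL = 1 - beta using the b-averaged probability."""
    if sigma_b > 0.0:
        # Assumed pdf for b: Gaussian about b_meas, truncated to b >= 0
        b_lo = max(0.0, b_meas - 5.0 * sigma_b)
        b_hi = b_meas + 5.0 * sigma_b
        b_grid = [b_lo + (b_hi - b_lo) * j / (n_b - 1) for j in range(n_b)]
        weights = [math.exp(-0.5 * ((b - b_meas) / sigma_b) ** 2) for b in b_grid]
    else:
        b_grid, weights = [b_meas], [1.0]
    lo, hi = 0.0, 10.0 + 10.0 * (n_obs + 1)
    if smeared_cdf(n_obs, lo, b_grid, weights) <= beta:
        return None                      # no non-negative s satisfies the condition
    for _ in range(60):                  # bisection on s
        mid = 0.5 * (lo + hi)
        if smeared_cdf(n_obs, mid, b_grid, weights) > beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(cousins_highland_upper_limit(3, 1.0, 0.0))   # no uncertainty on b
print(cousins_highland_upper_limit(3, 1.0, 0.3))   # with uncertainty on b
```

Setting `sigma_b = 0` recovers the classical limit for known b, which gives a quick consistency check of the implementation.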


‘Integrated likelihoods’

Consider again signal s and background b, suppose we have uncertainty in b characterized by a prior pdf $\pi_b(b)$.

Define integrated likelihood as

$L'(s) = \int L(s, b)\, \pi_b(b)\, db$

also called modified profile likelihood, in any case not a real likelihood.

Now use this to construct likelihood-ratio test and invert to obtain confidence intervals.

Feldman-Cousins & Cousins-Highland (FHC2), see e.g. J. Conrad et al., Phys. Rev. D67 (2003) 012002 and Conrad/Tegenfeldt PHYSTAT05 talk.

Calculators available (Conrad, Tegenfeldt, Barlow).


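Digression: tangent plane method

Consider least-squares fit with parameter of interest $\theta_0$ and nuisance parameter $\theta_1$, i.e., minimize

$\chi^2(\theta_0, \theta_1) = \sum_i \frac{(y_i - f(x_i; \theta_0, \theta_1))^2}{\sigma_i^2}$

Standard deviations from tangent lines to contour $\chi^2 = \chi^2_{min} + 1$

Correlation between $\hat{\theta}_0$ and $\hat{\theta}_1$ causes errors to increase.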


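The ‘tangent plane’ method is a special case of using the profile likelihood:

$L'(\theta_0) = L(\theta_0, \hat{\hat{\theta}}_1)$

The profile likelihood is found by maximizing $L(\theta_0, \theta_1)$ over $\theta_1$ for each $\theta_0$; $\hat{\hat{\theta}}_1$ is the maximizing value.

Equivalently use $\chi^{2\prime}(\theta_0) = \chi^2(\theta_0, \hat{\hat{\theta}}_1)$

The interval obtained from $\chi^{2\prime}(\theta_0) = \chi^{2\prime}_{min} + 1$ is the same as what is obtained from the tangents to $\chi^2(\theta_0, \theta_1) = \chi^2_{min} + 1$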

Well known in HEP as the ‘MINOS’ method in MINUIT.

See e.g. talks by Reid, Cranmer, Rolke at PHYSTAT05.
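The equivalence between profiling and the tangent construction can be checked on a toy case. The sketch below assumes a purely quadratic $\chi^2$ with its minimum at the origin and hypothetical values for the two standard deviations and the correlation; it is not MINUIT's MINOS algorithm, only a numerical illustration of the same idea.

```python
# Hypothetical fit result: standard deviations and correlation of the two estimators.
SIGMA0, SIGMA1, RHO = 1.0, 2.0, 0.8

def chi2(t0, t1):
    """Quadratic chi^2 relative to its minimum, which is placed at (0, 0)."""
    norm = 1.0 / (1.0 - RHO ** 2)
    u, v = t0 / SIGMA0, t1 / SIGMA1
    return norm * (u * u - 2.0 * RHO * u * v + v * v)

def profile_chi2(t0, lo=-50.0, hi=50.0):
    """Minimize chi2 over the nuisance parameter t1 (ternary search, chi2 is convex in t1)."""
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if chi2(t0, m1) < chi2(t0, m2):
            hi = m2
        else:
            lo = m1
    return chi2(t0, 0.5 * (lo + hi))

def half_width(func):
    """Solve func(t0) = 1 for t0 > 0 by bisection (func increases with t0 here)."""
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if func(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

profiled = half_width(profile_chi2)              # nuisance parameter profiled out
fixed = half_width(lambda t0: chi2(t0, 0.0))     # nuisance parameter held at its best fit

print(f"profiled half-width: {profiled:.3f}")
print(f"fixed half-width:    {fixed:.3f}")
```

Profiling recovers the full standard deviation of the parameter of interest, while holding the nuisance parameter fixed gives a narrower interval, smaller by the factor sqrt(1 − ρ²). This is the sense in which correlation with a nuisance parameter increases the error.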


Interval from inverting profile LR test

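Suppose we have a measurement $b_{meas}$ of b.

Build the likelihood ratio test with profile likelihood:

$l(s) = \frac{L(s, \hat{\hat{b}}(s))}{L(\hat{s}, \hat{b})}$, where $\hat{\hat{b}}(s)$ maximizes L for the given s and $\hat{s}$, $\hat{b}$ are the ML estimators,

and use this to construct confidence intervals.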

Not widely used in HEP but recommended in e.g. Kendall & Stuart; see also PHYSTAT05 talks by Cranmer, Feldman, Cousins, Reid.


Wrapping up,

Frequentist methods have been most widely used but for many questions (particularly related to systematics), Bayesian methods are getting more notice.

Frequentist properties such as coverage probability of confidence intervals seen as very important (overly so?)

Bayesian methods remain problematic in cases where it is difficult to enumerate alternative hypotheses and assign meaningful prior probabilities.

Tools widely applied at LEP; some work needed to extend these to LHC analyses (ongoing).


Finally,


The LEP programme was dominated by limit setting:

Standard Model confirmed, No New Physics

The Tevatron discovered the top quark and $B_s$ mixing (both parts of the SM) and also set many limits (but NNP)

By ~2012 either we’ll have discovered something new and interesting beyond the Standard Model,

or,

we’ll still be setting limits and HEP should think seriously about a new approach!


Extra slides




A recent discovery: Bs oscillations


Recently the D0 experiment (Fermilab) announced the discovery of Bs mixing:

Moriond talk by Brendan Casey, also hep-ex/0603029

Produce a $B_q$ meson at time t=0; there is a time dependent probability for it to decay as an anti-$B_q$ (q = d or s):

$P(B_q \to \bar{B}_q; t) \propto e^{-t/\tau_q} \left[ 1 - \cos(\Delta m_q t) \right]$

$|V_{ts}| \gg |V_{td}|$ and so $B_s$ oscillates quickly compared to decay rate

Sought but not seen at LEP; early on predicted to be visible at Tevatron

Discovery quickly confirmed by the CDF experiment


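Confidence interval from likelihood function

In the large sample limit it can be shown for ML estimators:

$\hat{\theta} \sim N(\theta, V)$ (n-dimensional Gaussian, covariance V)

$Q(\hat{\theta}, \theta) = (\hat{\theta} - \theta)^T V^{-1} (\hat{\theta} - \theta) = Q_\gamma$ defines a hyper-ellipsoidal confidence region,

If $\hat{\theta} \sim N(\theta, V)$ then $Q(\hat{\theta}, \theta) \sim \chi^2_n$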


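Approximate confidence regions from $L(\theta)$

So the recipe to find the confidence region with CL = $1 - \gamma$ is:

$\ln L(\theta) = \ln L_{max} - \frac{Q_\gamma}{2}$, where $Q_\gamma = F_{\chi^2}^{-1}(1 - \gamma; n)$ is the quantile of the $\chi^2$ distribution for n degrees of freedom.

For finite samples, these are approximate confidence regions.

Coverage probability not guaranteed to be equal to $1 - \gamma$;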

no simple theorem to say by how far off it will be (use MC).


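Upper limit from test of hypothesized $\Delta m_s$

Base test on likelihood ratio $l$ (here $\theta = \Delta m_s$).

Observed value is $l_{obs}$, sampling distribution is $g(l; \theta)$ (from MC)

$\theta$ is excluded at CL = $1 - \gamma$ if the probability under $g(l; \theta)$ to find $l \le l_{obs}$ is less than $\gamma$.

D0 shows the distribution of $\ln l$ for $\Delta m_s = 25\ \mathrm{ps}^{-1}$

equivalent to $2.1\sigma$ effect

95% CL upper limit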


The significance of an observed signal

Suppose b = 0.5, and we observe $n_{obs}$ = 5.

$p = P(n \ge 5; b = 0.5, s = 0) = 1.7 \times 10^{-4}$

Often, however, b has some uncertainty

this can have significant impact on p-value, e.g. if b = 0.8, p-value = $1.4 \times 10^{-3}$
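The sensitivity of the p-value to b is easy to reproduce. The sketch below evaluates the Poisson tail probability for the two background values on this slide; the function name is illustrative.

```python
import math

def poisson_p_value(n_obs, b):
    """P(n >= n_obs) for n ~ Poisson(b) with b > 0, i.e. the p-value of s = 0."""
    below = sum(math.exp(n * math.log(b) - b - math.lgamma(n + 1))
                for n in range(n_obs))
    return 1.0 - below

p_nominal = poisson_p_value(5, 0.5)
p_shifted = poisson_p_value(5, 0.8)

print(f"b = 0.5: p = {p_nominal:.2e}")
print(f"b = 0.8: p = {p_shifted:.2e}")
```

A shift in b from 0.5 to 0.8 changes the p-value by nearly an order of magnitude, so the uncertainty in b has to enter the significance calculation.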


The significance of a peak

Suppose we measure a value x for each event and find:

Each bin (observed) is a Poisson r.v., means are given by dashed lines.

In the two bins with the peak, 11 entries found with b = 3.2. We are tempted to compute the p-value for the s = 0 hypothesis as:

$P(n \ge 11; b = 3.2, s = 0) = 5.0 \times 10^{-4}$


The significance of a peak (2)


But... did we know where to look for the peak?

→ give P(n ≥ 11) in any 2 adjacent bins

Is the observed width consistent with the expected x resolution?

→ take x window several times the expected resolution

How many bins × distributions have we looked at?

→ look at a thousand of them, you’ll find a $10^{-3}$ effect

Did we adjust the cuts to ‘enhance’ the peak?

→ freeze cuts, repeat analysis with new data

How about the bins to the sides of the peak... (too low!)

Should we publish????
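The first and third questions above can be made quantitative with a rough trials correction. The sketch below computes the local p-value for 11 entries with b = 3.2 and then the chance of at least one such excess in N places, under the simplifying assumption that the N places are independent and have the same local p-value; the value N = 20 is hypothetical.

```python
import math

def poisson_p_value(n_obs, b):
    """P(n >= n_obs) for n ~ Poisson(b) with b > 0, i.e. the p-value of s = 0."""
    below = sum(math.exp(n * math.log(b) - b - math.lgamma(n + 1))
                for n in range(n_obs))
    return 1.0 - below

def global_p_value(p_local, n_trials):
    """Probability of at least one excess this large in n_trials independent places."""
    return 1.0 - (1.0 - p_local) ** n_trials

p_local = poisson_p_value(11, 3.2)
p_global = global_p_value(p_local, 20)     # 20 places is a hypothetical choice

print(f"local p-value:  {p_local:.2e}")
print(f"global p-value: {p_global:.2e}")
```

For small p-values the corrected value is close to N times the local one, so the significance of a peak found without prior knowledge of its position is correspondingly weaker.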


Statistical vs. systematic errors


Statistical errors:

How much would the result fluctuate upon repetition of the measurement?

Implies some set of assumptions to define probability of outcome of the measurement.

Systematic errors:

What is the uncertainty in my result due to uncertainty in my assumptions, e.g.,

model (theoretical) uncertainty;

modeling of measurement apparatus.

The sources of error do not vary upon repetition of the measurement. Often result from uncertain value of, e.g., calibration constants, efficiencies, etc.


Systematic errors and nuisance parameters


Response of measurement apparatus is never modeled perfectly:

[Figure: y (measured value) vs. x (true value), showing model and truth curves]

Model can be made to approximate better the truth by including more free parameters.

systematic uncertainty ↔ nuisance parameters
