Advanced Analysis Techniques in HEP

ACAT2000, Oct. 16-20, 2000, Fermilab, IL. Pushpa Bhat, Fermilab. Advanced Analysis Techniques in HEP.


Page 1: Advanced Analysis Techniques in HEP

ACAT2000 Oct. 16-20, 2000 Pushpa Bhat

Advanced Analysis Techniques in HEP

Pushpa Bhat, Fermilab

ACAT2000, Fermilab, IL, October 2000

A reasonable man adapts himself to the world. An unreasonable man persists in trying to adapt the world to himself. So, all progress depends on the unreasonable one.

- Bernard Shaw

Page 2: Advanced Analysis Techniques in HEP

Outline

Introduction

Intelligent Detectors
  Moving intelligence closer to action

Optimal Analysis Methods

The Neural Network Revolution

New Searches & Precision Measurements
  Discovery reach for the Higgs Boson
  Measuring Top quark mass, Higgs mass

Sophisticated Approaches

Probabilistic Approach to Data Analysis

Summary

Page 3: Advanced Analysis Techniques in HEP

[Figure: flow diagram from "World before Experiment" to "World After Experiment". Stage labels: Data Collection, Data Transformation, Feature Extraction, Global Decision, Data Interpretation. Side labels: Data Organization, Reduction, Analysis; Express Analysis.]

Page 4: Advanced Analysis Techniques in HEP

Intelligent Detectors

Data analysis starts when a high energy event occurs.
Transform electronic data into useful “physics” information in real time.
Move intelligence closer to action!

Algorithm-specific hardware
  Neural Networks in Silicon

Configurable hardware
  FPGAs, DSPs: implement “smart” algorithms in hardware

Innovative data management on-line + “smart” algorithms in hardware
  Data in RAM disk & AI algorithms in FPGAs

Expert Systems for Control & Monitoring

Page 5: Advanced Analysis Techniques in HEP

Data Analysis Tasks

Particle Identification
  e-ID, -ID, b-ID, e/, q/g

Signal/Background Event Classification
  Signals of new physics are rare and small
  (Finding a “jewel” in a haystack)

Parameter Estimation
  t mass, H mass, track parameters, for example

Function Approximation
  Correction functions, tag rates, fake rates

Data Exploration
  Knowledge Discovery via data-mining
  Data-driven extraction of information, latent structure analysis

Page 6: Advanced Analysis Techniques in HEP

Optimal Analysis Methods

The measurements being multivariate, the optimal methods of analysis are necessarily multivariate.

Discriminant Analysis: partition the multidimensional variable space, identify boundaries
Cluster Analysis: assign objects to groups based on similarity

Examples
  Fisher linear discriminant, Gaussian classifier
  Kernel-based methods, K-nearest neighbor (clustering) methods
  Adaptive/AI methods
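As an illustration of the first example named on this slide, here is a minimal sketch of a Fisher linear discriminant in Python with NumPy. The two Gaussian samples, their means and their covariance are invented for the illustration and are not taken from the talk; `fisher_weights` and `separation` are hypothetical helper names.

```python
import numpy as np

def fisher_weights(signal, background):
    """Fisher direction w = S_w^{-1} (mean_signal - mean_background)."""
    delta = signal.mean(axis=0) - background.mean(axis=0)
    # Within-class scatter: sum of the two sample covariance matrices.
    s_w = np.cov(signal, rowvar=False) + np.cov(background, rowvar=False)
    return np.linalg.solve(s_w, delta)

def separation(w, signal, background):
    """Fisher criterion: squared distance of projected means over projected variance."""
    d_s, d_b = signal @ w, background @ w
    return (d_s.mean() - d_b.mean()) ** 2 / (d_s.var(ddof=1) + d_b.var(ddof=1))

# Assumed toy data: two strongly correlated variables whose class means
# differ along the direction in which the spread is small.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
signal = rng.multivariate_normal([0.5, -0.5], cov, size=2000)
background = rng.multivariate_normal([-0.5, 0.5], cov, size=2000)

w = fisher_weights(signal, background)
print("Fisher weights:", w)
print("separation using x1 only:", separation(np.array([1.0, 0.0]), signal, background))
print("separation using D(x1,x2):", separation(w, signal, background))
```

The linear combination D(x1,x2) = w·x cannot separate the samples worse than either variable alone, because the Fisher direction maximizes this criterion by construction.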

Page 7: Advanced Analysis Techniques in HEP

Why Multivariate Methods?

Because they are optimal!

[Figure: two panels with axes x1 and x2]

D(x1,x2) = 2.014 x1 + 1.592 x2

Page 8: Advanced Analysis Techniques in HEP

Also, they need to have optimal flexibility/complexity

$$ h(x) = 0.5 + 0.4 \sin(2\pi x) $$

Mth Order Polynomial Fit

[Figure: three panels with axes x1 and x2. M=1: Simple; M=3: Flexible; M=10: Highly flexible]
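The effect of model flexibility can be reproduced with a short numerical sketch. It fits polynomials of order M = 1, 3 and 10 to noisy samples of h(x) = 0.5 + 0.4 sin(2πx). The number of points (11), the noise level (0.05) and the random seed are assumptions made for the illustration, not values from the slide.

```python
import numpy as np
from numpy.polynomial import Polynomial

def h(x):
    """Underlying function quoted on the slide."""
    return 0.5 + 0.4 * np.sin(2 * np.pi * x)

def rms(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(1)
x_train = np.linspace(0.0, 1.0, 11)                                 # assumed sample size
y_train = h(x_train) + rng.normal(0.0, 0.05, x_train.size)          # assumed noise level
x_test = np.linspace(0.0, 1.0, 101)

train_rms, test_rms = {}, {}
for order in (1, 3, 10):
    fit = Polynomial.fit(x_train, y_train, deg=order)               # least-squares fit
    train_rms[order] = rms(fit(x_train), y_train)                   # error on the fitted points
    test_rms[order] = rms(fit(x_test), h(x_test))                   # error against the true h(x)
    print(f"M={order:2d}  train RMS={train_rms[order]:.4f}  test RMS={test_rms[order]:.4f}")
```

The training error can only go down as M grows, and with 11 points the M = 10 polynomial passes through every one of them. Whether it also describes h(x) between the points is what the test error probes.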

Page 9: Advanced Analysis Techniques in HEP

The Golden Rule

Keep it simple
As simple as possible
Not any simpler

- Einstein

Page 10: Advanced Analysis Techniques in HEP

Optimal Event Selection

$$ r(x) = \frac{p(x|s)\,p(s)}{p(x|b)\,p(b)} = \frac{p(s|x)}{p(b|x)} $$

r(x) = constant defines decision boundaries that minimize the probability of misclassification.

So, the problem mathematically reduces to that of calculating r(x), the Bayes Discriminant Function, or the probability densities.

Posterior probability

$$ p(s|x) = \frac{p(x|s)\,p(s)}{p(x|s)\,p(s) + p(x|b)\,p(b)} = \frac{r}{1+r} $$
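A small worked example may help connect the Bayes discriminant r(x) to the posterior probability r/(1+r). The sketch below assumes two unit-width Gaussian densities centered at +1 (signal) and -1 (background); these densities and the function names are hypothetical and chosen only because the ratio then has the closed form r(x) = exp(2x) for equal priors.

```python
import math

def gaussian_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bayes_discriminant(x, prior_s=0.5, prior_b=0.5):
    """r(x) = p(x|s) p(s) / (p(x|b) p(b)) for the assumed toy densities."""
    p_x_s = gaussian_pdf(x, mean=1.0, sigma=1.0)    # assumed signal density
    p_x_b = gaussian_pdf(x, mean=-1.0, sigma=1.0)   # assumed background density
    return (p_x_s * prior_s) / (p_x_b * prior_b)

def posterior_signal(x, prior_s=0.5, prior_b=0.5):
    """p(s|x) = r / (1 + r)."""
    r = bayes_discriminant(x, prior_s, prior_b)
    return r / (1.0 + r)

for x in (-2.0, 0.0, 2.0):
    print(x, bayes_discriminant(x), posterior_signal(x))
```

With equal priors the boundary r(x) = 1 sits at x = 0, halfway between the two means. Changing the priors moves the boundary without changing the shape of r(x).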

Page 11: Advanced Analysis Techniques in HEP


Page 12: Advanced Analysis Techniques in HEP


Page 13: Advanced Analysis Techniques in HEP

Probability Density Estimators

Histogramming:

The basic problem of non-parametric density estimation is very simple! Histogram the data in M bins in each of the d feature variables.

M^d bins: the Curse of Dimensionality

In high dimensions, we would either require a huge number of data points, or most of the bins would be empty, leading to an estimated density of zero.

But the variables are generally correlated and hence tend to be restricted to a sub-space: the Intrinsic Dimensionality
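The growth of the bin count with dimension is easy to check numerically. The sketch below fills M^d bins with a fixed number of uniformly distributed points and reports the fraction of bins left empty. The sample size (1000), M = 10 and the uniform distribution are assumptions for the illustration; `empty_bin_fraction` is a hypothetical helper.

```python
import numpy as np

def empty_bin_fraction(n_points, bins_per_axis, n_dims, seed=0):
    """Fraction of the M^d histogram bins that receive no data point."""
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, n_dims))               # uniform in [0, 1)^d
    idx = np.floor(data * bins_per_axis).astype(int)    # bin index along each axis
    occupied = len(np.unique(idx, axis=0))              # number of distinct filled bins
    total = bins_per_axis ** n_dims
    return 1.0 - occupied / total

for d in (1, 2, 3, 6):
    print(f"d={d}: {10 ** d:>8d} bins, empty fraction = {empty_bin_fraction(1000, 10, d):.4f}")
```

With 1000 points and 10 bins per axis, at most 1000 of the 10^6 bins in six dimensions can be filled, so at least 99.9% of the estimated density is zero.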

Page 14: Advanced Analysis Techniques in HEP

Kernel-Based Methods

Akin to histogramming, but adopts importance sampling.

Place in d-dimensional space a hypercube of side h centered on each data point x:

$$ \tilde{p}(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{h^d}\, H\!\left(\frac{x - x_n}{h}\right) $$

N = number of data points
H(u) = 1 if x_n is in the hypercube, = 0 otherwise
h = smoothing parameter

The estimate will have discontinuities.

These can be smoothed out using different forms for the kernel function H(u). A common choice is a multivariate Gaussian kernel:

$$ \tilde{p}(x) = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{(2\pi h^2)^{d/2}} \exp\!\left(-\frac{|x - x_n|^2}{2h^2}\right) $$
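Both kernel estimates on this slide translate almost line by line into NumPy. The following is a minimal sketch, assuming the data are stored as an (N, d) array; the function names and the toy Gaussian sample are not from the talk.

```python
import numpy as np

def hypercube_estimate(x, data, h):
    """(1/N) sum_n h^-d H((x - x_n)/h), with H(u) = 1 when every |u_j| < 1/2."""
    n, d = data.shape
    u = (np.atleast_1d(x) - data) / h
    inside = np.all(np.abs(u) < 0.5, axis=1)    # which x_n fall in the cube around x
    return inside.sum() / (n * h ** d)

def gaussian_estimate(x, data, h):
    """(1/N) sum_n (2 pi h^2)^(-d/2) exp(-|x - x_n|^2 / (2 h^2))."""
    n, d = data.shape
    sq_dist = np.sum((np.atleast_1d(x) - data) ** 2, axis=1)
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)
    return np.sum(np.exp(-sq_dist / (2.0 * h ** 2))) / (n * norm)

# Assumed toy sample: 2000 points from a one-dimensional unit Gaussian.
rng = np.random.default_rng(3)
sample = rng.normal(0.0, 1.0, size=(2000, 1))
for x in (0.0, 1.0, 2.0):
    print(x, hypercube_estimate(x, sample, h=0.3), gaussian_estimate(x, sample, h=0.3))
```

The smoothing parameter h plays the role of the bin width: a small h gives a spiky estimate, a large h washes out structure.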

Page 15: Advanced Analysis Techniques in HEP

K nearest-neighbor Method

Place a hyper-sphere centered at each data point x and allow the radius to grow to a volume V until it contains K data points. Then, the density at x is

$$ p(x) = \frac{K}{N V} $$

N = number of data points

If our data set contains N_k points in class C_k and N points in total, then

$$ p(x|C_k) = \frac{K_k}{N_k V} $$

K_k = number of points in volume V for class C_k

With the priors taken as p(C_k) = N_k/N, the posterior probability is

$$ p(C_k|x) = \frac{p(x|C_k)\,p(C_k)}{p(x)} = \frac{K_k}{K} $$
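The two K-nearest-neighbor formulas, the density K/(NV) and the class posterior K_k/K, can be sketched as follows. The helper names, the Euclidean metric and the five-point toy data set are assumptions for the illustration.

```python
import math
import numpy as np

def knn_density(x, data, k):
    """p(x) = K / (N V), with V the volume of the sphere reaching the K-th neighbour."""
    n, d = data.shape
    dist = np.sort(np.linalg.norm(data - np.atleast_1d(x), axis=1))
    radius = dist[k - 1]
    volume = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0) * radius ** d
    return k / (n * volume)

def knn_posteriors(x, data, labels, k):
    """p(C_k | x) = K_k / K, from the class labels of the K nearest points."""
    dist = np.linalg.norm(data - np.atleast_1d(x), axis=1)
    nearest = np.argsort(dist)[:k]
    post = {int(c): 0.0 for c in np.unique(labels)}
    for c in labels[nearest]:
        post[int(c)] += 1.0 / k
    return post

# Assumed toy data: label 1 = signal, label 0 = background.
data = np.array([[0.0], [0.1], [0.2], [1.0], [1.1]])
labels = np.array([1, 1, 0, 0, 0])
print(knn_density(0.05, data, k=3))
print(knn_posteriors(0.05, data, labels, k=3))
```

Unlike the fixed-h kernel estimate, the volume here adapts to the local density of points, while K takes over the role of the smoothing parameter.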

Page 16: Advanced Analysis Techniques in HEP

Discriminant Approximation with Neural Networks

The output of a feed-forward neural network can approximate the Bayesian posterior probability p(s|x,y) directly, without estimating class-conditional probabilities.

[Figure: network with inputs x and y and output n(x,y,ω)]

$$ n(x,y,\omega) \approx p(s|x,y) = \frac{r}{1+r} $$

Page 17: Advanced Analysis Techniques in HEP

Calculating the Discriminant

Consider the sum

$$ E(x,y,\omega) = \sum_i \left[ n(x_i,y_i,\omega) - d_i \right]^2 $$

where d_i = 1 for signal, d_i = 0 for background, and ω = vector of parameters.

Then

$$ \frac{dE(x,y,\omega)}{d\omega} = 0 \;\Rightarrow\; n(x,y,\omega) = p(s|x,y) = \frac{r}{1+r} $$

in the limit of large data samples and provided that the function n(x,y,ω) is flexible enough.
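The statement that minimizing the squared error with targets 1 and 0 yields the posterior probability can be checked without training a network. In the sketch below a piecewise-constant function stands in for the flexible function n; this substitution is an assumption made to keep the minimization exact, since the minimum in each bin is simply the mean of the targets there, that is, the local signal fraction. The toy Gaussian samples are also assumed.

```python
import numpy as np

def fit_piecewise_constant(x, d, edges):
    """Minimise sum_i (n(x_i) - d_i)^2 for n constant in each bin: the bin mean of d."""
    which = np.digitize(x, edges) - 1
    values = np.zeros(len(edges) - 1)
    for b in range(len(values)):
        in_bin = which == b                      # points outside the edges match no bin
        if in_bin.any():
            values[b] = d[in_bin].mean()
    return values

def squared_error(values, x, d, edges):
    """E = sum_i (n(x_i) - d_i)^2; every x_i must lie inside the bin edges."""
    which = np.digitize(x, edges) - 1
    return float(np.sum((values[which] - d) ** 2))

# Assumed toy sample: signal ~ N(+1, 1) with d = 1, background ~ N(-1, 1) with d = 0.
rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(1.0, 1.0, 20000), rng.normal(-1.0, 1.0, 20000)])
d = np.concatenate([np.ones(20000), np.zeros(20000)])
edges = np.linspace(-3.0, 3.0, 13)
fitted = fit_piecewise_constant(x, d, edges)
centers = 0.5 * (edges[:-1] + edges[1:])
exact = 1.0 / (1.0 + np.exp(-2.0 * centers))     # r/(1+r) with r = exp(2x), equal priors
for c, f, e in zip(centers, fitted, exact):
    print(f"x={c:+.2f}  fitted={f:.3f}  exact posterior={e:.3f}")
```

A neural network replaces the fixed bins with smooth, trainable functions, but the quantity it converges to under the same conditions is the same.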

Page 18: Advanced Analysis Techniques in HEP

Neural Networks

A NN estimates a mapping function without requiring a mathematical description of how the output formally depends on the input.

The “hidden” transformation functions, g, adapt themselves to the data as part of the training process. The number of such functions needs to grow only as the complexity of the problem grows.

[Figure: feed-forward network with inputs x1, x2, x3, x4 (index i), hidden nodes (index j) and output D_NN (index k)]

$$ D_{NN}(\{X\};\omega) = g\Big(\sum_j \omega_{jk}\, g\Big(\sum_i \omega_{ij}\, x_i\Big)\Big), \qquad g(a) = \frac{1}{1+e^{-a}} $$
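The forward pass of such a network is only a few lines of code. This is a sketch of a feed-forward network with one hidden layer of sigmoid functions g and a sigmoid output, assuming four inputs, three hidden nodes and no threshold terms; the layer sizes, the random weights and the names `network_output`, `w_ij` and `w_jk` are illustrative choices, not the configuration used in the talk.

```python
import numpy as np

def g(a):
    """Sigmoid transformation function g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def network_output(x, w_ij, w_jk):
    """Output g(sum_j w_jk g(sum_i w_ij x_i)) of a one-hidden-layer network."""
    hidden = g(x @ w_ij)       # hidden-node outputs, shape (n_hidden,)
    return g(hidden @ w_jk)    # single output between 0 and 1

rng = np.random.default_rng(5)
w_ij = rng.normal(size=(4, 3))     # input-to-hidden weights (assumed sizes)
w_jk = rng.normal(size=3)          # hidden-to-output weights
x = np.array([0.2, -1.0, 0.5, 1.5])
print(network_output(x, w_ij, w_jk))
```

Training adjusts `w_ij` and `w_jk` to minimize the squared-error sum, which is how the hidden functions adapt themselves to the data.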

Page 19: Advanced Analysis Techniques in HEP

Measuring the Top Quark Mass

The Discriminants

[Figure: distributions of the discriminant variables; shaded = top]

Page 20: Advanced Analysis Techniques in HEP

Measuring the Top Quark Mass

DØ Lepton+jets

[Figure: panels labeled "Background-rich" and "Signal-rich"]

m_t = 173.3 ± 5.6 (stat.) ± 6.2 (syst.) GeV/c^2

Page 21: Advanced Analysis Techniques in HEP

Strategy for Discovering the Higgs Boson at the Tevatron

P.C. Bhat, R. Gilmartin, H. Prosper, PRD 62 (2000), hep-ph/0001152

Page 22: Advanced Analysis Techniques in HEP

Hints from the Analysis of Precision Data

LEP Electroweak Group, http://www.cern.ch/LEPEWWG/plots/summer99

$$ M_H = 107^{+67}_{-45}\ \mathrm{GeV}/c^2 $$

M_H < 225 GeV/c^2 at 95% C.L.

Page 23: Advanced Analysis Techniques in HEP

Event Simulation

Signal Processes

$$ p\bar{p} \to WH \to \ell\nu\, b\bar{b}; \qquad ZH \to \ell^+\ell^-\, b\bar{b},\ \nu\bar{\nu}\, b\bar{b} $$

Backgrounds

$$ p\bar{p} \to Wb\bar{b},\ Zb\bar{b},\ ZZ,\ WZ,\ t\bar{t},\ tbq,\ tb $$

Event generation
  WH, ZH, ZZ and Top with PYTHIA
  Wbb, Zbb with CompHEP, fragmentation with PYTHIA

Detector modeling
  SHW (http://www.physics.rutgers.edu/~jconway/soft/shw/shw.html)
  Trigger, Tracking, Jet-finding
  b-tagging (double b-tag efficiency ~ 45%)
  Di-jet mass resolution ~ 14% (scaled down to 10% for Run II Higgs studies)

Page 24: Advanced Analysis Techniques in HEP

WH Results from NN Analysis

M_H = 100 GeV/c^2

[Figure: panels labeled "WH" and "WH vs Wbb"]

Page 25: Advanced Analysis Techniques in HEP

WH (110 GeV/c^2) NN Distributions

Page 26: Advanced Analysis Techniques in HEP

Results, Standard vs. NN

A good chance of discovery up to M_H = 130 GeV/c^2 with 20-30 fb^-1

Page 27: Advanced Analysis Techniques in HEP

Improving the Higgs Mass Resolution

[Figure: mass-resolution panels labeled 13.8% and 12.2%; 13.1% and 11.3%; summary labels 13% and 11%]

Use m_jj and H_T (= Σ E_T^jets) to train NNs to predict the Higgs boson mass

Page 28: Advanced Analysis Techniques in HEP

Newer Approaches

Ensembles of Networks

Committees of Networks
  Performance can be better than the best single network

Stacks of Networks
  Control both bias and variance

Mixture of Experts
  Decompose complex problems
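The benefit of a committee can be illustrated with a toy calculation. In the sketch below each "network" is mimicked by the true function plus its own independent noise, which is a strong simplifying assumption: real networks trained on the same data have correlated errors, and the gain is then smaller. The target function, noise level and committee size are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 200)
truth = np.sin(2 * np.pi * x)                 # hypothetical target function
n_members = 10

# Each committee member: truth plus independent Gaussian noise (assumed error model).
members = truth + rng.normal(0.0, 0.3, size=(n_members, x.size))

member_mse = np.mean((members - truth) ** 2, axis=1)   # error of each member
committee = members.mean(axis=0)                       # simple average of the outputs
committee_mse = np.mean((committee - truth) ** 2)

print("mean member MSE:", member_mse.mean())
print("best member MSE:", member_mse.min())
print("committee MSE:  ", committee_mse)
```

The committee error can never exceed the average member error, whatever the correlations. Beating the best single member, as the slide notes, is possible but depends on how independent the members' errors are.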

Page 29: Advanced Analysis Techniques in HEP


Exploring Models: Bayesian Approach

Provides probabilistic information on each parameter of a model (SUSY, for example) via marginalization over other parameters

The Bayesian method enables straightforward and meaningful model comparisons. It also allows treatment of all uncertainties in a consistent manner.

Mathematically linked to adaptive algorithms such as Neural Networks (NN)

Hybrid methods involving NN for probability density estimation and Bayesian treatment can be very powerful

Page 30: Advanced Analysis Techniques in HEP

Summary

We are building very sophisticated equipment and will record unprecedented amounts of data in the coming decade.

Use of advanced “optimal” analysis techniques will be crucial to achieve the physics goals.

Multivariate methods, particularly Neural Network techniques, have already made an impact on discoveries and precision measurements and will be the methods of choice in future analyses.

Hybrid methods combining “intelligent” algorithms and a probabilistic approach will be the wave of the future.

Page 31: Advanced Analysis Techniques in HEP

Optimal Event Selection

$$ r(x,y) = \frac{p(x,y|s)\,p(s)}{p(x,y|b)\,p(b)} = \frac{p(s|x,y)}{p(b|x,y)} $$

r(x,y) = constant defines an optimal decision boundary

[Figure: feature space with axes x and y, legend entries S and B, and conventional cuts at x0 and y0]

Page 32: Advanced Analysis Techniques in HEP

Probabilistic Approach to Data Analysis

Bayesian Methods

(The Wave of the future)

Page 33: Advanced Analysis Techniques in HEP

Bayesian Analysis

M  model
A  uninteresting parameters
p  interesting parameters
d  data

$$ p(A,p,M|d) = \frac{L(d|A,p,M)\, Q(A,M)\, q(p,M)}{\sum_M \int_p \int_A L(d|A,p,M)\, Q(A,M)\, q(p,M)} $$

Posterior = Likelihood × Prior, normalized

$$ p(p|d) = \sum_M \int_A p(A,p,M|d) $$

$$ p(M|d) = \int_p \int_A p(A,p,M|d) $$

Bayesian Analysis of Multi-source Data, P.C. Bhat, H. Prosper, S. Snyder, Phys. Lett. B 407 (1997) 73
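Marginalization over uninteresting parameters can be carried out numerically on a grid. The sketch below assumes a single model, a counting experiment with a Poisson likelihood whose mean is p + A, a flat prior q(p) and a Gaussian prior Q(A); the observed count, the prior parameters and the grids are all invented for the illustration, and the sum over models M is omitted.

```python
import numpy as np
from math import lgamma

d_obs = 10                                   # hypothetical observed count
p_grid = np.linspace(0.0, 30.0, 301)         # interesting parameter p (signal mean)
a_grid = np.linspace(0.5, 5.5, 101)          # uninteresting parameter A (background mean)
P, A = np.meshgrid(p_grid, a_grid, indexing="ij")

# Likelihood L(d|A,p): Poisson probability of d_obs for mean p + A.
mean = P + A
likelihood = np.exp(d_obs * np.log(mean) - mean - lgamma(d_obs + 1))

prior_a = np.exp(-0.5 * ((A - 3.0) / 0.5) ** 2)   # Q(A): assumed Gaussian, mean 3, width 0.5
prior_p = np.ones_like(P)                         # q(p): assumed flat

joint = likelihood * prior_a * prior_p
joint /= joint.sum()                         # normalise the posterior over the grid

post_p = joint.sum(axis=1)                   # marginalise over A
p_mode = p_grid[np.argmax(post_p)]
print("posterior mode of p:", p_mode)
```

The marginal posterior `post_p` carries the uncertainty on A into the result for p, which is the consistent treatment of uncertainties that the Bayesian approach offers.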

Page 34: Advanced Analysis Techniques in HEP

Higgs Mass Fits

S = 80 WH events; assume the background distribution is described by Wbb.

Results
  S/B = 1/10: M_fit = 114 ± 11 GeV/c^2
  S/B = 1/5: M_fit = 114 ± 7 GeV/c^2