
Nuclear Instruments and Methods in Physics Research A 502 (2003) 811–814

Artificial intelligence—applications in high energy and nuclear physics

U. Müller*

Fachbereich Physik, Bergische Universität, Gesamthochschule Wuppertal, Gaußstr. 20, D-42097 Wuppertal, Germany

Abstract

In the parallel sessions at ACAT2002 different artificial intelligence applications in high energy and nuclear physics were presented. I will briefly summarize these presentations. Further details can be found in the relevant section of these proceedings.

© 2003 Elsevier Science B.V. All rights reserved.

PACS: 07.05.Mh; 01.30.Cc

Keywords: Artificial intelligence; Neural networks; High energy physics; Nuclear physics; Summary talk

1. Introduction

Many excellent talks were given in the artificial intelligence (AI) parallel sessions. Most of them dealt with neural network applications, but less well-known techniques were also described. The following overview shows the different topics:

* wavelet analysis for smoothing spectra in small-angle neutron scattering,

* support vector machines for classification,

* neural networks

  * for many different goals: event selection, energy reconstruction, tracking, particle identification, trigger

  * from many different experiments: DELPHI, NA48, CDF, DØ, ALICE, CMS, XEUS.

2. Wavelet analysis

Litvinenko [1] described a wavelet analysis in small-angle neutron scattering. There the time-of-flight data are very noisy, so smoothing is necessary. A wavelet analysis was developed to improve on traditional filter techniques such as a smoothing window or a median filter.

A wavelet transformation is a projection onto a basic wavelet with a dilation factor and a shift. It was used both in a continuous transformation of a discretized signal and in a discrete transformation using lifting and data filtering. With these new techniques the smoothing was improved impressively.
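To make the idea of a discrete transformation with lifting and filtering concrete, here is a minimal sketch in Python. It assumes the simplest possible wavelet (Haar) and a hard threshold on the detail coefficients; the wavelet, the threshold rule and all function names are illustrative assumptions and are not taken from the analysis in Ref. [1].

```python
def haar_forward(signal):
    """One Haar level written as lifting steps: split, predict, update."""
    even = signal[0::2]
    odd = signal[1::2]
    # Predict each odd sample from its even neighbour; keep the residual.
    detail = [o - e for e, o in zip(even, odd)]
    # Update the even samples so that they become pairwise means.
    approx = [e + d / 2.0 for e, d in zip(even, detail)]
    return approx, detail


def haar_inverse(approx, detail):
    """Undo the lifting steps in reverse order and interleave the samples."""
    even = [a - d / 2.0 for a, d in zip(approx, detail)]
    odd = [e + d for e, d in zip(even, detail)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def smooth(signal, levels, threshold):
    """Decompose, zero all detail coefficients with |d| <= threshold, rebuild.

    The hard threshold is an assumed filtering rule chosen for illustration.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)
    for d in reversed(details):
        filtered = [x if abs(x) > threshold else 0.0 for x in d]
        approx = haar_inverse(approx, filtered)
    return approx
```

With a threshold of zero the input is reconstructed unchanged; with a very large threshold every block of `2**levels` samples is replaced by its mean. A useful threshold for noisy spectra lies between these extremes and would have to be tuned on the data.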

*Tel.: +49-202-439-3523; fax: +49-202-439-2811.

E-mail address: [email protected] (U. Müller).

0168-9002/03/$ - see front matter © 2003 Elsevier Science B.V. All rights reserved.

doi:10.1016/S0168-9002(03)00607-7

3. Support vector machines

3.1. General ideas

Support vector machines (SVMs) are based on statistical learning theory and use an optimized generalization. Training data are mapped into a high-dimensional feature space:

$$x \to \phi(x), \qquad \langle \phi(x), \phi(x') \rangle =: k(x, x'),$$

$$f(x) = \langle w, \phi(x) \rangle + b = \sum_{i=1}^{m} (\alpha_i - \alpha_i^{*})\, k(x, x') + b.$$

In this feature space a decision boundary is determined by constructing the optimal separating hyperplane. The use of a kernel function (e.g. $k(x, x') = \exp(-\|x - x'\|^2 / 2\sigma^2)$) avoids computations in the feature space. The optimization problem is well defined and quadratic, so the minimum found is the global one.
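The kernel expansion can be sketched in a few lines of Python. This is only an illustration of how a trained SVM is evaluated, under the usual assumption that the second kernel argument runs over the training points $x_i$. The coefficients (standing for $\alpha_i - \alpha_i^{*}$), the offset `b` and the width `sigma` are hypothetical inputs that would come from solving the quadratic optimization problem, which is not shown here.

```python
import math


def gaussian_kernel(x, x_prime, sigma):
    """k(x, x') = exp(-||x - x'||^2 / (2 sigma^2)) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_function(x, training_points, coefficients, b, sigma):
    """f(x) = sum_i c_i k(x_i, x) + b, with c_i standing for (alpha_i - alpha_i^*).

    The feature map phi is never evaluated; only kernel values are needed.
    """
    return sum(
        c * gaussian_kernel(x_i, x, sigma)
        for x_i, c in zip(training_points, coefficients)
    ) + b


# Toy example with two hypothetical training points of opposite class.
points = [(0.0, 0.0), (2.0, 0.0)]
coeffs = [1.0, -1.0]
print(decision_function((0.2, 0.0), points, coeffs, b=0.0, sigma=1.0) > 0)
```

The sign of `decision_function` gives the class; a point halfway between the two toy training points sits exactly on the decision boundary.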

3.2. Applications

Two physics analyses were presented in the parallel sessions.

Vaiciulis [2] from CDF showed a selection of the dilepton channel in top quark production, $t\bar{t} \to b\bar{b}\,\ell\nu\,\ell\nu$. He stressed the very fast training of his SVM. Even without any optimization, the performance of his preliminary analysis is already as good as the best combination of linear cuts.

Naumann [3] from DØ presented a Higgs search analysis. He reached results similar to those of a neural net analysis. In the high-purity region the SVM is even slightly better.

4. Neural networks

Many different neural network analyses were shown. In the following subsections they are grouped according to their purpose.

4.1. Event selections

The first talk about event selection was my presentation on the separation of hadronic W pairs, $e^+e^- \to W^+W^- \to q\bar{q}q\bar{q}$, from $e^+e^- \to Z^0/\gamma^{*}$ or $Z^0Z^0 \to 4$ jets in DELPHI [4]. A standard feed-forward neural network (FFNN) trained with the backpropagation algorithm (BPA) and consisting of 13 inputs, seven hidden nodes in one layer and one output node clearly outperforms the old standard linear cut analysis. The selection quality, given by the product of efficiency and purity, increases from 63.5% to 69.1%. Studies to fully determine the systematic error of the analysis were described.

Dudko [5] presented a Higgs search analysis with an optimized neural network at the Tevatron. Several studies have shown that, thanks to the higher selection power of neural networks, less integrated luminosity is required for a successful Higgs search. Dudko chose the input variables for his standard neural network using an optimization procedure based on the analysis of the underlying Feynman diagrams of the signal and background processes. With the new observables he found a better performance than with his old set of variables.
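The selection quality used in the DELPHI analysis, efficiency times purity, is easy to write down for a cut on a network output. The sketch below assumes unweighted signal and background samples and a simple scan over candidate cut values; the function names and the toy numbers are illustrative and are not taken from Ref. [4].

```python
def selection_quality(signal_outputs, background_outputs, cut):
    """Return efficiency * purity for events with network output above `cut`.

    efficiency = selected signal / all signal
    purity     = selected signal / all selected events
    """
    n_sig_sel = sum(1 for o in signal_outputs if o > cut)
    n_bkg_sel = sum(1 for o in background_outputs if o > cut)
    n_sel = n_sig_sel + n_bkg_sel
    if n_sel == 0 or not signal_outputs:
        return 0.0
    efficiency = n_sig_sel / len(signal_outputs)
    purity = n_sig_sel / n_sel
    return efficiency * purity


def best_cut(signal_outputs, background_outputs, cuts):
    """Pick the candidate cut that maximizes efficiency * purity."""
    return max(
        cuts,
        key=lambda c: selection_quality(signal_outputs, background_outputs, c),
    )


# Hypothetical network outputs for a handful of simulated events.
signal = [0.9, 0.8, 0.7, 0.4]
background = [0.1, 0.2, 0.6, 0.3]
cut = best_cut(signal, background, [0.25, 0.35, 0.5, 0.65, 0.75])
print(cut, selection_quality(signal, background, cut))
```

In a real analysis the samples would be weighted to the expected cross sections, which changes the purity but not the structure of the calculation.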

4.2. Particle identification

Litov [6] described the $e/\pi$ separation in charged and neutral kaon decays in the NA48 experiment at CERN. The standard selection with a cut on the ratio of energy to momentum, $E/p$, is clearly outperformed by a large FFNN (10 inputs, three hidden layers with 30, 20 and 2 nodes, and one output) trained with BPA. The background is reduced by up to a factor of 38. The final cut values for the neural net output and one additional variable are optimized by minimizing the total error of the analysis.
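To give a feeling for the size of such a network, the following sketch builds a feed-forward network with the quoted layer sizes and evaluates it once. The sigmoid activation, the random initial weights and the function names are assumptions made for illustration; training with backpropagation is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(layer_sizes, seed=0):
    """Create (weights, biases) per layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate one input vector through all layers."""
    if len(inputs) != len(layers[0][0][0]):
        raise ValueError("input length does not match the first layer")
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return activations


def count_parameters(layers):
    """Number of adjustable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


# 10 inputs, hidden layers with 30, 20 and 2 nodes, one output.
network = init_network([10, 30, 20, 2, 1])
print(count_parameters(network))   # 995 weights and biases
print(forward(network, [0.5] * 10))
```

The same helper with `[13, 7, 1]` gives the much smaller topology used in the DELPHI W-pair selection.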

4.3. Class separation and parameter estimation

XEUS is a future project. The satellite for X-ray observations is planned to be launched in 2012 at the earliest, but a large variety of studies are already ongoing. Zimmermann [7] presented different neural network applications for operation and data analysis:

* A hardware neural network trigger implemented in FPGA devices should reduce the high data rate.


* The photon recognition with a three-layer FFNN trained with BPA clearly outperforms the state-of-the-art algorithm.

* With another FFNN the position of one photon can be estimated better than with the standard procedure.

* In two-photon events the FFNN application is the first analysis to measure the positions of the two photons and the distance between them.

* Using the distance measurement, the charge estimation is also improved owing to the higher available statistics.

4.4. Energy reconstruction

For the LHC physics program, especially the search for Higgs and SUSY particles, a very precise energy measurement is necessary. Litov [8] presented an ansatz for the energy reconstruction at CMS based on neural networks. A complex structure of two levels of FFNNs with additional subnetworks is used. The weights are adjusted on an event-by-event basis.

The results are very promising. The energy reconstruction remains very good even if the particle that initiates the shower is misidentified. The energy spectra obtained show a nice Gaussian shape and are free of tails. Finally, the energy resolution is significantly improved.

4.5. Neural tracking

Pulvirenti [9] presented a neural network application in the track finding of ALICE. The state-of-the-art procedure to reconstruct tracks out of the huge number of about 84 000 primary tracks uses a Kalman filter algorithm. Possible tracks are propagated from the time projection chamber (TPC) to the inner tracking system (ITS).

The goal of the neural network application is to improve the tracking and to build up an ITS stand-alone tracking. An associative memory topology consisting of one fully connected single layer is used, with a sigmoidal activation function and a binary mapping of the neurons with a threshold of 0.6 on the final activation.
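The ingredients named here (one fully connected layer, a sigmoidal activation and a threshold of 0.6 on the final activation) can be illustrated with a generic toy network. The update rule, the weights and the meaning attached to the neurons below are assumptions for illustration only and do not reproduce the ALICE implementation of Ref. [9].

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def update(activations, weights):
    """Synchronous update of a fully connected single layer."""
    return [
        sigmoid(sum(w * a for w, a in zip(row, activations)))
        for row in weights
    ]


def relax(activations, weights, n_iter):
    """Iterate the update a fixed number of times (assumed stopping rule)."""
    for _ in range(n_iter):
        activations = update(activations, weights)
    return activations


def binarize(activations, threshold=0.6):
    """Binary mapping: a neuron is 'on' if its final activation exceeds the threshold."""
    return [1 if a > threshold else 0 for a in activations]


# Hypothetical weights: neurons 0 and 1 support each other (compatible track
# segments), neuron 2 competes with both (incompatible segment).
weights = [
    [0.0, 4.0, -4.0],
    [4.0, 0.0, -4.0],
    [-4.0, -4.0, 0.0],
]
final = relax([0.5, 0.5, 0.5], weights, n_iter=10)
print(binarize(final))
```

In this toy case the two mutually supporting neurons end up active and the competing one is switched off, which is the qualitative behaviour one wants from a track-segment selection.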

The ITS stand-alone tracking was successfully implemented. The recognition efficiency of the neural tracking is comparable to the result obtained with the Kalman filter, but the combination of the neural network and the filter gives an increase of about 10%.

4.6. Hardware neural network

In some current experiments hardware implementations of neural networks are already used for triggering. So far, larger networks have been too slow to be used in a level 1 trigger, but technology trends now make it possible to move level 2 complexity of neural computations into level 1. Prévotet [10] described a hardware solution for the implementation of neural networks in high-energy physics triggers using the time specifications of the ATLAS experiment.

The 128-64-4 FFNN is implemented on an FPGA with 4 processing elements for each hidden neuron. The processing time of 500 ns is longer than the 25 ns between bunch crossings, at which new data arrive, so the data are processed in parallel in a time-multiplexed way. This makes the implementation of the digital neural network feasible in real time.
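The throughput argument is simple arithmetic: with 500 ns per evaluation and new data every 25 ns, 500 / 25 = 20 evaluations must be in progress at any time. The sketch below checks this with a round-robin assignment of events to identical processing units. The round-robin scheme and the function names are assumptions for illustration; the source does not describe how the hardware of Ref. [10] organizes the multiplexing.

```python
def units_needed(processing_ns, period_ns):
    """Smallest number of parallel units that keeps up with the input rate."""
    return -(-processing_ns // period_ns)  # integer ceiling division


def schedule(n_events, processing_ns, period_ns, n_units=None):
    """Assign event i (arriving at i * period_ns) to unit i % n_units.

    Returns a list of (event, unit, start_ns, end_ns) and raises
    RuntimeError if a unit is still busy when its next event arrives.
    """
    if n_units is None:
        n_units = units_needed(processing_ns, period_ns)
    busy_until = [0] * n_units
    plan = []
    for i in range(n_events):
        arrival = i * period_ns
        unit = i % n_units
        if busy_until[unit] > arrival:
            raise RuntimeError("unit %d is still busy at %d ns" % (unit, arrival))
        busy_until[unit] = arrival + processing_ns
        plan.append((i, unit, arrival, arrival + processing_ns))
    return plan


print(units_needed(500, 25))      # 20
print(schedule(21, 500, 25)[-1])  # (20, 0, 500, 1000)
```

With 20 units each one becomes free exactly when its next event arrives, so the latency stays at 500 ns while the input rate of one event per 25 ns is sustained; with 19 units the schedule fails at the twentieth event.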

5. Summary and conclusions

Comparing the talks at ACAT2002 with those at the previous workshops, I think it is noticeable that more and more neural networks are being applied to a large variety of tasks in different experiments. In most cases the networks perform significantly better than non-AI state-of-the-art analyses. So for me it is only a question of time until neural networks become the default method.

The support vector machines seem to be an interesting alternative to neural networks, with a similar and sometimes even better performance. However, taking into account the preliminary state of the analyses presented, I think more studies and applications are still needed. It will be quite interesting to see what is presented about SVMs at the next ACAT workshop.


Acknowledgements

I warmly thank all authors of the presentations that served as the basis for this summary.

References

[1] A. Soloviev, E. Litvinenko, G. Ososkov, A. Islamov, A. Kuklin, Nucl. Instr. and Meth. A 502 (2003) 500.

[2] A. Vaiciulis, Nucl. Instr. and Meth. A 502 (2003) 492.

[3] A. Naumann, talk at ACAT’2002.

[4] K.-H. Becks, J. Drees, U. Müller, H. Wahlen, Nucl. Instr. and Meth. A 502 (2003) 483.

[5] E. Boos, L. Dudko, D. Smirnov, Nucl. Instr. and Meth. A 502 (2003) 486.

[6] L. Litov, Nucl. Instr. and Meth. A 502 (2003) 495.

[7] J. Zimmermann, C. Kiesling, P. Holl, Nucl. Instr. and Meth. A 502 (2003) 507.

[8] J. Damgov, L. Litov, talk at ACAT’2002.

[9] A. Badalà, R. Barbera, G. Lo Re, A. Palmeri, G.S. Pappalardo, A. Pulvirenti, F. Riggi, Nucl. Instr. and Meth. A 502 (2003) 503.

[10] J.-C. Prévotet, B. Denby, P. Garda, B. Granado, C. Kiesling, Nucl. Instr. and Meth. A 502 (2003) 511.
