provide tools for the statistical comparison of distributions equivalent reference distributions ...


Upload: lester-cunningham

Post on 13-Jan-2016


TRANSCRIPT

Page 1: Provide tools for the statistical comparison of distributions  equivalent reference distributions  experimental measurements  data from reference sources

Provide tools for the statistical comparison of distributions:
• equivalent reference distributions
• experimental measurements
• data from reference sources
• functions deriving from theoretical calculations or fits

Data analysis in HEP:
• Detector monitoring
• Simulation validation
• Reconstruction vs. expectation
• Regression testing
• Physics analysis

Page 2: GoF statistical toolkit

A project to develop a statistical comparison system

• Qualitative evaluation
• Quantitative evaluation
• Comparison of distributions
• Goodness of fit testing

Page 3: Software process guidelines

• Unified Software Development Process, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO 15504

• Guidance from ISO 15504

• Incremental and iterative life cycle model (spiral approach)

Page 4: Architectural guidelines

• The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a large time scale
– to be extensible, to accommodate future evolutions of the requirements

• Component-based approach
– to facilitate re-use and integration in different frameworks

• AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool

Page 5

Page 6

The algorithms are specialised on the kind of distribution (binned/unbinned).

Every algorithm has been rigorously tested!

Documentation available: http://www.ge.infn.it/geant4/analysis/HEPstatistics/

Page 7: Chi-squared test

• Applies to binned distributions

• It can also be useful for unbinned distributions, but the data must first be grouped into classes

• Cannot be applied if the theoretical frequency in any class is < 5

– In that case, one could try to merge contiguous classes until the minimum theoretical frequency is reached

– Otherwise one could use Yates’ formula
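The class-merging rule on the chi-squared slide can be sketched as follows. This is only an illustration in Python, not the toolkit's implementation: the function names, the left-to-right merging order and the handling of a leftover tail are all assumptions. The optional Yates correction subtracts 0.5 from each |O - E| (clamped at zero here); it is usually reserved for one degree of freedom.

```python
def merge_low_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins, left to right, until every merged bin
    has an expected (theoretical) frequency of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0.0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0.0, 0.0
    # A tail that never reached the minimum is folded into the last merged bin
    # (assumption: the slides do not say how to treat it).
    if acc_obs > 0 or acc_exp > 0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def chi_squared(observed, expected, yates=False):
    """Pearson chi-squared statistic; with yates=True each |O - E| is
    reduced by 0.5 (never below zero) before squaring."""
    shift = 0.5 if yates else 0.0
    return sum(max(abs(o - e) - shift, 0.0) ** 2 / e
               for o, e in zip(observed, expected))


# Illustrative counts (made up for this example).
observed = [1, 3, 9, 12, 8, 2, 1]
expected = [2, 4, 8, 11, 7, 3, 1]
obs_m, exp_m = merge_low_bins(observed, expected)
statistic = chi_squared(obs_m, exp_m)
dof = len(obs_m) - 1  # assumes equal totals and no fitted parameters
```

Merging trades resolution for validity: the seven original classes collapse to four, which is exactly the loss of information the later slides hold against the chi-squared test.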

Page 8: More sophisticated algorithms: supremum statistics

Unbinned distributions

[Figure: original distributions and their empirical distribution functions; vertical axis from 0 to 1]

• Kolmogorov-Smirnov test

$D_{mn} = \sup_x |F_n(x) - G_m(x)|$

• Goodman approximation of KS test

$\chi^2 = 4 D_{mn}^2 \, \frac{mn}{m+n}$

• Kuiper test

$D^* = \max_x [F_0(x) - F_T(x)] + \max_x [F_T(x) - F_0(x)]$
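The three supremum statistics can be computed directly from two empirical distribution functions. The sketch below is a plain-Python illustration for two unbinned samples; the function names are hypothetical and it is not the toolkit's code. Because both empirical distribution functions are step functions, it is enough to evaluate their difference at the pooled data points.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F(x) = fraction of sample values <= x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def supremum_statistics(sample_a, sample_b):
    """Return (Kolmogorov-Smirnov distance, Kuiper distance)."""
    f, g = ecdf(sample_a), ecdf(sample_b)
    points = sorted(set(sample_a) | set(sample_b))
    diffs = [f(x) - g(x) for x in points]
    d_plus = max(max(diffs), 0.0)    # largest excess of F over G
    d_minus = max(-min(diffs), 0.0)  # largest excess of G over F
    return max(d_plus, d_minus), d_plus + d_minus


def goodman_chi2(d, n, m):
    """Goodman approximation: chi2 = 4 D^2 nm / (n + m),
    to be compared with a chi-squared distribution with 2 degrees of freedom."""
    return 4.0 * d * d * n * m / (n + m)


# Illustrative samples (made up for this example).
a = [1, 4, 5, 8]
b = [2, 3, 6, 7]
ks, kuiper = supremum_statistics(a, b)
chi2 = goodman_chi2(ks, len(a), len(b))
```

In this example the two samples interleave, so the Kolmogorov-Smirnov distance sees only the larger one-sided deviation while the Kuiper distance adds both.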

Page 9: More powerful algorithms: tests containing a weighting function

Unbinned distributions

• Cramer-von Mises test

$\omega^2 = \int [F_0(x) - F_T(x)]^2 \, dF_T(x)$

• Anderson-Darling test

$A^2 = \int \frac{[F_0(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$

These algorithms are so powerful that we decided to implement their equivalent in case of binned distributions:

Binned distributions

• Fisz-Cramer-von Mises test

$t = \frac{n_1 n_2}{(n_1 + n_2)^2} \sum_i [F_1(x_i) - F_2(x_i)]^2$

• k-sample Anderson-Darling test

$A_{kn}^2 = \frac{n-1}{n^2 (k-1)} \sum_i \frac{1}{n_i} \sum_j h_j \, \frac{(n F_{ij} - n_i H_j)^2}{H_j (n - H_j) - n h_j / 4}$
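A minimal Python sketch of two of these statistics follows, under stated assumptions. The Fisz-Cramer-von Mises sum is taken over the pooled points of two samples; how the toolkit treats binned input is not stated in the slides. The Anderson-Darling function uses the usual computing formula for a sample against a theoretical CDF, which returns n times the integral form. All names are hypothetical.

```python
import math
from bisect import bisect_right


def ecdf(sample):
    """Return F(x) = fraction of sample values <= x."""
    data = sorted(sample)
    n = len(data)
    return lambda x: bisect_right(data, x) / n


def fisz_cramer_von_mises(sample_1, sample_2):
    """t = n1 n2 / (n1 + n2)^2 * sum_i [F1(x_i) - F2(x_i)]^2,
    summed over the pooled observations (assumption)."""
    n1, n2 = len(sample_1), len(sample_2)
    f1, f2 = ecdf(sample_1), ecdf(sample_2)
    pooled = list(sample_1) + list(sample_2)
    total = sum((f1(x) - f2(x)) ** 2 for x in pooled)
    return n1 * n2 / (n1 + n2) ** 2 * total


def anderson_darling(sample, cdf):
    """One-sample statistic via the standard computing formula
    A^2 = -n - (1/n) sum_i (2i - 1) [ln u_(i) + ln(1 - u_(n+1-i))],
    where u_(i) are the sorted values of cdf(x)."""
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    s = sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


# Illustrative inputs (made up for this example).
t = fisz_cramer_von_mises([1, 4, 5, 8], [2, 3, 6, 7])
a2 = anderson_darling([0.1, 0.3, 0.5, 0.7, 0.9], lambda x: x)  # uniform CDF on [0, 1]
```

The logarithms in the Anderson-Darling sum are what implement the weighting function: deviations in the tails, where the CDF is close to 0 or 1, count more than deviations near the median.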

Page 10: Is $\chi^2$ the most powerful algorithm?

$\chi^2$ loses information in a test for unbinned distributions by grouping the data into cells.

Kac, Kiefer and Wolfowitz (1955) showed that the Kolmogorov-Smirnov test requires $n^{4/5}$ observations compared to $n$ observations for $\chi^2$ to attain the same power.

Cramer-von Mises and Anderson-Darling statistics are expected to be superior to Kolmogorov-Smirnov’s, since they make a comparison of the two distributions all along the range of x, rather than looking for a marked difference at one point.

In terms of power:

$\chi^2$ < supremum statistics tests < tests containing a weight function
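As a rough numerical reading of the Kac, Kiefer and Wolfowitz statement (taken at face value, ignoring the constants that an asymptotic result leaves out, and with $n = 10^4$ chosen purely for illustration):

```latex
n = 10^{4} \quad\Longrightarrow\quad n^{4/5} = 10^{3.2} \approx 1585
```

Under that reading, the Kolmogorov-Smirnov test would need on the order of 1.6 thousand observations where $\chi^2$ needs ten thousand to reach the same power.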

Page 11: User’s point of view

• Simple user layer

• Only deal with AIDA objects and choice of comparison algorithm

The user is completely shielded from both statistical and computing complexity.

[Diagram: the USER extracts the algorithm writing one line of code; the STATISTICAL TOOLKIT returns the RESULT]
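The idea of a thin user layer, where the only decisions are the input objects and the comparison algorithm, can be sketched as a small dispatch table. This is not the toolkit's API: plain Python lists stand in for AIDA objects, and `compare`, `ALGORITHMS` and the algorithm keys are hypothetical names chosen for the illustration.

```python
from bisect import bisect_right


def _ecdf_differences(sample_a, sample_b):
    """Differences F_a(x) - F_b(x) at every pooled data point."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return [bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b)
            for x in points]


def _kolmogorov_smirnov(sample_a, sample_b):
    return max(abs(d) for d in _ecdf_differences(sample_a, sample_b))


def _kuiper(sample_a, sample_b):
    diffs = _ecdf_differences(sample_a, sample_b)
    return max(max(diffs), 0.0) + max(-min(diffs), 0.0)


# The statistical details live behind this table (hypothetical keys).
ALGORITHMS = {
    "kolmogorov-smirnov": _kolmogorov_smirnov,
    "kuiper": _kuiper,
}


def compare(sample_a, sample_b, algorithm):
    """Single entry point: pick an algorithm by name and run it."""
    try:
        test = ALGORITHMS[algorithm]
    except KeyError:
        raise ValueError(f"unknown algorithm: {algorithm!r}") from None
    return test(sample_a, sample_b)


# The user's side of the interaction is one line:
distance = compare([1, 4, 5, 8], [2, 3, 6, 7], "kolmogorov-smirnov")
```

Adding a new test then means registering one more entry in the table, without touching user code.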

Page 12: Examples of practical applications

Page 13: Microscopic validation of physics

[Figure: plots comparing NIST reference data with Geant4 Standard and Geant4 LowE simulations]

Chi-squared test:

• $\chi^2_{N-S} = 0.267$, $\nu = 28$, $p = 1$
• $\chi^2_{N-L} = 1.315$, $\nu = 28$, $p = 1$
• $\chi^2_{N-S} = 0.532$, $\nu = 28$, $p = 1$
• $\chi^2_{N-L} = 1.928$, $\nu = 28$, $p = 1$
• $\chi^2_{N-S} = 0.373$, $\nu = 28$, $p = 1$
• $\chi^2_{N-L} = 5.882$, $\nu = 28$, $p = 1$

Geant4 simulations are statistically comparable with reference data (NIST database, http://www.nist.gov)

Page 14: Test beam at Bessy, Bepi-Colombo mission

[Figure: X-ray fluorescence spectrum in Iceland basalt ($E_{IN}$ = 6.5 keV); axes: Energy (keV) vs. Counts]

Very complex distributions

$\chi^2$ not appropriate (< 5 entries in some bins, physical information would be lost if rebinned)

Anderson-Darling: $A_c$ (95%) = 0.752

Experimental measurements are comparable with Geant4 simulations

Page 15: Medical applications - hadron therapy

• Kolmogorov-Smirnov: $D_{EXP-GEANT4} = 0.11$, p = n.s.

• Goodman approximation of Kolmogorov-Smirnov: $\chi^2_{EXP-GEANT4} = 3.8$, $\nu = 2$, p = n.s.

Experimental measurements are comparable with Geant4 simulations
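The "not significant" verdict for the Goodman statistic can be checked by hand, because a chi-squared distribution with 2 degrees of freedom has the closed-form tail probability exp(-x/2). The short Python check below assumes only the quoted value 3.8 and the conventional 5% significance level; the function name is illustrative.

```python
import math


def chi2_tail_2dof(chi2):
    """P(X >= chi2) for a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


p_value = chi2_tail_2dof(3.8)        # about 0.15
not_significant = p_value > 0.05     # conventional 5% level (assumption)
```

A p-value of roughly 0.15 is above the 5% level, in line with the slide's conclusion that the measured and simulated distributions are comparable.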

Page 16: Future developments

Page 17: Work in progress (I)

• Real-life distributions are not strictly limited to one dimension.

• For this reason the algorithms contained in the GoF Toolkit are going to be generalised to the case of higher dimensional distributions.

• This is a big step forward in statistics and in physics data analysis as well.

Page 18: Work in progress (II)

• Users will be able to compare their distributions with theoretical reference distributions, such as:
– uniform
– gaussian
– Weibull
– gamma
– …

• Data handling: filtering

• Treatment of errors (uncertainties)
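Comparing a sample with a theoretical reference distribution only needs that distribution's CDF. The sketch below, in plain Python with hypothetical function names, pairs three of the listed references with a one-sample Kolmogorov-Smirnov distance. It is an illustration of the planned feature, not the toolkit's design. The gamma CDF is left out because it needs the incomplete gamma function, which Python's standard library does not provide.

```python
import math


def uniform_cdf(x, lo=0.0, hi=1.0):
    """CDF of the uniform distribution on [lo, hi]."""
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)


def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def weibull_cdf(x, shape, scale):
    """CDF of the Weibull distribution (zero for negative x)."""
    return 0.0 if x < 0 else 1.0 - math.exp(-((x / scale) ** shape))


def ks_one_sample(sample, cdf):
    """Largest distance between the empirical distribution function of the
    sample and the theoretical CDF, checked on both sides of each step."""
    data = sorted(sample)
    n = len(data)
    distance = 0.0
    for i, x in enumerate(data, start=1):
        f = cdf(x)
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


# Illustrative sample (made up for this example).
d_uniform = ks_one_sample([0.1, 0.3, 0.5, 0.7, 0.9], uniform_cdf)
```

Any other reference can be plugged in by passing its CDF as the second argument, for example `lambda x: weibull_cdf(x, shape=2.0, scale=1.0)`.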

Page 19: Status

• The GoF Toolkit is downloadable from the web: www.ge.infn.it/geant4/analysis/HEPstatistics/index.html

• Recent developments
– added new algorithms, improved design, improved documentation
– user examples, unit and system tests
– detailed statistical documentation

Page 20: Conclusions

• This is a new, up-to-date, easy to handle and powerful tool for statistical comparison in particle physics.

• It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.

• AIDA interfaces allow its integration in any other data analysis tool.

Applications in: HEP, astrophysics, medical physics, …