Alberto Ribon, CERN
Statistical Testing Project
Alberto Ribon, CERN
on behalf of the Statistical Testing Team
CLHEP Workshop CERN, 28 January 2003
What is it?
Provide tools for the statistical comparison of distributions
– simulation data
– experimental measurements
– data from reference sources
– functions deriving from theoretical calculations or from fits
Main application areas in Geant4:
– physics validation
– regression testing
– system testing
A project to develop a general-purpose statistical analysis system
The team
Development team (mostly part time!)
Pablo Cirrone, INFN Southern National Lab
Stefania Donadio, Univ. and INFN Genova
Susanna Guatelli, CERN/IT/API Technical Student and INFN Genova
Alberto Lemut, Univ. and INFN Genova
Barbara Mascialino, Univ. and INFN Genova
Sandra Parlati, INFN Gran Sasso National Lab
Andreas Pfeiffer, CERN/IT/API
Maria Grazia Pia, INFN Genova
Alberto Ribon, CERN/IT/API
Statistical consultancy
Paolo Viarengo, Univ. Genova, Statistician
Fred James, CERN
Geant4 system integration team
Gabriele Cosmo, CERN/IT/API - Geant4 Release Manager
Sergei Sadilov, CERN/IT/API - Geant4 System Testing Coordinator
Interested collaborators are welcome!
Scope of the project
The project will provide tools for statistical testing
– physics comparisons and regression testing
– multiple comparison algorithms
Generality (for application also in other areas) should be pursued
– facilitated by a component-based architecture
The statistical tools should be used in Geant4 (and in other frameworks)
– tool to be used in testing frameworks
– not a testing framework itself
Re-use existing tools whenever possible
– no attempt to re-invent the wheel
– but critical, scientific evaluation of candidate tools
So far, only ad hoc solutions
An old and common problem (comparison of distributions)
The only general “tool” was HDIFF (which performs the Kolmogorov-Smirnov test); although useful and widely used, it was never sufficient for a realistic physics analysis
Each experiment (or even each analysis group) has each time created its own ad hoc “tool” for statistical tests, usually based on legacy code that was modified and adapted to its particular needs
Example: CDF Coll. PRL 77 (1996) 438
“Inclusive jet cross section in p-pbar collisions at Tevatron”
Architectural guidelines
The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a large time scale
– to be extensible, to accommodate future evolutions of the requirements
Component-based approach
– Geant4-specific components + general components
– to facilitate re-use and integration in diverse frameworks
AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool
Python
The approach adopted is compatible with the recommendations of the CERN LCG Architecture Blueprint RTAG
Some use cases
Regression testing
– Throughout the software life-cycle
Online DAQ
– Monitoring detector behaviour w.r.t. a reference
Simulation validation
– Comparison with experimental data
Reconstruction
– Comparison of reconstructed vs. expected distributions
Physics analysis
– Comparisons of experimental distributions (signal sample vs. bkg sample)
– Comparison with theoretical distributions (data vs. Standard Model)
Goodness-of-fit tests
Pearson’s χ² test
Kolmogorov test
Kolmogorov – Smirnov test
Lilliefors test
Cramer-von Mises test
Anderson-Darling test
Kuiper test
…
System open to extension and evolution
Suggestions welcome!
Pearson’s χ²
Applies to discrete (binned) distributions
It can also be useful for continuous (unbinned) distributions, but the data must first be grouped into classes
Cannot be applied if the theoretical (expected) frequency in any class is < 5
When a class falls below this threshold, one can merge contiguous classes until the minimum theoretical frequency is reached
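The class-merging rule can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the function names `merge_low_bins` and `pearson_chi2`, the left-to-right merging order, and the example numbers are all assumptions made here.

```python
def merge_low_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins left to right until each merged bin
    has an expected frequency of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # Fold any leftover low-statistics tail into the last merged bin.
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson's chi-squared: sum over bins of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical binned data: observed counts and expected frequencies.
observed = [1, 3, 12, 20, 9, 4, 1]
expected = [2.0, 4.0, 11.0, 18.0, 10.0, 3.0, 2.0]

obs_m, exp_m = merge_low_bins(observed, expected)
chi2 = pearson_chi2(obs_m, exp_m)
# Degrees of freedom, assuming the expectation is normalised to the
# data and no parameters were fitted from the sample.
ndf = len(obs_m) - 1
print(obs_m, exp_m, chi2, ndf)
```

Other merging strategies (for example, merging from the tails inward) are equally possible; the choice affects the number of degrees of freedom and therefore the resulting p-value.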
Kolmogorov test
The simplest among non-parametric tests
Tests the goodness of fit of a sample drawn from a continuous random variable
Based on the computation of the maximum distance between the empirical cumulative distribution function and the theoretical one
Test statistic:
$D = \sup_x |F_O(x) - F_T(x)|$
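As an illustration, the statistic D can be computed directly from the sorted sample, because the empirical distribution function only changes at the data points. The sketch below is not the project's code: the function names and the uniform test case are assumptions, and the p-value uses the standard asymptotic Kolmogorov series, which is only approximate for small samples.

```python
import math


def kolmogorov_statistic(sample, cdf):
    """D = sup_x |F_O(x) - F_T(x)| for a sample and a theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x,
        # so both sides of the step must be checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic p-value: Q(l) = 2 * sum_k (-1)^(k-1) exp(-2 k^2 l^2),
    with l = sqrt(n) * d. Approximate for small n."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; Q is ~1 anyway
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example hypothesis)."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]  # hypothetical data
d = kolmogorov_statistic(sample, uniform_cdf)
p = kolmogorov_pvalue(d, len(sample))
print(d, p)
```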
Kolmogorov-Smirnov test
Two-sample problem
– mathematically similar to the Kolmogorov test
Instead of comparing an empirical distribution with a theoretical one, find the maximum difference between the empirical distributions of the two samples, $F_n$ and $G_m$:
$D_{mn} = \sup_x |F_n(x) - G_m(x)|$
Can be applied only to continuous random variables
Conover (1971) and Gibbons and Chakraborti (1992) tried to extend it to cases of discrete random variables
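A minimal sketch of the two-sample statistic follows, assuming unbinned samples held in plain Python lists. The function name `ks_two_sample_statistic` and the example data are hypothetical. The loop walks through the merged, sorted values and compares the two empirical distribution functions after each distinct value.

```python
def ks_two_sample_statistic(a, b):
    """D_mn = sup_x |F_n(x) - G_m(x)| for two unbinned samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    # Once one sample is exhausted its empirical CDF equals 1 and the
    # difference can only shrink, so the loop may stop there.
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Step past every entry equal to x in both samples (handles ties).
        while i < n and xs[i] == x:
            i += 1
        while j < m and ys[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


# Hypothetical samples with partial overlap.
first = [1.0, 2.0, 3.0, 4.0]
second = [3.0, 4.0, 5.0, 6.0]
d_mn = ks_two_sample_statistic(first, second)
print(d_mn)
```

Ties between the two samples are handled by stepping over equal values together, but this does not by itself make the test valid for discrete variables; the null distribution of the statistic still assumes continuity.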
Lilliefors test
Similar to the Kolmogorov test
Based on the null hypothesis that the continuous random variable is normally distributed, $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown
Performed by comparing the empirical cumulative distribution function of the standardized sample $(z_1, z_2, \ldots, z_n)$ with that of the standard normal distribution, $\Phi(z)$:
$D^* = \sup_z |F_O(z) - \Phi(z)|$
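The D* statistic can be sketched as below. Because the mean and variance are estimated from the same sample, the ordinary Kolmogorov p-values do not apply; here the null distribution is instead estimated by a small Monte Carlo over standard normal pseudo-samples. The function names, the number of trials, the seeds, and the exponential example data are all assumptions made for illustration.

```python
import math
import random
import statistics


def normal_cdf(z):
    """CDF of the standard normal distribution, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(sample):
    """D* = sup_z |F_O(z) - Phi(z)| on the standardized sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std = statistics.stdev(sample)  # sample standard deviation (n - 1)
    zs = sorted((x - mean) / std for x in sample)
    d = 0.0
    for i, z in enumerate(zs):
        phi = normal_cdf(z)
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


def lilliefors_pvalue(sample, n_trials=500, seed=12345):
    """Monte Carlo p-value: fraction of normal pseudo-samples of the
    same size whose D* is at least as large as the observed one."""
    rng = random.Random(seed)
    n = len(sample)
    d_obs = lilliefors_statistic(sample)
    exceed = 0
    for _ in range(n_trials):
        pseudo = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if lilliefors_statistic(pseudo) >= d_obs:
            exceed += 1
    return exceed / n_trials


# Hypothetical, clearly non-normal data: exponentially distributed.
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0) for _ in range(100)]
d_star = lilliefors_statistic(data)
p_value = lilliefors_pvalue(data)
print(d_star, p_value)
```

Published critical-value tables could replace the Monte Carlo step; the simulation is used here only to keep the example self-contained.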
Cramer-von Mises test
Based on the test statistic:
$\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$
Can be performed on both continuous and discrete variables
Satisfactory for symmetric and right-skewed distributions
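For an unbinned sample the integral reduces to a finite sum over the sorted data. The sketch below uses the standard computational form, which gives n times the integral: T = n ω² = 1/(12n) + Σ [(2i - 1)/(2n) - F_T(x_(i))]². The function names and the uniform example are assumptions, and only the continuous case is covered.

```python
def cramer_von_mises_statistic(sample, cdf):
    """T = n * omega^2 for an unbinned sample and a continuous CDF."""
    xs = sorted(sample)
    n = len(xs)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(xs, start=1):
        # (2i - 1) / (2n) is the mid-point of the i-th step of the
        # empirical distribution function.
        total += ((2 * i - 1) / (2.0 * n) - cdf(x)) ** 2
    return total


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example hypothesis)."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]  # hypothetical data
t = cramer_von_mises_statistic(sample, uniform_cdf)
omega2 = t / len(sample)  # the integral itself
print(t, omega2)
```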
Anderson-Darling test
Based on the test statistic:
$A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
Can be performed on both continuous and discrete variables
Seems to be suitable for any data set (Aksenov and Savageau, 2002) with any skewness (symmetric, left-skewed or right-skewed distributions)
Seems to be sensitive to fat tails of distributions
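The weight 1 / [F_T (1 - F_T)] is what emphasises the tails. For an unbinned sample the usual computational form, which is n times the integral, is A² = -n - (1/n) Σ (2i - 1) [ln F_T(x_(i)) + ln(1 - F_T(x_(n+1-i)))]. The sketch below implements it under stated assumptions: the function names and example data are hypothetical, and every data point must satisfy 0 < F_T(x) < 1, otherwise the logarithm is undefined.

```python
import math


def anderson_darling_statistic(sample, cdf):
    """n times the Anderson-Darling integral for an unbinned sample.

    Assumes 0 < cdf(x) < 1 for every data point; math.log raises
    ValueError otherwise.
    """
    xs = sorted(sample)
    n = len(xs)
    s = 0.0
    for i in range(1, n + 1):
        f_low = cdf(xs[i - 1])   # F_T at the i-th smallest value
        f_high = cdf(xs[n - i])  # F_T at the i-th largest value
        s += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - s / n


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example hypothesis)."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]  # hypothetical data
a2 = anderson_darling_statistic(sample, uniform_cdf)
print(a2)
```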
Kuiper test
Based on a quantity that remains invariant for any shift or re-parameterization
Does not work well on tails
$D^* = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
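A short sketch of the Kuiper statistic, with a check of the shift invariance for data defined on a circle (here the unit interval with its ends identified). The function names, the uniform hypothesis, and the data are assumptions chosen only to illustrate the property.

```python
def kuiper_statistic(sample, cdf):
    """D* = max(F_O - F_T) + max(F_T - F_O) for an unbinned sample."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = d_minus = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d_plus = max(d_plus, (i + 1) / n - f)   # empirical above theory
        d_minus = max(d_minus, f - i / n)       # theory above empirical
    return d_plus + d_minus


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example hypothesis)."""
    return min(1.0, max(0.0, x))


sample = [0.1, 0.4, 0.7]  # hypothetical data on the unit circle
v = kuiper_statistic(sample, uniform_cdf)

# Moving the origin of the circle by 0.5 changes the two one-sided
# distances individually but leaves their sum unchanged.
shifted = [(x + 0.5) % 1.0 for x in sample]
v_shifted = kuiper_statistic(shifted, uniform_cdf)
print(v, v_shifted)
```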
OOAD
http://www.ge.infn.it/geant4/analysis/TandA/index.html
Collection of user requirements
First analysis and design of the statistical component
Validation of the class design through use cases
Some open issues identified, to be addressed in
the next design iterations
+ more algorithms
Work in progress
Implementation and test of preliminary design
What can be re-used?
– Almost nothing available either in GSL or NAG
Studies in progress
– Transformation between binned-unbinned distributions
– Strategies to use Kolmogorov-Smirnov with binned distributions (E. Dagum + original ideas)
– How to deal with experimental errors (not only statistical!)
– Multi-dimensional distributions
– Bayesian approach
In the to-do list
– Conversion from AIDA objects to distributions
– “Pythonisation”
Work in progress: User-specific
Geant4 testing framework
– Development of general physics tests in the E.M. domain: collection of relevant observables and their respective reference data/distributions
– Integration in the system testing framework
CMS transition from Geant3 to Geant4
– An automatic regression testing procedure is needed
– Similar needs also for future Geant4 versions
Where?
Core statistical component
– Developed in an independent CVS repository
– Code, documentation, software process deliverables
– Where will it go? CLHEP or LCG?
Geant4-specific stuff
– Kept separate in Geant4
Web site
– http://www.ge.infn.it/geant4/analysis/TandA/index.html
Contact persons
– [email protected], [email protected]
Time scale
Aggressive time scale driven by user needs
– CMS and Geant4
OOAD + implementation under way
A first prototype should be ready in a few weeks
Advanced functional system: summer 2003
Open to the needs/suggestions of anyone
– compatible with the available resources
– possible integration in GSL
Conclusions…
Core statistical components of general interest
– LHC experiments, Geant4, etc.
Project compatible with LCG architecture blueprint
– component-based approach, AIDA, Python…
Open to scientific collaboration
Urgent user needs
– CMS and Geant4
First prototype expected in a few weeks