Maria Grazia Pia, INFN Genova
Statistical Testing Project
Maria Grazia Pia, INFN Genova
on behalf of the Statistical Testing Team
http://www.ge.infn.it/geant4/analysis/TandA
LCG-Application Meeting CERN, 27 November 2002
History and background
What is it?
A project to develop a statistical analysis system, to be used in Geant4 testing
Provide tools for the statistical comparison of distributions
– equivalent reference distributions (for instance, regression testing)
– experimental measurements
– data from reference sources
– functions deriving from theoretical calculations or from fits
Main application areas in Geant4:
– physics validation
– regression testing
– system testing
Interest in other areas, not only Geant4? LCG?
History
“Statistical testing” agreed in the Geant4 Collaboration as a major objective for 2002
Initial ideas presented at Geant4 TSB meeting, November 2001
Open brainstorming session at a Geant4-WG workshop, 31 May 2002
Inception phase, summer 2002
– Informal discussions with STT, Geant4 collaborators and interested potential developers
– Initial collection of user requirements in Geant4
– First version of software process deliverables: Vision, URD, Risk List
Presentation at Geant4 Workshop + parallel sessions, October 2002
– http://www.ge.infn.it/geant4/talks/G4workshop/CERN/pia/tanda-2002.ppt
Launch of the project
The team
Development team
Pablo Cirrone, INFN Southern National Lab
Stefania Donadio, Univ. and INFN Genova
Susanna Guatelli, CERN/IT/API Technical Student and INFN Genova
Alberto Lemut, Univ. and INFN Genova
Barbara Mascialino, Univ. and INFN Genova
Sandra Parlati, INFN Gran Sasso National Lab
Andreas Pfeiffer, CERN/IT/API
Maria Grazia Pia, INFN Genova
Geant4 system integration team
Gabriele Cosmo, CERN/IT/API - Geant4 Release Manager
Sergei Sadilov, CERN/IT/API - Geant4 System Testing Coordinator
Statistical consultancy
Paolo Viarengo, Univ. Genova, Statistician
+ requirements, suggestions, β-testing by many other Geant4 Collaborators (M. Maire, A. Ribon, L. Urban et al.)
Interested collaborators are welcome!
The vision
Vision: the basics
Rigorous software process
Have a vision for the project
– An internal tool for Geant4 physics & STT?
– Also for Geant4 physics validation in the experiments?
– Other parties than Geant4 interested?
Who are the stakeholders?
Who are the users?
Who are the developers?
Build on a solid architecture
Clearly define scope, objectives
Flexible, extensible, maintainable system
Software quality
Clearly define roles
Scope of the project
The project will provide tools for statistical testing of Geant4
– physics comparisons and regression testing
– multiple comparison algorithms
Generality (for application also in other areas) should be pursued
– facilitated by a component-based architecture
The statistical tools should be used in Geant4 (and in other frameworks)
– tool to be used in testing frameworks
– not a testing framework itself
Re-use existing tools whenever possible
– no attempt to re-invent the wheel
– but critical, scientific evaluation of candidate tools
Architectural guidelines
The project adopts a solid architectural approach
– to offer the functionality and the quality needed by the users
– to be maintainable over a long time scale
– to be extensible, to accommodate future evolutions of the requirements
Component-based approach
– Geant4-specific components + general components
– to facilitate re-use and integration in diverse frameworks
AIDA
– adopt a (HEP) standard
– no dependence on any specific analysis tool
Python
The approach adopted is compatible with the recommendations of the LCG Architecture Blueprint RTAG
The reason why we are here…
Core statistics comparison component + user layer
– can be generalised to wider scope than Geant4 only
This is the reason why we present the project to LCG
– to establish a scientific discussion on a topic of common interest
– to see if there are any interested users
– to see if there are any interested collaborators
We would all benefit from a collaborative approach to a common problem
– share expertise, ideas, tools, resources…
Software process guidelines
Significant experience in the team
– in Geant4 and in other projects
Guidance from ISO 15504
– standard!
USDP, specifically tailored to the project
– practical guidance and tools from the RUP
– both rigorous and lightweight
– mapping onto ISO 15504
Open to use tools provided by the LCG Software Process Infrastructure project
Who are the stakeholders?
Name | Description | Responsibilities
Geant4 STT Coordinator | Coordinates system testing | Ensures that the system meets the needs of Geant4 System Testing
Geant4 physics coordinators | Coordinate Geant4 std EM, lowE EM, hadronic WGs | Ensure that the system meets the needs of Geant4 Physics Testing
Geant4 TSB | Is responsible for Geant4 technical matters | Provides guidelines, monitors progress
INFN Computing Committee | National Committee to which some of the developers report; has appointed 4 referees | Recommends funding; reviews the project, monitors progress
Others? Who? LCG? Requirements? Expertise?
Who are the users?
Groups | Responsibilities
Geant4 physics Working Groups | Provide and document requirements, provide feedback on prototypes, perform β-testing on preliminary releases of the product, provide use cases for acceptance testing
Geant4 STT | Provide and document requirements, perform formal acceptance testing for adoption in system testing
Other potential users:
– users of the Geant4 Toolkit, wishing to compare the results of their applications to reference data or to their own experimental results
– other projects with requirements for statistical comparisons of distributions (e.g. the LHC Computing Grid project)
Some use cases
Regression testing
– Throughout the software life-cycle
Online DAQ
– Monitoring detector behaviour w.r.t. a reference
Simulation validation
– Comparison with experimental data
Reconstruction
– Comparison of reconstructed vs. expected distributions
Physics analysis
– Comparisons of experimental distributions (ATLAS vs. CMS Higgs?)
– Comparison with theoretical distributions (data vs. Standard Model)
What do the users want?
User requirements from Geant4 (physics, system testing) elicited, analysed, specified and reviewed with the users
– User Requirements Document
– http://www.ge.infn.it/geant4/analysis/TandA/URD_TandA.html
– Use case model in progress
Specific user requirements related to the core statistical component
– Detail in progress (URD in preparation)
– Input from LCG?
Requirement traceability
– Analysis/design, implementation, test, documentation, results
Are there any constraints?
Geant4 constraint requirements
Based on AIDA
No concrete dependencies on specific AIDA implementations should appear in the code of the system tests
Available on Geant4 supported platforms
The system should not require additional licenses beyond those required for Geant4 development
Other non-functional requirements?
The core statistical component
HBOOK, PAW & Co.
“Based on considerations such as those given above, as well as considerable computational experience, it is generally believed that tests like the Kolmogorov or Smirnov-Cramer-Von-Mises (which is similar but more complicated to calculate) are probably the most powerful for the kinds of phenomena generally of interest to high-energy physicists. […]
The value of PROB returned by HDIFF is calculated such that it will be uniformly distributed between zero and one for compatible histograms, provided the data are not binned. […]
The value of PROB should not be expected to have exactly the correct distribution for binned data.”
HBOOK manual, 1994
CDF Collaboration, Inclusive jet cross section in p pbar collisions at sqrt(s) = 1.8 TeV, Phys. Rev. Lett. 77 (1996) 438
but…
Goodness-of-fit tests
Pearson’s χ² test
Kolmogorov test
Kolmogorov-Smirnov test
Lilliefors test
Cramer-von Mises test
Anderson-Darling test
Kuiper test
…
It is a difficult domain…
Implementing algorithms is easy
But comparing real-life distributions is not easy
Incremental and iterative software process
Collaboration with statistics experts
Patience, humility, time…
System open to extension and evolution
Suggestions welcome!
Pearson’s χ²
Applies to discrete distributions
It can also be useful for continuous distributions, but the data must be grouped into classes
Cannot be applied if the theoretical (expected) frequency in any class is < 5
When this happens, one can try merging contiguous classes until the minimum theoretical frequency is reached
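The class-merging rule can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function names, the left-to-right merging order and the example frequencies are assumptions made here.

```python
def merge_small_classes(observed, expected, min_expected=5.0):
    """Merge contiguous classes, left to right, until every merged class
    has an expected frequency of at least min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # Any leftover at the right edge is folded into the last merged class.
    if acc_exp > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson statistic: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the first and the last two classes are below 5.
observed = [2, 3, 12, 20, 9, 3, 1]
expected = [1.5, 4.0, 11.0, 18.0, 10.5, 3.5, 1.5]
obs_m, exp_m = merge_small_classes(observed, expected)
chi2 = pearson_chi2(obs_m, exp_m)
```

Merging changes the number of classes, so the number of degrees of freedom used to turn the statistic into a probability has to be taken from the merged classes.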
Kolmogorov test
The easiest among non-parametric tests
Verifies the goodness of fit of a sample drawn from a continuous random variable
Based on the computation of the maximum distance between the empirical distribution function F_O and the theoretical distribution function F_T
Test statistic:
$D = \sup_x |F_O(x) - F_T(x)|$
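A minimal sketch of the one-sample statistic for unbinned data, in plain Python. The function names and the example sample are assumptions; the p-value uses the standard asymptotic Kolmogorov series, which is only approximate for small samples.

```python
import math


def kolmogorov_distance(sample, cdf):
    """Maximum distance between the empirical distribution function of
    the sample and the theoretical distribution function cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical function jumps from i/n to (i+1)/n at x,
        # so both sides of the step have to be checked.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic probability of exceeding d under the null hypothesis."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the alternating series converges slowly here
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


sample = [0.1, 0.35, 0.4, 0.62, 0.9]  # hypothetical unbinned data
d = kolmogorov_distance(sample, uniform_cdf)
p = kolmogorov_pvalue(d, len(sample))
```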
Kolmogorov-Smirnov test
The two-sample problem
– mathematically similar to Kolmogorov’s
Instead of comparing an empirical distribution with a theoretical one, find the maximum difference between the empirical distribution functions of the two samples, F_n and G_m:
$D_{mn} = \sup_x |F_n(x) - G_m(x)|$
Can be applied only to continuous random variables
Conover (1971) and Gibbons and Chakraborti (1992) tried to extend it to cases of discrete random variables
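The two-sample distance can be computed by walking through both sorted samples at once. The sketch below is an illustration under assumptions (function name and example values are made up here); it returns only the distance and does not address the discrete case discussed above.

```python
def ks_two_sample_distance(sample_a, sample_b):
    """Maximum distance between the empirical distribution functions
    of two unbinned samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step past every value equal to x in both samples, so that
        # tied values are counted before the two functions are compared.
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted the difference can only shrink.
    return d


d_mn = ks_two_sample_distance([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0])
```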
Lilliefors test
Similar to Kolmogorov test
Based on the null hypothesis that the continuous random variable is normally distributed, N(m, σ²), with m and σ² unknown
Performed by comparing the empirical distribution function of the standardized sample z1, z2, ..., zn with the distribution function Φ(z) of the standardized normal distribution:
$D^* = \sup_z |F_O(z) - \Phi(z)|$
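A sketch of the statistic only, assuming the sample is standardized with its own mean and standard deviation. The function names and data are illustrative. Because the parameters are estimated from the data, the result has to be compared with Lilliefors critical values rather than with the Kolmogorov distribution; that step is not shown.

```python
import math
import statistics


def normal_cdf(z):
    """Distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lilliefors_statistic(sample):
    """Kolmogorov-type distance between the standardized sample and
    the standard normal distribution."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1)
    zs = sorted((x - mean) / sd for x in sample)
    n = len(zs)
    d = 0.0
    for i, z in enumerate(zs):
        phi = normal_cdf(z)
        d = max(d, (i + 1) / n - phi, phi - i / n)
    return d


data = [4.1, 5.3, 4.8, 6.0, 5.1, 4.4, 5.7]  # hypothetical measurements
d_star = lilliefors_statistic(data)
# Standardization makes the statistic insensitive to shift and scale.
d_star_rescaled = lilliefors_statistic([3.0 * x + 7.0 for x in data])
```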
Cramer-von Mises test
Based on the test statistic:
$\omega^2 = \int [F_O(x) - F_T(x)]^2 \, dF_T(x)$
Can be performed both on continuous and discrete variables
Satisfactory for symmetric and right-skewed distributions
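For an unbinned sample and a continuous theoretical distribution, the integral reduces to a finite sum over the sorted data. The sketch below uses the usual computing formula, which yields n times the integral; the function name and the example are assumptions, and the discrete case is not covered.

```python
def cramer_von_mises(sample, cdf):
    """n * omega^2 via the computing formula
    1/(12 n) + sum_i [F_T(x_(i)) - (2 i - 1)/(2 n)]^2,
    where x_(i) are the sorted sample values."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12.0 * n) + sum(
        (cdf(x) - (2 * i - 1) / (2.0 * n)) ** 2
        for i, x in enumerate(xs, start=1)
    )


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Points placed exactly at (2 i - 1)/(2 n) give the smallest possible value.
w2_best = cramer_von_mises([0.125, 0.375, 0.625, 0.875], uniform_cdf)
# The same number of points crowded near zero gives a larger value.
w2_poor = cramer_von_mises([0.01, 0.02, 0.03, 0.04], uniform_cdf)
```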
Anderson-Darling test
Performed on the test statistic:
$A^2 = \int \frac{[F_O(x) - F_T(x)]^2}{F_T(x)\,[1 - F_T(x)]} \, dF_T(x)$
Can be performed both on continuous and discrete variables
Seems to be suitable to any data-set (Aksenov and Savageau - 2002) with any skewness (symmetric distributions, left or right skewed)
Seems to be sensitive to fat tails of distributions
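As with Cramer-von Mises, the integral has a closed computing form for unbinned data and a continuous theoretical distribution. The sketch below returns n times the integral. Names and data are assumptions, and every sample value must satisfy 0 < F_T(x) < 1 for the logarithms to be defined.

```python
import math


def anderson_darling(sample, cdf):
    """n * A^2 via the computing formula
    -n - (1/n) sum_i (2 i - 1) [ln F_T(x_(i)) + ln(1 - F_T(x_(n+1-i)))]."""
    xs = sorted(sample)
    n = len(xs)
    u = [cdf(x) for x in xs]
    s = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


a2_even = anderson_darling([0.125, 0.375, 0.625, 0.875], uniform_cdf)
# Pushing one point far into the tail raises the statistic sharply,
# because the 1 / (F_T (1 - F_T)) weight grows near 0 and 1.
a2_tail = anderson_darling([0.125, 0.375, 0.625, 0.9999], uniform_cdf)
```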
Kuiper test
Based on a quantity that remains invariant for any shift or re-parameterization
Does not work well on tails
$D^* = \max_x [F_O(x) - F_T(x)] + \max_x [F_T(x) - F_O(x)]$
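The Kuiper quantity is the sum of the largest positive and the largest negative deviation between the two distribution functions. A minimal sketch for unbinned data follows; the function name and the example sample are assumptions.

```python
def kuiper_statistic(sample, cdf):
    """Sum of the maximum deviations of the empirical distribution
    function above and below the theoretical one."""
    xs = sorted(sample)
    n = len(xs)
    # Largest excess of the empirical function over the theoretical one,
    # evaluated just after each step.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Largest excess of the theoretical function over the empirical one,
    # evaluated just before each step.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus + d_minus


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


v = kuiper_statistic([0.1, 0.35, 0.4, 0.62, 0.9], uniform_cdf)
```

For the same sample the one-sided deviations are 0.2 and 0.15, so the Kuiper value is larger than the Kolmogorov distance, which keeps only the bigger of the two.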
Work in progress
OOAD
Preliminary design of the statistical component in progress
Core statistics comparison package
User layer
Policy-based class design
http://www.ge.infn.it/geant4/rose/statistics/
Validation of the design through use cases
Some open issues identified, to be addressed in next design iteration
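One way to picture the split between a core comparison package and a user layer is sketched below. It is not the project's design: every class and function name, and the threshold-based decision, is hypothetical. Python has no templates, so the comparison algorithm is injected as an object, in the spirit of a policy.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A comparison policy maps two samples to a distance.
Algorithm = Callable[[Sequence[float], Sequence[float]], float]


def max_edf_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Core package (hypothetical): maximum distance between the
    empirical distribution functions of two samples."""
    points = sorted(set(a) | set(b))
    return max(
        abs(sum(x <= p for x in a) / len(a) - sum(x <= p for x in b) / len(b))
        for p in points
    )


@dataclass
class ComparatorEngine:
    """User layer (hypothetical): hides the choice of algorithm
    behind a single yes/no question."""
    algorithm: Algorithm
    threshold: float

    def compatible(self, a: Sequence[float], b: Sequence[float]) -> bool:
        return self.algorithm(a, b) <= self.threshold


engine = ComparatorEngine(algorithm=max_edf_distance, threshold=0.3)
same = engine.compatible([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = engine.compatible([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
```

Swapping in another algorithm then means passing a different function, with no change to the code that calls `compatible`.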
[Figure: work in progress; + more algorithms]
[Figure: work in progress]
[Figure: work in progress. Use case: compare two continuous distributions]
Work in progress
Implementation and test of preliminary design
What can be re-used?
– Algorithms in GSL, NAG libraries (to be evaluated)
Studies in progress
– Transformation between continuous-discrete distributions
– Strategies to use Kolmogorov-Smirnov with discrete distributions (E. Dagum + original ideas)
– How to deal with experimental errors (not only statistical!)
– Multi-dimensional distributions
– Bayesian approach
In the to-do list
– Conversion from AIDA objects to distributions
– “Pythonisation”
Revision of the initial documents (Vision, URD, Risks)
– Based on the recent evolutions in the project
– Input from today’s meeting?
Work in progress: Geant4-specific
Development of general physics tests in the E.M. domain, for comparison of reference distributions
– Compilation of existing tests
– Evaluation, documentation of tests
– Elicitation of requirements for tests among the Geant4 physics groups
– Collection of reference data/distributions
Prototype for automated comparison w.r.t. reference databases
– NIST, Sandia etc., directly downloaded from the web
– Prototype as a risk mitigation strategy
Integration in the Geant4 system testing framework
Integration in Geant4 physics testing frameworks
Where?
Geant4-specific stuff
– In Geant4
– May be included in public distribution, if of interest to users
Core statistical component
– Developed in an independent CVS repository
– Code, documentation, software process deliverables
Web site
– http://www.ge.infn.it/geant4/analysis/TandA/index.html
Contact persons
– [email protected], [email protected]
Time scale
Aggressive time scale driven by Geant4 needs
– incremental and iterative software process
OOAD + implementation already started
Prototype at CHEP
Advanced functional system: summer 2003
Open to the needs/suggestions of LCG
– compatible with the available resources and Geant4 needs
Conclusions…
Geant4 requires a statistical testing system for physics validation and regression testing
– to provide a high quality product to its user communities
Core statistical component (of potential general interest)
Geant4-specific components
Project compatible with LCG architecture blueprint
– component-based approach, AIDA, Python…
Rigorous software process
– to contribute to the quality of the product
Aggressive time scale dictated by Geant4 needs
Open to scientific collaboration
…Beginning