
Page 1: Mathematical Modeling and Computational Physics 2013


Development of the distributed computing system for the MPD at the NICA collider, analytical estimations

Mathematical Modeling and Computational Physics 2013

Gertsenberger K. V.

Joint Institute for Nuclear Research, Dubna

Page 2: Mathematical Modeling and Computational Physics 2013

NICA scheme

Gertsenberger K.V., MMCP’2013

Page 3: Mathematical Modeling and Computational Physics 2013

Multipurpose Detector (MPD)

The MPDRoot software is developed for event simulation, reconstruction and physics analysis of the heavy-ion collisions registered by the MPD at the NICA collider.


Page 4: Mathematical Modeling and Computational Physics 2013

Prerequisites of the NICA cluster

high interaction rate (up to 6 kHz)

high particle multiplicity: about 1000 charged particles for a central collision at the NICA energy

one event reconstruction currently takes tens of seconds in MPDRoot; 1M events take months

large data stream from the MPD: 100k events ~ 5 TB, 100 000k events ~ 5 PB/year

unified interface for parallel processing and storing of the event data
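The figures above imply roughly 50 MB per event; a quick arithmetic check of the estimates (assuming decimal units, 1 TB = 10^12 bytes):

```python
# Sanity check of the MPD data-volume estimates (decimal units assumed).
EVENT_SIZE_MB = 5e12 / 100_000 / 1e6    # 5 TB per 100k events -> MB per event

yearly_events = 100_000 * 1000          # "100 000k events" per year
yearly_bytes = yearly_events * EVENT_SIZE_MB * 1e6

print(EVENT_SIZE_MB)        # 50.0 MB per event
print(yearly_bytes / 1e15)  # 5.0 PB per year
```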


Page 5: Mathematical Modeling and Computational Physics 2013

Development of the NICA cluster

2 main lines of development:

data storage development for the experiment

organization of parallel processing of the MPD events

development and expansion of the distributed cluster for the MPD experiment based on the LHEP farm

Page 6: Mathematical Modeling and Computational Physics 2013

Current NICA cluster in LHEP for MPD


Page 7: Mathematical Modeling and Computational Physics 2013

Distributed file system GlusterFS

aggregates the existing file systems into a common distributed file system

automatic replication works as a background process

a background self-checking service restores corrupted files in case of hardware or software failure

implemented at the application layer and runs in user space


Page 8: Mathematical Modeling and Computational Physics 2013

Data storage on the NICA cluster


Page 9: Mathematical Modeling and Computational Physics 2013

Development of the distributed computing system

PROOF server – parallel data processing in a ROOT macro on parallel architectures

NICA cluster – concurrent data processing on cluster nodes

MPD-scheduler – scheduling system for task distribution to parallelize data processing on cluster nodes


Page 10: Mathematical Modeling and Computational Physics 2013

Parallel data processing with PROOF

PROOF (Parallel ROOT Facility) – part of the ROOT software, no additional installation required

PROOF uses data-independent parallelism based on the lack of correlation between MPD events – good scalability

Parallelization for three parallel architectures:

1. PROOF-Lite parallelizes data processing on one multiprocessor/multicore machine

2. PROOF parallelizes processing on a heterogeneous computing cluster

3. parallel data processing on the GRID

Transparency: the same program code can execute both sequentially and concurrently


Page 11: Mathematical Modeling and Computational Physics 2013

Using PROOF in MPDRoot

The last parameter of the reconstruction macro: run_type (default: "local").

Speedup on a user multicore machine:

$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof")'

parallel processing of 1000 events with the thread count equal to the logical processor count

$ root 'reco.C("evetest.root", "mpddst.root", 0, 500, "proof:workers=3")'

parallel processing of 500 events with 3 concurrent threads

Speedup on the NICA cluster:

$ root 'reco.C("evetest.root", "mpddst.root", 0, 1000, "proof:[email protected]:21001")'

parallel processing of 1000 events on all cluster nodes of the PoD farm

$ root 'reco.C("eve", "mpddst", 0, 500, "proof:[email protected]:21001:workers=10")'

parallel processing of 500 events on the PoD cluster with 10 workers


Page 12: Mathematical Modeling and Computational Physics 2013

Speedup of the reconstruction on a 4-core machine


Page 13: Mathematical Modeling and Computational Physics 2013

PROOF on the NICA cluster


[Diagram: Proof On Demand cluster – one PROOF master server and PROOF slave nodes with 8, 8, 16, 16, 24, 24 and 32 logical processors, reading *.root files from GlusterFS. The call

$ root 'reco.C("evetest.root", "mpddst.root", 0, 3, "proof:[email protected]:21001")'

distributes events №0, №1 and №2 of evetest.root over the nodes and merges the results into mpddst.root.]

Page 14: Mathematical Modeling and Computational Physics 2013

Speedup of the reconstruction on the NICA cluster


Page 15: Mathematical Modeling and Computational Physics 2013

MPD-scheduler

Developed in C++ with support of ROOT classes.

Uses the Sun Grid Engine scheduling system (qsub command) for execution in cluster mode.

SGE combines the cluster machines of the LHEP farm into a pool of worker nodes with 78 logical processors.

A job for distributed execution on the NICA cluster is described in an XML file and passed to MPD-scheduler:

$ mpd-scheduler my_job.xml


Page 16: Mathematical Modeling and Computational Physics 2013

Job description


<job>

<macro name="$VMCWORKDIR/macro/mpd/reco.C" start_event="0" count_event="1000" add_args="local"/>

<file input="$VMCWORKDIR/macro/mpd/evetest1.root" output="$VMCWORKDIR/macro/mpd/mpddst1.root"/>

<file input="$VMCWORKDIR/macro/mpd/evetest2.root" output="$VMCWORKDIR/macro/mpd/mpddst2.root"/>

<file db_input="mpd.jinr.ru*,energy=3,gen=urqmd" output="~/mpdroot/macro/mpd/evetest_${counter}.root"/>

<run mode="local" count="5" config="~/build/config.sh" logs="processing.log"/>

</job>

The description starts and ends with the tag <job>.

The tag <macro> sets information about the macro executed by MPDRoot

The tag <file> defines the files to be processed by the macro above

The tag <run> describes the run parameters and allocated resources

* mpd.jinr.ru – server name with the production database
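A minimal sketch of reading such a job description with Python's xml.etree, assuming only the tag layout shown above (file names here are illustrative, not the real cluster paths):

```python
import xml.etree.ElementTree as ET

# Hypothetical job description following the layout above.
job_xml = """
<job>
  <macro name="reco.C" start_event="0" count_event="1000" add_args="local"/>
  <file input="evetest1.root" output="mpddst1.root"/>
  <file input="evetest2.root" output="mpddst2.root"/>
  <run mode="local" count="5" logs="processing.log"/>
</job>
"""

job = ET.fromstring(job_xml)
macro = job.find("macro")    # the macro to execute
files = job.findall("file")  # input/output file pairs
run = job.find("run")        # run parameters

print(macro.get("name"))     # reco.C
print(len(files))            # 2
print(run.get("mode"))       # local
```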


Page 17: Mathematical Modeling and Computational Physics 2013

Job execution on the NICA cluster

job_reco.xml:

<job>
<macro name="~/mpdroot/macro/mpd/reco.C"/>
<file input="$VMCWORKDIR/evetest1.root" output="$VMCWORKDIR/mpddst1.root"/>
<file input="$VMCWORKDIR/evetest2.root" output="$VMCWORKDIR/mpddst2.root"/>
<file input="$VMCWORKDIR/evetest3.root" output="$VMCWORKDIR/mpddst3.root"/>
<run mode="global" count="3" config="~/mpdroot/build/config.sh"/>
</job>

job_command.xml:

<job>
<command line="get_mpd_production energy=5-9"/>
<run mode="global" config="~/mpdroot/build/config.sh"/>
</job>

[Diagram: MPD-scheduler submits the jobs via qsub to the SGE batch system – a Sun Grid Engine server and worker nodes with 8, 8, 16, 16, 24, 24 and 32 logical processors. The input files evetest[1-3].root are processed on free nodes, and the results mpddst[1-3].root are written to GlusterFS.]

Page 18: Mathematical Modeling and Computational Physics 2013

Speedup of one reconstruction on the NICA cluster


Page 19: Mathematical Modeling and Computational Physics 2013

NICA cluster section on mpd.jinr.ru


Page 20: Mathematical Modeling and Computational Physics 2013

Conclusions

The distributed NICA cluster based on the LHEP farm was deployed for the NICA/MPD experiment (FairSoft, ROOT/PROOF, MPDRoot, GlusterFS, Torque, Maui): 128 cores.

The data storage was organized with the distributed file system GlusterFS: /nica/mpd[1-8], 10 TB.

A PROOF On Demand cluster was implemented to parallelize event data processing for the MPD experiment; PROOF support was added to the reconstruction macro.

The MPD-scheduler system for distributed job execution was developed to run MPDRoot macros concurrently on the cluster.

The web site mpd.jinr.ru (section Computing – NICA cluster) presents the manuals for the systems described above.



Page 22: Mathematical Modeling and Computational Physics 2013

Analytical model for parallel processing on cluster

S_p(n) = (B_D · P_node · (2·n/B_D + T_1)) / (n · (P_node + 1) + B_D · T_1)

speedup for a point (data-independent) algorithm of image processing

P_node – count of logical processors, n – data to process (bytes), B_D – speed of the data access (MB/s), T_1 – “pure” time of the sequential processing (s)
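A small numerical sketch of this speedup model (the values below are illustrative, with n and B_D taken in consistent units of MB and MB/s):

```python
def speedup(n, p_node, b_d, t1):
    """Analytical speedup S_p(n) for a point (data-independent) algorithm:
    sequential read + write of n at speed b_d plus pure processing time t1,
    versus processing on p_node logical processors sharing the data channel."""
    return (b_d * p_node * (2 * n / b_d + t1)) / (n * (p_node + 1) + b_d * t1)

# Illustrative values: 1000 MB of data, 100 MB/s access speed, 60 s pure processing.
n, b_d, t1 = 1000.0, 100.0, 60.0
print(speedup(n, 1, b_d, t1))   # 1.0 – one processor gives no speedup
print(speedup(n, 8, b_d, t1))   # grows with p_node, bounded by data access speed
```

Note that the speedup saturates: as p_node grows, S_p approaches (2·n + B_D·T_1) / n, since the data channel remains sequential.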


Page 23: Mathematical Modeling and Computational Physics 2013

Prediction of the NICA computing power

How many logical processors are required to process N_TASK physical analysis tasks and one reconstruction in parallel within T_day days?

P_node = (n + B_D · T_1) / (B_D · T_par − n)

P_node(N_TASK) = (n_1 · (N_TASK + 1) · N_EVENT + B_D · (T_PA · N_TASK + T_REC) · N_EVENT) / (B_D · (T_day · 24 · 3600) − n_1 · (N_TASK + 1) · N_EVENT)

If n_1 = 2 MB, N_EVENT = 10 000 000 events, T_PA = 5 s/event, T_REC = 10 s/event, B_D = 100 MB/s, T_day = 30 days
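Evaluating the prediction for the quoted parameters is straightforward; a sketch, assuming n_1 in MB and B_D in MB/s so the units cancel:

```python
import math

def processors_required(n_task, n1=2.0, n_event=10_000_000,
                        t_pa=5.0, t_rec=10.0, b_d=100.0, t_day=30.0):
    """P_node(N_TASK): logical processors needed to run n_task analysis
    tasks plus one reconstruction over n_event events within t_day days."""
    data = n1 * (n_task + 1) * n_event              # total data volume, MB
    work = b_d * (t_pa * n_task + t_rec) * n_event  # processing term
    t_par = t_day * 24 * 3600                       # allowed wall time, s
    return (data + work) / (b_d * t_par - data)

print(math.ceil(processors_required(1)))  # 69 processors for one analysis task
```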
