Page 1

Problems and strategies related to the off-line computing of VIRGO
(Problemi e strategie relativi al calcolo off-line di VIRGO)

Laura Brocco
Università di Roma “La Sapienza” & INFN Roma1
for the VIRGO collaboration

Page 2

Outline

Part I
• Data Production
• Data Transfer and Storage

Part II
• Search for gravitational wave pulses and quasi-periodic signals
• Search for periodic signals

• Conclusions

Page 3

I - Data Production and Storage

Page 4

Status of Virgo

• CITF commissioning ended in September 2002
  5 Engineering Runs (three days long) done
• ITF commissioning started in September 2003 (ends in September 2004)
  4 Engineering Runs done until now
• Full Virgo locked before the end of 2004

Page 5

Virgo Data Production

5 different data streams produced:

• Raw data
  Time series containing information from the different sub-systems, recorded in 1 sec long frames. Each file is made of 300 frames (1.8 GByte size). The data flow is 6 MByte/sec.

• Processed data
  h-recon, quality channels. Stored in frames 1 sec long. Expected data flow 0.6 MByte/sec.

• Trend data
  Slowly acquired information, global information, fast quantities. This information is stored in frames 1 hour long. The expected data flow is about 10 kByte/sec.

• 50 Hz data
  Fast channels down-sampled @ 50 Hz for long term studies. Data flow 140 kByte/sec.

• Network analysis data
  Data made available to external collaborations (e.g. LIGO). These data contain environmental data, h-recon, etc. Expected data flow ~ 1 MByte/sec (depending on the agreement on data exchange among the different experiments).
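The quoted figures can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GByte = 1000 MByte) and continuous data taking; the dictionary and function names are illustrative only and do not belong to any Virgo software.

```python
# Data flows quoted on the slide, in MByte/sec (10 kByte/sec = 0.01 MByte/sec).
DATA_FLOW_MBYTE_PER_SEC = {
    "raw": 6.0,
    "processed": 0.6,
    "trend": 0.01,
    "50Hz": 0.14,
    "network": 1.0,  # approximate: depends on the data-exchange agreements
}

FRAMES_PER_RAW_FILE = 300
FRAME_LENGTH_SEC = 1
SECONDS_PER_DAY = 86_400


def raw_file_size_gbyte():
    """Size of one raw-data file: 300 frames of 1 sec at 6 MByte/sec."""
    mbyte = FRAMES_PER_RAW_FILE * FRAME_LENGTH_SEC * DATA_FLOW_MBYTE_PER_SEC["raw"]
    return mbyte / 1000.0


def daily_volume_gbyte(stream):
    """Volume written in 24 hours of continuous data taking, in GByte."""
    return DATA_FLOW_MBYTE_PER_SEC[stream] * SECONDS_PER_DAY / 1000.0


if __name__ == "__main__":
    print(f"raw file size: {raw_file_size_gbyte():.1f} GByte")
    for name in DATA_FLOW_MBYTE_PER_SEC:
        print(f"{name:>10}: {daily_volume_gbyte(name):8.2f} GByte/day")
```

Under these assumptions the file size comes out at the 1.8 GByte stated on the slide, and the raw stream alone amounts to roughly 0.5 TByte per day.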

Page 6

Data Transfer & Storage
I - Present Situation

[Diagram: Cascina (VIRGO), CNAF and LYON, connected by bbftp transfers]

CASCINA: 70 TByte storage (as data buffer for daily activities) + LTO tapes

CNAF: nas1, nas2 & nas3
• 9.96 TByte full with ER data (from E0 to C3)
• Asked up to 20 TByte for 2004
• Transfer performed by virgo-gateway machine (Dell bi-processor @ 1 GHz)
• Data flow 3 MByte/sec

LYON: Data stored with HPSS (from E0 to C3)
• Data flow @ 6.4 MByte/sec
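To put these numbers in perspective, the following sketch converts a volume and a constant rate into days. It assumes decimal units (1 TByte = 10^6 MByte) and a sustained rate with no interruptions; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400
MBYTE_PER_TBYTE = 1_000_000  # assumption: decimal units


def days_to_move(volume_tbyte, rate_mbyte_per_sec):
    """Days needed to write or transfer a volume at a constant rate."""
    seconds = volume_tbyte * MBYTE_PER_TBYTE / rate_mbyte_per_sec
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Cascina buffer filled by the raw stream (6 MByte/sec).
    print(f"70 TByte at 6 MByte/sec:     {days_to_move(70, 6.0):.0f} days")
    # Engineering-run data set, moved at the two quoted transfer rates.
    print(f"9.96 TByte at 3 MByte/sec:   {days_to_move(9.96, 3.0):.1f} days")
    print(f"9.96 TByte at 6.4 MByte/sec: {days_to_move(9.96, 6.4):.1f} days")
```

With these assumptions the 70 TByte buffer corresponds to about 135 days of raw data, and the 3 MByte/sec transfer rate is half of the 6 MByte/sec raw-data flow.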

Page 7

Data Transfer & Storage
II - Future Plans

[Diagram: planned data flow from On-Line to Cascina Storage, via bbftp to Bologna-CNAF (Bologna Storage), and via bbftp to Lyon. Labelled components: SRM, MySQL archive, bbftp Server, Temp. Buffer, SRM Client C1, SRM Client C2, SRM Client C3. The SRM Client C3 boxes are marked "To BKDB @ Lyon".]

Page 8

Book-Keeping Data-Base

Oracle Data-Base. Generated by SRM Client C3 in Cascina, and hosted in Lyon. Replicated both in Bologna and Cascina.

Cascina Info.    | Bologna Info.    | Lyon Info.       | File Information
Directory        | Directory        | Directory        | Name, Size
1 yes            | 1 yes            | 1 yes            | GPS time
0 no/deleted     | 0 no/deleted     | 0 no/deleted     | DAQ information
2 in transfer    | 2 in transfer    | 2 in transfer    | Event information
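A minimal sketch of how such a record could be represented is given below. The slide specifies an Oracle data-base; here Python's built-in sqlite3 module is used as a stand-in so that the example runs anywhere. The table layout, column names and function names are hypothetical, and the DAQ and event information are left out for brevity. Only the three status codes (1, 0, 2) come from the slide.

```python
import sqlite3
from enum import IntEnum


class SiteStatus(IntEnum):
    """Per-site flag as listed on the slide."""
    NO_OR_DELETED = 0
    YES = 1
    IN_TRANSFER = 2


SITES = ("cascina", "bologna", "lyon")

# Hypothetical schema: one row per file, one directory/status pair per site.
SCHEMA = """
CREATE TABLE files (
    name           TEXT PRIMARY KEY,
    size_bytes     INTEGER,
    gps_time       INTEGER,
    cascina_dir    TEXT, cascina_status INTEGER,
    bologna_dir    TEXT, bologna_status INTEGER,
    lyon_dir       TEXT, lyon_status    INTEGER
)
"""


def open_bookkeeping():
    """Create an in-memory stand-in for the book-keeping data-base."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def register_file(db, name, size_bytes, gps_time, cascina_dir):
    """Record a new file that so far exists only in Cascina."""
    db.execute(
        "INSERT INTO files VALUES (?, ?, ?, ?, ?, NULL, ?, NULL, ?)",
        (name, size_bytes, gps_time, cascina_dir, int(SiteStatus.YES),
         int(SiteStatus.NO_OR_DELETED), int(SiteStatus.NO_OR_DELETED)),
    )


def set_status(db, name, site, status, directory=None):
    """Update the flag (and optionally the directory) of one site."""
    if site not in SITES:
        raise ValueError(f"unknown site: {site}")
    db.execute(
        f"UPDATE files SET {site}_status = ?, "
        f"{site}_dir = COALESCE(?, {site}_dir) WHERE name = ?",
        (int(status), directory, name),
    )


def files_with_status(db, site, status):
    """Names of all files that carry the given flag at the given site."""
    if site not in SITES:
        raise ValueError(f"unknown site: {site}")
    rows = db.execute(
        f"SELECT name FROM files WHERE {site}_status = ? ORDER BY name",
        (int(status),),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = open_bookkeeping()
    # File name, GPS time and directory below are made-up example values.
    register_file(db, "example-file", 1_800_000_000, 700_000_000, "/hypothetical/raw")
    set_status(db, "example-file", "bologna", SiteStatus.IN_TRANSFER)
    print(files_with_status(db, "bologna", SiteStatus.IN_TRANSFER))
```

Keeping one status flag per site makes questions such as "which files are still in transfer to Bologna?" a single query.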

Page 9

II - Off-line Analysis Procedures

Page 10

Data analysis Requirements
I - Search for bursts & coalescing binary gravitational signals

Bursts:
Short signals (4 to 100 ms) of unknown shape, frequencies between 50 Hz and 6000 Hz, and amplitude 10^-25 ≤ h ≤ 10^-20.

Specific burst-oriented software developed:

• Burst Library (BuL): C++ library containing several packages dedicated to the search for burst gravitational waves.
  BuL is developed on DEC/OSF1 V5.2, Linux/RH 6.1 and Linux/RH 7.2, and all the packages are managed and built using CMT.

• SNAG (Signal and Noise for Gravitational Antennas): MatLab toolbox containing filters to perform burst searches both in frequency and time domain.
  SNAG is developed on Windows & Linux (to be completed).

Page 11

Data analysis Requirements
I - Search for bursts & coalescing binary gravitational signals

Preprocessing for Bursts analysis

• Whitening: Library dedicated to data whitening. There is a C version (LIB_Whitening, the original) and a C++ version (Whitening, interfaced with BuL).
• Ana Batch: C++ framework which provides some facilities to extract data from Virgo data files (in Frame format).
• NAP (Noise Analysis Package): C & C++ library containing all the packages dedicated to noise studies and simulations (in development).

Typical duration of jobs:

• 1 hour CPU-time for 1 hour of data samples (on a Xeon bi-processor @ 1.7 GHz with 1.5 GByte RAM)
• From 1/2 to 1 hour CPU-time for 1/2 hour of data samples on MatLab (Windows), depending on the number of templates and of threshold values
• Some algorithms need a cluster of machines (matched filtering with 1000 templates)

Page 12

Data analysis Requirements
I - Search for bursts & coalescing binary gravitational signals

Coalescing Binary Systems:

[Figure: chirp signal]

• Compact stars (NS/NS, NS/BH, BH/BH)

• The exact shape of the signal is accurately predictable, but depends on the masses of the two stars, on their spin rates and on several relativistic effects

Page 13

Data analysis Requirements
I - Search for bursts & coalescing binary gravitational signals

Coalescing Binary Systems:

Matched filtering techniques have been developed, with thousands of banks of filters (templates average size 4 MByte)

1. Single frequency band analysis (Flat Search), running with the Merlino framework (written in ANSI C, communication based on MPI on a Beowulf cluster)

2. Two frequency band analysis (Multi-Band Template Analysis), with the same template grids for all frequency bands

3. Dynamic Matched Filter Techniques (Price Algorithm)

4. Hierarchical strategies using ALE (Adaptive Line Enhancer) filters

High computing power is needed (~ 300 Gflops for in-time analysis, 3 times more for off-line analysis), together with a distributed framework for parallel computation
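The idea of matched filtering against a bank of templates can be illustrated with a toy example. The sketch below is not the Flat Search or the Merlino implementation: it assumes white noise, correlates in the time domain with plain Python loops, and uses a simple linear chirp rather than a physical inspiral waveform. All function names and parameter values are hypothetical.

```python
import math
import random


def make_template(f0, f1, n_samples, sample_rate):
    """Toy chirp whose frequency rises linearly from f0 to f1 (unit norm)."""
    duration = n_samples / sample_rate
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        phase = 2.0 * math.pi * (f0 * t + 0.5 * (f1 - f0) * t * t / duration)
        samples.append(math.sin(phase))
    norm = math.sqrt(sum(s * s for s in samples))
    return [s / norm for s in samples]


def matched_filter(data, template):
    """Correlate the template with the data at every offset.

    Returns (best_offset, best_value) for the largest correlation.
    """
    n = len(template)
    best_offset, best_value = 0, float("-inf")
    for offset in range(len(data) - n + 1):
        value = sum(data[offset + i] * template[i] for i in range(n))
        if value > best_value:
            best_offset, best_value = offset, value
    return best_offset, best_value


def search_bank(data, bank):
    """Filter with every template; return (index, offset, value) of the loudest."""
    best = None
    for index, template in enumerate(bank):
        offset, value = matched_filter(data, template)
        if best is None or value > best[2]:
            best = (index, offset, value)
    return best


def demo(seed=0):
    """Inject template 2 at sample 300 into white noise and search for it."""
    rng = random.Random(seed)
    sample_rate = 1000.0
    bank = [make_template(50.0, f1, 200, sample_rate)
            for f1 in (100.0, 150.0, 200.0, 250.0)]
    data = [rng.gauss(0.0, 0.05) for _ in range(1000)]
    for i, s in enumerate(bank[2]):
        data[300 + i] += 5.0 * s
    return search_bank(data, bank)


if __name__ == "__main__":
    index, offset, value = demo()
    print(f"loudest template: {index}, offset: {offset}, output: {value:.2f}")
```

Each template is filtered independently of the others, which is why a template bank lends itself to being split across the nodes of a cluster.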

Page 14

Data analysis Requirements
I - Search for bursts & coalescing binary gravitational signals

Scheme for bursts and coalescing binary detection
To be implemented @ Bologna

[Diagram of the processing chain, with boxes labelled: Raw data, Storage, h reconstruction (2 signals @ 20 kHz), Lines removal, Whitening, Decimation/Re-sampling, Data Storage, Bursts Filters, C.B. Filters, Ev. selected (after each filter branch), Storage]

Page 15

Data analysis Requirements
II - Search for periodic gravitational signals

Periodic gravitational signals are emitted, e.g., by asymmetric rotating neutron stars.

The amplitude of the signals is very low, so long integration times (~ months) are needed.

A hierarchical strategy has been developed, based on the alternation of “coherent” and “incoherent” steps.

Large computing resources are needed for the analysis: Tflops range.

However, the larger the computing power (CP) we can access, the wider the portion of source parameter space we can explore.

Low granularity: the analysis method is well suited to a distributed computing environment.

Two main computing centers, Bologna and Lyon, plus Napoli and Roma

Page 16

[Diagram: preliminary analysis, input files, GRID, candidates, C.C. Storage]

• input files: typical dimensions ~1.2 MB for 6 months of data. Replicated among SEs.

• GRID: ~10^5 jobs sent in 3 months (incoherent steps). Typical job duration ~5-10 hours on a 2.4 GHz Xeon processor, depending on the source frequency.

• candidates: copied back to a local machine for further steps of the analysis. Typical output file dimensions ~200 kB, ~2·10^4 candidates.

• Performed locally (coherent steps).
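The size of such a campaign follows from simple arithmetic on the quoted figures. The sketch below assumes that "3 months" means 90 days, that the ~2·10^4 candidates refer to one output file, and that decimal units apply; the constant and function names are illustrative.

```python
N_JOBS = 100_000             # ~10^5 jobs (incoherent steps)
CAMPAIGN_DAYS = 90           # assumption: "3 months" taken as 90 days
OUTPUT_KBYTE_PER_JOB = 200   # typical output file size
CANDIDATES_PER_JOB = 20_000  # assumption: ~2*10^4 candidates per output file


def cpu_hours(job_hours):
    """Total CPU time of the campaign for a given typical job duration."""
    return N_JOBS * job_hours


def processors_needed(job_hours):
    """Average number of processors kept busy for the whole campaign."""
    return cpu_hours(job_hours) / (CAMPAIGN_DAYS * 24)


def total_output_gbyte():
    """Total size of the candidate files copied back, in GByte."""
    return N_JOBS * OUTPUT_KBYTE_PER_JOB / 1_000_000


def total_candidates():
    """Total number of candidates produced by all jobs."""
    return N_JOBS * CANDIDATES_PER_JOB


if __name__ == "__main__":
    for hours in (5.0, 10.0):
        print(f"{hours:4.0f} h/job: {cpu_hours(hours):.0f} CPU-hours, "
              f"{processors_needed(hours):.0f} processors on average")
    print(f"output volume: {total_output_gbyte():.0f} GByte")
    print(f"candidates:    {total_candidates():.1e}")
```

Under these assumptions the campaign needs between roughly 230 and 460 processors running continuously, while the output to be copied back stays modest at about 20 GByte.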

Page 17

We are carrying out test activities on the data analysis software in two computing environments: local batch systems (PBS) and grid (INFN-Grid).

Main activities so far:

• Adaptation of the data analysis procedures to work in a distributed environment;

• Tests of the “incoherent” part of the analysis pipeline (several software versions) using simulated data (thousands of jobs submitted). Machines used:
  • Roma, Bologna, Napoli (about 30 machines) within INFN-Grid
  • Lyon (25 processors) as a classic batch system

• Full-scale test of the “coherent” part of the analysis (28 processors for ~3 months, 24 hours/day; farms in Bologna and Roma).

Results:

• very good scaling of performance with the number of nodes involved (but only small-scale tests done up to now);

• grid software more and more stable and reliable.

Page 18

Conclusions

The Virgo experiment will complete the commissioning in 2004.

Data Production:
5 kinds of data will be produced, with data flow from 10 kByte/sec (trend data) up to 6 MByte/sec (raw data). Typical raw-data file size 1.8 GByte.

Storage:
2 permanent storage sites, Bologna-CNAF and Lyon, + Cascina. Automatic processes to transfer data from Cascina to Bologna and from Bologna to Lyon are in development.

Data Analysis:
Several filters have been developed to search for gravitational waves; all the filtering techniques need high computing power and parallel computation.
4 M.D.C. (productions) performed until now, next foreseen in June.
GRID tests have been performed using the Roma, Bologna and Napoli farms. Larger-scale tests will be performed in the next months.

The analysis of scientific data will start in 2005.

Page 22

Merlino Framework

• Distributed framework for parallel data analysis
• Is composed of 4 main processes
• Written in ANSI C code, communication based on MPI and running on a Beowulf cluster
• “Plug-in” functions customization (dynamic library)
• Data flow customization
• Plug-ins currently used, tested or under development:
  – Matched Filter
  – Inspiral generator
  – Mean Filter
  – PC
  – Damped Sine Filter

By Leone B. Bosi
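The combination of plug-in functions and a customizable data flow can be pictured with a small sketch. Merlino itself is written in ANSI C on top of MPI and loads plug-ins from dynamic libraries; the single-process Python example below only illustrates the concept. Every name in it is hypothetical and is not part of Merlino's interface, and the "threshold" plug-in is an invented example that does not appear in the list above.

```python
# Registry that maps a plug-in name to the function implementing it.
PLUGINS = {}


def plugin(name):
    """Decorator that registers a processing function under a name."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("mean_filter")
def mean_filter(samples, width=3):
    """Moving average over `width` consecutive samples."""
    return [
        sum(samples[i:i + width]) / width
        for i in range(len(samples) - width + 1)
    ]


@plugin("threshold")
def threshold(samples, level=1.0):
    """Indices of the samples whose absolute value exceeds `level`."""
    return [i for i, s in enumerate(samples) if abs(s) > level]


def run_flow(data, flow):
    """Apply the plug-ins named in `flow` one after the other.

    `flow` is a list of (plugin_name, keyword_arguments) pairs, so the
    data flow can be changed without touching the plug-in code.
    """
    for name, kwargs in flow:
        data = PLUGINS[name](data, **kwargs)
    return data


if __name__ == "__main__":
    flow = [("mean_filter", {"width": 3}), ("threshold", {"level": 2.5})]
    print(run_flow([0, 0, 3, 3, 3, 0, 0], flow))
```

Describing the flow as data rather than code is one way to let users reorder or swap processing steps without rebuilding the framework.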

Page 23

Next steps (in 2004)

• integration and validation of the whole analysis software;

• larger scale grid tests (up to ~100 processors and more involved);

Page 24

[Diagram: Virgo grid layout, "Scenario 1", spanning INFN-GRID and Lyon. Sites shown: Virgo-I Cnaf, Virgo-I Roma, Virgo-I Napoli, Virgo-F Lione (Lyon) and Virgo-F …. The label group CE, SE, GIIS, GRIS appears five times. Other labels: MDS Virgo-I, MDS Virgo-F, Virgo BDII, RB, RLS @ Cnaf.]

by Antonia Ghiselli