AMS TIM, CERN Jul 23, 2004
AMS Computing and Ground Centers
Alexei Klimentov — Alexei.Klimentov@cern.ch
Slide 2
AMS Computing and Ground Data Centers
AMS-02 Ground Centers
 – AMS centers at JSC
 – Ground data transfer
 – Science Operation Center prototype
   » Hardware and Software evaluation
   » Implementation plan
AMS/CERN computing and manpower issues
MC Production Status
 – AMS-02 MC (2004A)
 – Open questions:
   » plans for Y2005
   » AMS-01 MC
Slide 3
AMS-02 Ground Support Systems
Payload Operations Control Center (POCC) at CERN (first 2-3 months in Houston), CERN Bldg. 892 wing A
 – "control room", usual source of commands
 – receives Health & Status (H&S), monitoring and science data in real time
 – receives NASA video
 – voice communication with NASA flight operations
Backup Control Station at JSC (TBD); Monitor Station at MIT
 – "backup" of the "control room"
 – receives Health & Status (H&S) and monitoring data in real time
 – voice communication with NASA flight operations
Science Operations Center (SOC) at CERN (first 2-3 months in Houston), CERN Bldg. 892 wing A
 – receives the complete copy of ALL data
 – data processing and science analysis
 – data archiving and distribution to Universities and Laboratories
Ground Support Computers (GSC) at Marshall Space Flight Center
 – receives data from NASA -> buffer -> retransmit to the Science Center (a store-and-forward sketch follows below)
Regional Centers (Madrid, MIT, Yale, Bologna, Milan, Aachen, Karlsruhe, Lyon, Taipei, Nanjing, Shanghai, …)
 – analysis facilities to support geographically close Universities
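The GSC role described above is essentially store-and-forward: data arriving from NASA are buffered locally and then retransmitted to the Science Center. The minimal sketch below illustrates that pattern only; the directory names, host name and the use of scp are placeholders, not the actual GSC software or transfer tool.

```python
"""Store-and-forward sketch of the GSC role (illustrative only):
frames delivered by NASA are buffered on local disk, then retransmitted."""

import pathlib
import shutil
import subprocess

INBOX = pathlib.Path("/gsc/from_nasa")   # where NASA-delivered frames land (placeholder)
BUFFER = pathlib.Path("/gsc/buffer")     # local buffer that survives link outages (placeholder)
SOC_HOST = "soc.cern.ch"                 # placeholder Science Center host

def buffer_and_forward():
    BUFFER.mkdir(parents=True, exist_ok=True)
    for frame in sorted(INBOX.glob("*.dat")):
        staged = BUFFER / frame.name
        shutil.move(str(frame), staged)                        # 1. buffer locally first
        done = subprocess.run(                                 # 2. retransmit to the SOC
            ["scp", str(staged), f"{SOC_HOST}:/soc/incoming/"])
        if done.returncode == 0:
            staged.unlink()                                    # drop from buffer once delivered

if __name__ == "__main__":
    buffer_and_forward()
```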
Slide 4
AMS facilities
NASA facilities
Slide 5
Slide 6
AMS Ground Centers at JSC
 – Requirements to AMS Ground Systems at JSC
 – Define AMS GS HW and SW components
 – Computing facilities
   » "ACOP" flight
   » AMS pre-flight
   » AMS flight
   » "after 3 months"
 – Data storage
 – Data transmission
Discussed with NASA in Feb 2004
http://ams.cern.ch/Computing/pocc_JSC.pdf
Slide 7
AMS-02 Computing facilities at JSC
Center | Location | Function(s) | Computers | Qty

POCC | Bldg.30, Rm 212 | Commanding, Telemetry Monitoring, On-line processing
 – Pentium MS Win: 4
 – Pentium Linux: 28
 – 19" monitors: 19
 – networking switches: 8
 – terminal console: 2
 – MCC WS: 2

SOC | Bldg.30, Rm 3301 | Data Processing, Data Analysis, Data/Web/News Servers, Data Archiving
 – Pentium Linux: 35
 – IBM LTO tape drives: 2
 – networking switches: 10
 – 17" color monitors: 5
 – terminal console: 2

"Terminal room" | tbd
 – notebooks, desktops: 100

AMS CSR | Bldg.30M, Rm 236 | Monitoring
 – Pentium Linux: 2
 – 19" color monitor: 2
 – MCC WS: 1
Slide 8
AMS Computing at JSC (TBD)

Year | Responsible | Actions

LR-8 months | N.Bornas, P.Dennett, A.Klimentov, A.Lebedev, B.Robichaux, G.Carosi |
 – Set up at JSC the "basic version" of the POCC
 – Conduct tests with ACOP for commanding and data transmission

LR-6 months | P.Dennett, A.Eline, P.Fisher, A.Klimentov, A.Lebedev, "Finns" (?) |
 – Set up the POCC "basic version" at CERN
 – Set up the "AMS monitoring station" at MIT
 – Conduct tests of ACOP/MSFC/JSC commanding and data transmission

LR | A.Klimentov, B.Robichaux |
 – Set up the POCC "flight configuration" at JSC

LR to L-2 weeks | V.Choutko, A.Eline, A.Klimentov, B.Robichaux, A.Lebedev, P.Dennett |
 – Set up the SOC "flight configuration" at JSC
 – Set up the "terminal room" and AMS CSR
 – Commanding and data transmission verification

L+2 months (tbd) | A.Klimentov |
 – Set up the POCC "flight configuration" at CERN
 – Move part of the SOC computers from JSC to CERN
 – Set up the SOC "flight configuration" at CERN

L+3 months (tbd) | A.Klimentov, A.Lebedev, A.Eline, V.Choutko |
 – Activate the AMS POCC at CERN
 – Move all SOC equipment to CERN
 – Set up the AMS POCC "basic version" at JSC

LR – launch ready date: Sep 2007; L – AMS-02 launch date
Slide 9
Data Transmission
 – Will AMS need a dedicated line to send data from MSFC to the ground centers, or can the public Internet be used?
 – What software (SW) must be used for bulk data transfer, and how reliable is it?
 – What data transfer performance can be achieved?
G.Carosi, A.Eline, P.Fisher, A.Klimentov
High-rate data transfer between MSFC (AL) and the POCC/SOC, between the POCC and the SOC, and between the SOC and the Regional Centers will become of paramount importance.
Slide 10
Global Network Topology
Slide 11
Slide 12
Slide 13
A.Elin, A.Klimentov, K.Scholberg and J.Gong
‘amsbbftp’ tests CERN/MIT & CERN/SEU Jan/Feb 2003
Slide 14
Data Transmission Tests (conclusions)
 – In its current configuration the Internet provides sufficient bandwidth to transmit AMS data from MSFC (AL) to the AMS ground centers at rates approaching 9.5 Mbit/s.
 – We are able to transfer and store data on a high-end PC reliably, with no data loss.
 – Data transmission performance is comparable to what is achieved with network monitoring tools.
 – We can transmit data simultaneously to multiple sites (a transfer sketch is given below).
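As an illustration of the simultaneous multi-site transfers mentioned above, the sketch below fans one data file out to several centers in parallel and reports the achieved rate in Mbit/s. It is a minimal sketch only: the `amsbbftp` command line, the host names and the file path are placeholders, not the actual options or sites used in the 2003 tests.

```python
"""Fan one data file out to several ground centers in parallel and report
per-site throughput.  Illustrative only: command line, hosts and paths are
hypothetical placeholders."""

import os
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

SITES = ["soc.cern.ch", "monitor.mit.edu", "rc.seu.edu.cn"]   # placeholder hosts
DATA_FILE = "/data/ams/frames/run0001.dat"                    # placeholder path

def send(site):
    """Run one bulk transfer and return (site, Mbit/s)."""
    size_bits = os.path.getsize(DATA_FILE) * 8
    t0 = time.time()
    # Hypothetical command line; substitute the real amsbbftp/bbftp invocation.
    subprocess.run(["amsbbftp", DATA_FILE, f"{site}:/ams/incoming/"], check=True)
    return site, size_bits / (time.time() - t0) / 1e6

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
        for site, mbps in pool.map(send, SITES):
            print(f"{site}: {mbps:.1f} Mbit/s")
```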
Slide 15
Data and Computation for Physics Analysis
[Data-flow diagram] detector -> raw data -> event reconstruction / event filter (selection & reconstruction) -> processed data (event summary data, ESD/DST) plus event-tag data -> batch physics analysis -> analysis objects (extracted by physics topic) -> interactive physics analysis; event simulation feeds the same reconstruction chain. A toy version of this chain is sketched below.
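To make the flow concrete, here is a minimal, self-contained sketch of the chain from raw data to analysis objects. Every class and function name is invented for illustration; none of this corresponds to the actual AMS reconstruction or analysis software.

```python
"""Toy version of the raw-data -> ESD -> analysis-object chain shown in the
diagram above.  All names are illustrative, not AMS software."""

from dataclasses import dataclass

@dataclass
class RawEvent:
    event_id: int
    adc_counts: list          # stand-in for detector readout

@dataclass
class ESDEvent:               # "event summary data" (ESD/DST)
    event_id: int
    energy: float
    tags: dict                # event-tag data used for fast selection

def reconstruct(raw: RawEvent) -> ESDEvent:
    """Event reconstruction: turn readout into physics quantities."""
    energy = 0.01 * sum(raw.adc_counts)
    return ESDEvent(raw.event_id, energy, tags={"high_energy": energy > 5.0})

def batch_analysis(esd_events, topic_filter):
    """Batch physics analysis: extract analysis objects by physics topic."""
    return [e for e in esd_events if topic_filter(e)]

if __name__ == "__main__":
    raw_data = [RawEvent(i, [100 * i, 200 * i, 300]) for i in range(10)]  # fake "detector" output
    esd = [reconstruct(r) for r in raw_data]                              # event reconstruction
    high_e = batch_analysis(esd, lambda e: e.tags["high_energy"])         # analysis objects
    print(f"{len(high_e)} high-energy candidates selected for interactive analysis")
```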
Slide 16
Symmetric Multi-Processor (SMP) Model
[Diagram] Experiment, tape storage, TeraBytes of disks served by one SMP system.
Slide 17
AMS SOC (Data Production requirements)
A complex system consisting of computing components (I/O nodes, worker nodes, data storage and networking switches) that should perform as a single system. Requirements:
 – Reliability: high (24h/day, 7 days/week)
 – Performance goal: process data "quasi-online" (with a typical delay of < 1 day)
 – Disk space: 12 months of data kept "online"
 – Minimal human intervention (automatic data handling, job control and book-keeping; see the sketch below)
 – System stability: months
 – Scalability
 – Price/performance
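The "minimal human intervention" requirement amounts to an automatic loop that notices newly arrived raw data, submits processing jobs and records their state. The sketch below shows that idea under invented file paths and a stubbed job submitter; it is not the actual AMS production or book-keeping software.

```python
"""Minimal automatic data-handling / book-keeping loop (illustrative only).
Paths, the submit command and the book-keeping file are invented placeholders."""

import json
import pathlib
import subprocess
import time

INCOMING = pathlib.Path("/ams/raw")                 # where new raw-data files appear (placeholder)
BOOKKEEPING = pathlib.Path("/ams/prod/book.json")   # processing state (placeholder)

def load_book():
    return json.loads(BOOKKEEPING.read_text()) if BOOKKEEPING.exists() else {}

def submit(raw_file):
    """Stub job submission; replace with the real batch-system command."""
    return subprocess.run(["echo", "process", str(raw_file)]).returncode == 0

def main():
    while True:
        book = load_book()
        for raw in sorted(INCOMING.glob("*.dat")):
            if book.get(raw.name) == "done":
                continue                            # already handled, skip
            book[raw.name] = "done" if submit(raw) else "failed"
        BOOKKEEPING.write_text(json.dumps(book, indent=2))
        time.sleep(600)   # poll every 10 minutes: "quasi-online", delay well under a day

if __name__ == "__main__":
    main()
```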
Slide 18
Production Farm Hardware Evaluation

"Processing node":
 – Processor: Intel PIV 3.4+ GHz, Hyper-Threading
 – Memory: 1 GB
 – System disk and transient data storage: 400 GB IDE disk
 – Ethernet cards: 2 x 1 Gbit
 – Estimated cost: 2500 CHF

Disk server:
 – Processor: dual-CPU Intel Xeon 3.2+ GHz
 – Memory: 2 GB
 – System disk: 18 GB SCSI, double redundant
 – Disk storage: 3x10x400 GB or 4x8x400 GB RAID 5 array
 – Effective disk volume: 11.6 TB
 – Ethernet cards: 3 x 1 Gbit
 – Estimated cost: 33000 CHF (2.85 CHF/GB; see the worked numbers below)
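The cost-per-gigabyte figure follows directly from the quoted price and effective volume; the short calculation below also shows the usable capacity of the two candidate RAID 5 layouts under the standard one-parity-disk-per-array accounting. The slide does not state how the 11.6 TB effective figure was derived, so the intermediate numbers are an assumption.

```python
# Worked numbers for the disk-server options quoted above (illustrative).

def raid5_usable(arrays, disks_per_array, disk_gb):
    """Standard RAID 5 accounting: one disk's worth of parity per array."""
    return arrays * (disks_per_array - 1) * disk_gb

option_a = raid5_usable(3, 10, 400)   # 3 x 10 x 400 GB -> 10800 GB usable
option_b = raid5_usable(4, 8, 400)    # 4 x  8 x 400 GB -> 11200 GB usable

cost_chf = 33000
effective_gb = 11.6e3                 # effective volume quoted on the slide
print(option_a, option_b)             # 10800 11200
print(f"{cost_chf / effective_gb:.2f} CHF/GB")   # ~2.84 CHF/GB, close to the quoted 2.85
```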
Slide 19
AMS-02 Ground Centers. Science Operations Center. Computing Facilities.
[Diagram] CERN/AMS network connecting:
 – AMS Physics Services
 – Central Data Services: shared disk servers (25 TeraByte disk, 6 PC-based servers), shared tape servers (tape robots, tape drives, LTO, DLT), home directories & registry, consoles & monitors
 – Production Facilities: 40-50 Linux dual-CPU computers (Intel and AMD), batch data processing
 – Engineering Cluster: 5 dual-processor PCs
 – Data Servers and Analysis Facilities (Linux cluster): 10-20 dual-processor PCs, 5 PC servers, interactive and batch physics analysis
 – AMS Regional Centers
Slide 20
AMS Science Operation Center Computing Facilities. Production Farm.
[Diagram] Production cells #1 … #7, each with PC Linux 3.4+ GHz processing nodes and a PC Linux server (2 x 3.4+ GHz, RAID 5, 10 TB) acting as disk server, linked by Gigabit switches (1 Gbit/s). AMS data, NASA data and metadata flow to the disk servers; simulated data go to an MC data server; archiving and staging use CERN CASTOR; an analysis-facilities data server, a PC Linux server (2 x 3.4+ GHz) hosting the Web, News, Production and DB servers, and an AFS server complete the setup.
Legend: tested, prototype in production / not tested and no prototype yet.
Slide 21
AMS-02 Science Operations Center Year 2004
 – MC Production (18 AMS Universities and Labs)
   » SW: data processing, central DB, data mining, servers
   » AMS-02 ESD format
 – Networking (A.Eline, Wu Hua, A.Klimentov)
   » Gbit private segment and monitoring SW in production since April
 – Disk servers and data processing (V.Choutko, A.Eline, A.Klimentov)
   » dual-CPU Xeon 3.06 GHz, 4.5 TB disk space, in production since Jan
   » 2nd server: dual-CPU Xeon 3.2 GHz, 9.5 TB, to be installed in Aug (3 CHF/GB)
   » data processing node: single-CPU PIV 3.4 GHz in Hyper-Threading mode, in production since Jan
 – Data transfer station (Milano group: M.Boschini, D.Grandi, E.Micelotta and A.Eline)
   » data transfer to/from CERN (used for MC production)
   » station prototype installed in May
   » SW in production since January
Status report at the next AMS TIM
Slide 22
AMS-02 Science Operations Center
Year 2005
 – Q1: SOC infrastructure setup
   » Bldg. 892 wing A: false floor, cooling, electricity
 – Mar 2005: set up production cell prototype
   » 6 processing nodes + 1 disk server with private Gbit Ethernet
 – LR-24 months (LR – "launch ready date"), Sep 2005
   » 40% production farm prototype (1st bulk computer purchase)
   » database servers
   » data transmission tests between MSFC AL and CERN
Slide 23
AMS-02 Computing Facilities

Function | Computer | Qty | Disks (TByte) and tapes | Ready(*), LR-months

GSC@MSFC | Intel (AMD) dual-CPU, 2.5+ GHz | 3 | 3 x 0.5 TB RAID array | LR-2

POCC:
 POCC prototype@JSC | Intel and AMD, dual-CPU, 2.8+ GHz | 45 | 6 TB RAID array | LR
 Monitor Station in MIT | Intel and AMD, dual-CPU, 2.8+ GHz | 5 | 1 TB RAID array | LR-6

Science Operation Centre:
 Production Farm | Intel and AMD, dual-CPU, 2.8+ GHz | 50 | 10 TB RAID array | LR-2
 Database Servers | dual-CPU 2.8+ GHz Intel or Sun SMP | 2 | 0.5 TB | LR-3
 Event Storage and Archiving | disk servers, dual-CPU Intel 2.8+ GHz | 6 | 50 TB RAID array and tape library (250 TB) | LR
 Interactive and Batch Analysis | SMP computer, 4 GB RAM, 300 SPECint95, or Linux farm | 10 | 1 TB RAID array | LR-1

(*) "Ready" = operational; the bulk of CPU and disk purchasing is at LR-9 months.
Slide 24
People and Tasks (“my” incomplete list) 1/4
AMS-02 GSC@MSFC
Tasks: Architecture; POIC/GSC SW and HW; GSC/SOC data transmission SW; GSC installation; GSC maintenance
People: A.Mujunen, J.Ritakari / A.Mujunen, J.Ritakari, P.Fisher, A.Klimentov / A.Mujunen, J.Ritakari / A.Klimentov, A.Elin / MIT, HUT / MIT
Status: concept was discussed with MSFC reps; MSFC/CERN and MSFC/MIT data transmission tests done
HUT has no funding for Y2004-2005
Slide 25
People and Tasks (“my” incomplete list) 2/4
AMS-02 POCC
Tasks: Architecture; TReKGate, AMS Cmd Station; Commanding SW and concept; Voice and Video; Monitoring; Data validation and online processing; HW and SW maintenance
People: P.Fisher, A.Klimentov, M.Pohl / P.Dennett, A.Lebedev, G.Carosi, A.Klimentov / A.Lebedev / G.Carosi / V.Choutko, A.Lebedev / V.Choutko, A.Klimentov
More manpower will be needed starting from LR-4 months
Slide 26
People and Tasks (“my” incomplete list) 3/4
AMS-02 SOC
Tasks: Architecture; Data Processing and Analysis; System SW and HEP applications; Book-keeping and Database; HW and SW maintenance
People: V.Choutko, A.Klimentov, M.Pohl / V.Choutko, A.Klimentov / A.Elin, V.Choutko, A.Klimentov / M.Boschini et al., A.Klimentov
More manpower will be needed starting from LR-4 months
Status: SOC prototyping is in progress; SW debugging during MC production
Implementation plan and milestones are fulfilled
Slide 27
People and Tasks (“my” incomplete list) 4/4
AMS-02 Regional Centers
 – INFN Italy: P.G. Rancoita et al.
 – IN2P3 France: G.Coignet and C.Goy
 – SEU China: J.Gong
 – Academia Sinica: Z.Ren
 – RWTH Aachen: T.Siedenburg
 – …
 – AMS@CERN: M.Pohl, A.Klimentov
Status: the proposal prepared by the INFN groups for IGS and by J.Gong/A.Klimentov for CGS can be used by other Universities.
Successful tests of distributed MC production and data transmission between AMS@CERN and 18 Universities.
Data transmission, book-keeping and process communication SW (M.Boschini, V.Choutko, A.Elin and A.Klimentov) has been released.
Slide 28
AMS/CERN computing and manpower issues
AMS computing and networking requirements are summarized in a Memo:
 – Nov 2005: AMS will provide a detailed SOC and POCC implementation plan
 – AMS will continue to use its own computing facilities for data processing and analysis, Web and News services
 – There is no request to IT for support of AMS POCC HW or SW
 – SW/HW "first line" expertise will be provided by AMS personnel
 – Y2005-2010: AMS will have guaranteed bandwidth on the USA/Europe line
 – CERN IT-CS will provide support in case of USA/Europe line problems
 – Data storage: AMS-specific requirements will be defined on an annual basis
 – CERN support of mail, printing and CERN AFS as for the LHC experiments; any license fees will be paid by the AMS collaboration according to IT specs
 – IT-DB and IT-CS may be called on for consultancy within the limits of available manpower
Starting from LR-12 months the Collaboration will need more people to run the computing facilities.
Slide 29
Year 2004 MC Production
 – Started Jan 15, 2004
 – Central MC Database
 – Distributed MC Production
 – Central MC storage and archiving
 – Distributed access (under test)
 – SEU Nanjing, IAC Tenerife and CNAF Italy joined production in Apr 2004
Slide 30
Y2004 MC production centers
MC Center | Responsible | GB | %
CIEMAT | J.Casuas | 2045 | 24.3
CERN | V.Choutko, A.Eline, A.Klimentov | 1438 | 17.1
Yale | E.Finch | 1268 | 15.1
Academia Sinica | Z.Ren, Y.Lei | 1162 | 13.8
LAPP/Lyon | C.Goy, J.Jacquemier | 825 | 9.8
INFN Milano | M.Boschini, D.Grandi | 528 | 6.2
CNAF & INFN Bologna | D.Casadei | 441 | 5.2
UMD | A.Malinine | 210 | 2.5
EKP, Karlsruhe | V.Zhukov | 202 | 2.4
GAM, Montpellier | J.Bolmont, M.Sapinski | 141 | 1.6
INFN Siena & Perugia, ITEP, LIP, IAC, SEU, KNU | P.Zuccon, P.Maestro, Y.Lyublev, F.Barao, C.Delgado, Ye Wei, J.Shin | 135 | 1.6
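As a quick consistency check, the percentage column can be recomputed from the GB column; the figures in the dictionary below are copied from the table above, and the recomputed shares agree with the quoted percentages to within about 0.1%.

```python
# Recompute each center's share of the Y2004 MC production from the GB column above.
produced_gb = {
    "CIEMAT": 2045, "CERN": 1438, "Yale": 1268, "Academia Sinica": 1162,
    "LAPP/Lyon": 825, "INFN Milano": 528, "CNAF & INFN Bologna": 441,
    "UMD": 210, "EKP Karlsruhe": 202, "GAM Montpellier": 141, "Others": 135,
}

total = sum(produced_gb.values())          # ~8395 GB, i.e. the ~8.4 TB quoted later
for center, gb in produced_gb.items():
    print(f"{center:22s} {gb:5d} GB  {100 * gb / total:4.1f} %")
```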
Slide 31
MC Production Statistics
Particle | Million Events | % of Total
protons | 7630 | 99.9
helium | 3750 | 99.6
electrons | 1280 | 99.7
positrons | 1280 | 100
deuterons | 250 | 100
anti-protons | 352.5 | 100
carbon | 291.5 | 97.2
photons | 128 | 100
Nuclei (Z = 3…28) | 856.2 | 85
97% of MC production is done; it will finish by the end of July.
URL: pcamss0.cern.ch/mm.html
185 days, 1196 computers, 8.4 TB, 250 PIII 1 GHz/day
Slide 32
Y2004 MC Production Highlights
 – Data are generated at remote sites, transmitted to AMS@CERN and made available for analysis (only 20% of the data was generated at CERN)
 – Transmission, process communication and book-keeping programs have been debugged; the same approach will be used for AMS-02 data handling
 – 185 days of running (~97% stability)
 – 18 Universities & Labs
 – 8.4 TBytes of data produced, stored and archived
 – Peak rate 130 GB/day (12 Mbit/s), average 55 GB/day (the AMS-02 raw data transfer will be ~24 GB/day); see the conversion below
 – 1196 computers
 – Daily CPU equivalent of 250 1 GHz CPUs, running 24h/day for 184 days
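The rate figures convert between daily volume and average line speed as follows; a one-line check (assuming 1 GB = 10^9 bytes) reproduces the 12 Mbit/s quoted above.

```python
# Convert a daily transfer volume into an average line rate (1 GB = 1e9 bytes assumed).
def gb_per_day_to_mbit_s(gb_per_day):
    return gb_per_day * 1e9 * 8 / 86400 / 1e6

print(f"peak       130 GB/day -> {gb_per_day_to_mbit_s(130):.1f} Mbit/s")  # ~12.0
print(f"average     55 GB/day -> {gb_per_day_to_mbit_s(55):.1f} Mbit/s")   # ~5.1
print(f"AMS-02 raw  24 GB/day -> {gb_per_day_to_mbit_s(24):.1f} Mbit/s")   # ~2.2
```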
 – Good simulation of AMS-02 data processing and analysis
Not tested yet: remote access to CASTOR; access to ESD from personal desktops.
TBD: AMS-01 MC production, MC production in Y2005.
Slide 33
AMS-01 MC Production
Send requests to vitaly.choutko@cern.ch. A dedicated meeting will be held in September; the target date to start AMS-01 MC production is October 1st.