Remote Online Farms
Sander Klous
On behalf of the Remote Online Farms Working Group
ACAT 2007, 25 April
Data processing nightmare
• There is no way to store all the info produced by ATLAS
40 million events per second x 1.5 MB/event = 60 TB per second
• In fact: 99.9995% of the data is thrown away
- So… the data processing nightmare is all about storage
Unfortunately… No
• Rigorous multilevel trigger system
- First level in hardware
- Higher levels in software
• But what if your favorite channel is not in the 0.0005%?
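The arithmetic behind these bullets can be checked with a short sketch. The event rate, event size and discarded share come from the slide; the function names are illustrative, decimal units are assumed (1 TB = 1,000,000 MB), and the kept-event and kept-bandwidth figures are derived here rather than quoted from the talk.

```python
# Back-of-the-envelope check of the rates quoted on the slide.
EVENT_RATE_HZ = 40_000_000   # 40 million events per second (from the slide)
EVENT_SIZE_MB = 1.5          # 1.5 MB per event (from the slide)
KEPT_PER_MILLION = 5         # 99.9995% discarded, so 0.0005% = 5 per million kept


def raw_rate_tb_per_s():
    """Detector output before any trigger, in decimal terabytes per second."""
    return EVENT_RATE_HZ * EVENT_SIZE_MB / 1_000_000  # assumes 1 TB = 1,000,000 MB


def kept_events_per_s():
    """Events per second that survive if only 0.0005% are kept."""
    return EVENT_RATE_HZ * KEPT_PER_MILLION // 1_000_000


def kept_rate_mb_per_s():
    """Bandwidth of the surviving events, in megabytes per second."""
    return kept_events_per_s() * EVENT_SIZE_MB


if __name__ == "__main__":
    print("raw rate (TB/s):", raw_rate_tb_per_s())
    print("kept events per second:", kept_events_per_s())
    print("kept rate (MB/s):", kept_rate_mb_per_s())
```

Under these assumptions the surviving stream is a few hundred events per second, which is why the question about the favorite channel matters: anything outside that small share is lost.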
Online bottleneck
• So… the data processing nightmare is all about CPU
Diagram labels: LVL1, HLT, scarce CPU resources
Well, that’s a problem…
Some are more equal than others…
Detector calibration
Physics selection
Networking enables us to
prioritize these activities
• So… the data processing nightmare is all about networking
Economics
Diagram labels:
- Data Acquisition, 40 MHz
- Level 1, Level 2, Level 3
- Accept 1 in 500, Accept 1 in 50, Accept 1 in 10
- Network switch
- Computing grid
- Amsterdam, NIKHEF/SARA
In fact, it is about balance: maximize performance, minimize costs
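The accept ratios on this slide can be turned into per-level rates with a short sketch. It assumes the three ratios (1 in 500, 1 in 50, 1 in 10) belong to Level 1, Level 2 and Level 3 in that order, which the flattened diagram does not state outright; the function and variable names are illustrative.

```python
from fractions import Fraction

INPUT_RATE_HZ = 40_000_000  # Data Acquisition rate from the slide (40 MHz)

# Assumption: the ratios map onto the trigger levels in this order.
ACCEPT_ONE_IN = [("Level 1", 500), ("Level 2", 50), ("Level 3", 10)]


def cascade(input_rate_hz, stages):
    """Return (level name, output rate in Hz) for each trigger level in turn."""
    rate = Fraction(input_rate_hz)
    rates = []
    for name, one_in in stages:
        rate = rate / one_in  # each level keeps one event in `one_in`
        rates.append((name, rate))
    return rates


def overall_reduction(stages):
    """Combined rejection factor of all levels together."""
    factor = 1
    for _, one_in in stages:
        factor *= one_in
    return factor


if __name__ == "__main__":
    for name, rate in cascade(INPUT_RATE_HZ, ACCEPT_ONE_IN):
        print(f"{name}: {float(rate):,.0f} Hz")
    print("overall: 1 in", overall_reduction(ACCEPT_ONE_IN))
```

With that ordering the chain keeps 1 event in 250,000, the same order of magnitude as the 0.0005% quoted earlier in the talk.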
Is it worth the effort?
Gary Stix, editor of Scientific American
January 2001
This can be difficult… The basics
Diagram labels:
- North Area, ATLAS Detectors, Level 1 Trigger
- ROB (×4), Data Collection Network, L2PU, SFI (×3)
- EF (Event Filter), Local Farm, The “Magni” Cluster
- Back End Network, SFO, Mass storage, Bldg. 513
- Packet Switched (GEANT), Switch, lightpath
- RF (×4), Remote Event Processing Farms
- Copenhagen, Edmonton, Krakow, Manchester, Amsterdam
Stream implementation
Diagram labels:
- Data flow components: From LVL1, LVL2, RoiB, L2sv, L2pu, Ros/Robin, pRos, DFM, SFI ((partially) build), EFD, SFO
- EFD ports: Input, ExtPT (×2), Output (×2), Trash
- Markers: B, C, D
- Processing tasks: PT (×2), Ath / PESA, Ath / CALIB, Ath / CALid, Ath / CalStr
- Tag operations: Create StreamTag, Add to StreamTag 1, Add to StreamTag 2, Create RoutingTag, Add to RoutingTag
- Event handling: Routing (×2), Duplicating, Stripping, Remote
- SFO: Stream selection; Stream 1, Stream 2, Stream n
- Event contents shown along the flow:
  Event, LVL1 Info
  (Partial) Event, LVL1 Info, RoutingTag, StreamTag
  (Partial) Event, LVL1 Info, RoutingTag, StreamTag 1 (×2)
  Event, LVL1 Info, RoutingTag, StreamTag 2 (×2)
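The diagram's vocabulary (StreamTag, stream selection, duplicating, stripping) suggests a tag-driven fan-out of events into output streams. The sketch below is only one possible reading of that idea and is not ATLAS software: the `Event` class, the stream names, the `strip_rules` mapping and all function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event record: detector fragments plus stream tags."""
    event_id: int
    fragments: dict                      # detector name -> payload (assumed layout)
    stream_tags: list = field(default_factory=list)


def add_stream_tag(event, stream):
    """Tag an event for a stream; tagging twice has no extra effect."""
    if stream not in event.stream_tags:
        event.stream_tags.append(stream)


def strip(event, keep):
    """Return a partial copy of the event holding only the listed fragments."""
    kept = {name: data for name, data in event.fragments.items() if name in keep}
    return Event(event.event_id, kept, list(event.stream_tags))


def select_streams(events, strip_rules):
    """Place each event in every stream it is tagged for.

    An event with several tags is duplicated across streams. Streams listed
    in `strip_rules` (hypothetical) receive a stripped, partial event.
    """
    streams = {}
    for event in events:
        for tag in event.stream_tags:
            out = strip(event, strip_rules[tag]) if tag in strip_rules else event
            streams.setdefault(tag, []).append(out)
    return streams
```

A usage example under the same assumptions: tag event 1 for both a physics and a calibration stream, tag event 2 for physics only, and call `select_streams(events, {"calibration": {"calo"}})`. Event 1 then appears in both streams, with only its `calo` fragment in the calibration copy.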
Grid/Proxy implementation
Diagram labels:
- SFI (×3)
- EFD, Buffer (×2)
- PT (×18), Local PT Farm, Proxy PT, Remote PTs
- Dispatcher, Broker
- CE (×2), UI, Worker Nodes (×2)
- int.eu.grid HEP VO, HEP VO, Database
- Infrastructure monitoring, Application Monitoring
- Events
Open issues
• Data management
• Software management
• Database access
• Authentication, Authorization and Accounting
• Performance and reliability
• Looking for PhD student…
- If you are interested:
Mail to [email protected]
Conclusion
• Remote online farms are interesting
- From a physics perspective
- From a computer science perspective
- From an organizational perspective
• Infrastructure is put in place
- Many open questions
- More news next year
• PhD candidates: mail to [email protected]