

Remote Online Farms

Sander Klous

On behalf of the Remote Online Farms Working Group

ACAT 2007, 25 April

[Title graphic: binary digit pattern with particle labels H, t, W, Z0]

2

Large Hadron Collider and ATLAS

3

Data processing nightmare

• There is no way to store all the info produced by ATLAS

40 million events per second x 1.5 MB/event = 60 TB per second

• In fact: 99.9995% of the data is thrown away

- So… The data processing nightmare is all about storage

Unfortunately… No

• Rigorous multilevel trigger system

- First level in hardware

- Higher levels in software

• But what if your favorite channel is not in the 0.0005%?
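As a rough check of the numbers on this slide, the arithmetic can be written out in a few lines of Python. This is only an illustrative sketch: the function names are made up here, decimal units (1 TB = 1,000,000 MB) are assumed, and the kept share is taken as 100% minus the quoted 99.9995%.

```python
from fractions import Fraction

# Figures quoted on the slide
EVENT_RATE_HZ = 40_000_000          # 40 million events per second
EVENT_SIZE_MB = Fraction("1.5")     # 1.5 MB per event
KEPT_PERCENT = Fraction("0.0005")   # 100% - 99.9995% thrown away

MB_PER_TB = 1_000_000               # assumption: decimal units


def raw_rate_tb_per_s():
    """Raw detector output in TB per second, before any trigger."""
    return EVENT_RATE_HZ * EVENT_SIZE_MB / MB_PER_TB


def kept_events_per_s():
    """Events per second that survive if only KEPT_PERCENT is retained."""
    return EVENT_RATE_HZ * KEPT_PERCENT / 100


def kept_rate_mb_per_s():
    """Data volume per second that would have to be stored, in MB."""
    return kept_events_per_s() * EVENT_SIZE_MB


print("raw rate [TB/s]:", float(raw_rate_tb_per_s()))
print("kept events per second:", float(kept_events_per_s()))
print("kept rate [MB/s]:", float(kept_rate_mb_per_s()))
```

Under these assumptions the raw rate is 60 TB per second, and the retained 0.0005% corresponds to about 200 events, or 300 MB, per second.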

4

Online bottleneck

• So… The data processing nightmare is all about CPU

[Diagram labels: LVL1; HLT; "Scarce CPU resources"]

Well, that’s a problem…

5

Some are more equal than others…

Detector calibration

Physics selection

Networking enables us to prioritize these activities

• So… The data processing nightmare is all about networking

6

Economics

[Diagram labels: Amsterdam (NIKHEF/SARA); Data Acquisition, 40 MHz; Level 1; Level 2; Level 3; Accept 1 in 500; Accept 1 in 50; Accept 1 in 10; Computing grid; Network switch]

In fact, it is about balance

• Maximize performance

• Minimize costs
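The acceptance factors in the diagram can be chained to see what rate reaches each stage. The sketch below assumes the three factors apply to Levels 1, 2 and 3 in the order they appear on the slide; that pairing and the names used are assumptions, not something the transcript states.

```python
INPUT_RATE_HZ = 40_000_000  # 40 MHz data acquisition rate (from the slide)

# Assumed pairing of trigger levels with the "Accept 1 in N" factors.
REDUCTIONS = [("Level 1", 500), ("Level 2", 50), ("Level 3", 10)]


def cascade(input_rate_hz, reductions):
    """Return (level, output rate in Hz) after each successive reduction."""
    rates = []
    rate = input_rate_hz
    for level, factor in reductions:
        rate = rate / factor
        rates.append((level, rate))
    return rates


def overall_reduction(reductions):
    """Combined reduction factor of all levels applied in sequence."""
    total = 1
    for _, factor in reductions:
        total *= factor
    return total


for level, rate in cascade(INPUT_RATE_HZ, REDUCTIONS):
    print(f"{level}: {rate:,.0f} Hz")
print("overall: 1 in", overall_reduction(REDUCTIONS))
```

With that pairing the output rates would be 80 kHz, 1.6 kHz and 160 Hz, an overall reduction of 1 in 250,000 (0.0004%), which is of the same order as the 0.0005% quoted on the earlier slide.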

7

Is it worth the effort?

Gary Stix, editor of Scientific American, January 2001

8

This can be difficult… The basics

[Diagram labels: North Area ATLAS Detectors; Level 1 Trigger; ROB (×4); Data Collection Network; L2PU; SFI (×3); EF (Event Filter); Back End Network; Local Farm; SFO; Mass storage (Bldg. 513); the “Magni” Cluster; Remote Event Processing Farms (RF); Copenhagen; Edmonton; Krakow; Manchester; Amsterdam; Packet Switched (GEANT); Switch; lightpath]

9

Stream implementation

[Diagram labels, data flow: From LVL1; LVL2 (RoiB, L2sv, L2pu); Ros/Robin; pRos; DFM; SFI ((partially) build); EFD (Input, ExtPT ×2, Output ×2, Trash); PT (×2); SFO (Stream selection); Stream 1; Stream 2; Stream n; Remote; B; C; D]

[Diagram labels, processing: Ath / CALid; Ath / PESA; Ath / CalStr; Ath / CALIB]

[Diagram labels, operations: Create RoutingTag; Add to RoutingTag; Create StreamTag; Add to StreamTag 1; Add to StreamTag 2; Routing (×2); Duplicating; Stripping]

[Diagram labels, event contents: Event or (Partial) Event; LVL1 Info; RoutingTag; StreamTag (1 or 2)]
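One way to picture the tagging and stream selection in this diagram is the toy model below. It is a sketch under assumptions, not the ATLAS data flow software: the `Event` class, its fields, and the functions `strip_event` and `select_streams` are hypothetical names, and the rule that untagged events go to the trash is a guess from the diagram labels.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    lvl1_info: str
    fragments: dict                      # detector data keyed by subsystem (assumed)
    routing_tags: list = field(default_factory=list)
    stream_tags: list = field(default_factory=list)


def strip_event(event, keep):
    """Stripping: build a partial event holding only the requested fragments."""
    partial = {name: data for name, data in event.fragments.items() if name in keep}
    return Event(event.lvl1_info, partial,
                 list(event.routing_tags), list(event.stream_tags))


def select_streams(events):
    """Stream selection: place each event in every stream named by its tags.

    An event with several stream tags is duplicated across those streams;
    an event with no stream tag goes to the trash (assumed behavior).
    """
    streams, trash = {}, []
    for event in events:
        if not event.stream_tags:
            trash.append(event)
        for tag in event.stream_tags:
            streams.setdefault(tag, []).append(event)
    return streams, trash


# Small demonstration with made-up events.
physics = Event("L1-a", {"calo": b"c1", "muon": b"m1"}, stream_tags=["Stream 1"])
both = Event("L1-b", {"calo": b"c2", "muon": b"m2"}, stream_tags=["Stream 1", "Stream 2"])
rejected = Event("L1-c", {"calo": b"c3"})
calib = strip_event(both, keep={"calo"})   # partial event for a calibration-style task

streams, trash = select_streams([physics, both, rejected])
print({tag: len(evts) for tag, evts in streams.items()}, "trash:", len(trash))
```

The point of the sketch is only that tags travel with the event, so duplication and routing decisions can be made downstream without reprocessing the event.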

10

Grid/Proxy implementation

[Diagram labels: SFI (×3); EFD Buffer (×2); Local PT Farm; PT (many); Proxy PT; Events; Dispatcher; Broker; CE (×2); Worker Nodes (×2); Remote PTs; UI; int.eu.grid HEP VO; HEP VO Database; Infrastructure monitoring; Application Monitoring]
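The proxy idea can be sketched as an object that looks like a single processing task (PT) to the local side while handing each event to one of several remote PTs. Everything here is hypothetical: the `ProxyPT` class, the round-robin choice of remote PT, and modeling a remote PT as a plain Python callable are assumptions made for illustration; the transcript does not describe the dispatch policy.

```python
from itertools import cycle


class ProxyPT:
    """Hypothetical proxy: one local entry point, several remote PTs behind it."""

    def __init__(self, remote_pts):
        remote_pts = list(remote_pts)
        if not remote_pts:
            raise ValueError("at least one remote PT is required")
        # Round-robin is an assumed policy, chosen only to keep the sketch short.
        self._remote_pts = cycle(remote_pts)

    def process(self, event):
        """Forward the event to the next remote PT and return its result."""
        remote_pt = next(self._remote_pts)
        return remote_pt(event)


def make_remote_pt(site):
    """Build a stand-in remote PT that records where the event was handled."""
    def remote_pt(event):
        return {"event": event, "processed_at": site, "accepted": True}
    return remote_pt


proxy = ProxyPT([make_remote_pt("site-A"), make_remote_pt("site-B")])
results = [proxy.process(event_id) for event_id in range(4)]
for result in results:
    print(result)
```

In a real deployment the callables would be replaced by whatever mechanism submits work through the dispatcher, broker and compute elements shown in the diagram.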

11

12

13

Open issues

• Data management

• Software management

• Database access

• Authentication, Authorization and Accounting

• Performance and reliability

• Looking for PhD student…

- If you are interested:

Mail to [email protected]

14

Conclusion

• Remote online farms are interesting

- From a physics perspective

- From a computer science perspective

- From an organizational perspective

• Infrastructure is put in place

- Many open questions

- More news next year

• PhD candidates: mail to [email protected]