
Feedback from ATLAS

Speaker – Doug Benjamin (Duke University)

On behalf of the ATLAS collaboration

Contributors to talk: DB, Frank Berghaus, Alessandro De Salvo, Asoka De Silva, Andrej Filipcic, Ian Gable, John Hover, Ryan Taylor, Alex Undrus, Rod Walker

Outline of Talk

• ATLAS collaboration's use of the CVMFS file system
o Use cases and issues
• ATLAS cloud computing
o How CernVM is used and at what scale
o Why CernVM is not a universal choice across ATLAS
• Shoal
o Implemented and tested solution for proxy cache discovery

CVMFS in ATLAS

• Initially deployed in the ATLAS Tier 3 community as a way to trivially deploy software to sites with little computing support.
• Good ideas spread fast – ATLAS quickly realized its power and eventually deployed it to all ATLAS grid computing sites.
• CVMFS is used by all aspects of ATLAS offline computing, from grid and batch jobs to users' interactive analysis.
o LXPLUS, Tier 1 computing centers, local off-grid clusters, down to individual physicists' laptops

Types of data served via CVMFS

• ATLAS offline software releases
o atlas.cern.ch – total size 653 GB, grew 144 kB in the last 3 months
• ATLAS conditions data too big for the Oracle DB
o atlas-cond.cern.ch – total size 515 GB, grew 11 kB in the last 3 months
• CERN SFT software repository
• ATLAS nightly software releases
• User interface software for interactive use
• Site configuration data / configuration agents
• Alessandro De Salvo: "In general, everything that you want to be distributed to many sites while centrally managing a single master copy"

• CVMFS client stability/reliability
o Saw file corruptions with the old CVMFS v2.0 clients
o Since v2.1 the problem is solved!

CVMFS over NFS

• Use in ATLAS pioneered and early testing done by NorduGrid.
• SiGNET (4k cores), with nfs4 exports and a single NFS server, has worked rather well for more than 2 years.
o Except when the server gets into a strange state (with NFS problems) and is rebooted; the entire cluster then needs to remount NFS (a couple of times per year).
o Stability is improved with recent kernels (3.12) with better fuse/nfs4 support on the NFS server. Client nodes unchanged.
o If there are many software validation jobs running (~100) with different releases, the jobs take a very long time to complete (10 h) and some might crash.
• DESY uses CVMFS over NFS.
• In the CA cloud, SFU uses it as a test bed and is happy with it.
o They have tested CVMFS over NFS on SL7 (with the latest test version of cvmfs, announced a few weeks ago) on the NFS server and exported it successfully.

CVMFS and HPCs (i.e. supercomputers)

• CVMFS works on many HPC sites already – cluster-like sites, typically smaller.
• The largest sites, with thousands of nodes and cores, have high-speed, very powerful parallel file systems, but...
o Compute nodes have no TCP network stack nor connectivity to the outside
o Many simultaneous I/O operations negatively affect the powerful parallel file systems
o FUSE on the compute nodes is not an option
• We need some straightforward tool to cache the files we need in the compute nodes' memory, if small enough. A new project, perhaps? (A minimal sketch of the idea follows below.)
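A minimal sketch of that idea, assuming a login or edge node that does have the CVMFS client mounted: pack the subset of files a job needs into a tarball on the shared file system, for jobs to unpack into a RAM-backed directory (e.g. /dev/shm) on the compute nodes. This is not an existing CVMFS or ATLAS tool; the paths and size budget below are hypothetical.

```python
import os
import tarfile

CVMFS_RELEASE = "/cvmfs/atlas.cern.ch/repo/sw/some-release"   # hypothetical release path
STAGED_TARBALL = "/scratch/prestaged/release.tar.gz"          # on the shared file system
MAX_BYTES = 2 * 1024**3                                       # only worth it if the payload is small

def prestage(src_dir: str, tarball: str, max_bytes: int) -> int:
    """Walk the mounted CVMFS tree and pack regular files into a tarball,
    refusing to continue if the payload would exceed the memory budget."""
    total = 0
    os.makedirs(os.path.dirname(tarball), exist_ok=True)
    with tarfile.open(tarball, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                path = os.path.join(root, name)
                size = os.path.getsize(path)
                if total + size > max_bytes:
                    raise RuntimeError("release too large to cache in node memory")
                # store paths relative to the release root
                tar.add(path, arcname=os.path.relpath(path, src_dir))
                total += size
    return total

if __name__ == "__main__":
    staged = prestage(CVMFS_RELEASE, STAGED_TARBALL, MAX_BYTES)
    print(f"staged {staged} bytes; compute-node jobs unpack this into /dev/shm")
```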

CVMFS and ATLAS nightly code releases
(slide from Alex Undrus)

• To aid ATLAS software development and debugging, ATLAS produces releases that change every night.
• The nightly CVMFS repository demonstrates another practical CVMFS use case: a repository for short-term storage.
• The improved CVMFS server version 2.1.19 made this use case easier to handle:
o Reduced publish times
o Improved garbage handling
• With daily removals/additions the disk usage grew only twice during 4 months of operation.
• With the previous CVMFS server version the disk usage grew much faster.
• However, ATLAS cannot operate CVMFS servers without CERN IT assistance.
o The nightly CVMFS repository is handled differently than the other repositories – should it be?
• We recommend that CERN IT invest a bit more effort into CVMFS support.
o For ATLAS, CVMFS and AFS are almost equally important.

CVMFS and ATLAS nightly code releases (2)
(slide from Alex Undrus)

• Use cases for the ATLAS nightly releases on CVMFS include:
o Use at remote Tier 3s: CVMFS is faster at remote locations
o Use at systems that do not have AFS for some reason
o Use for limited production on the Grid (rarely used by ATLAS, but possible)
o Use for nightlies validation at Grid sites (through the HammerCloud system, which fully supports nightlies testing)
• Number of repositories: 1, with size (cvmfs/atlas-nightlies.cern.ch/data): 0.27 TB
• Amount of data served: ~0.3–0.4 TB (varies day by day)
• Repository growth: none – 6 nightly releases are installed daily
• How do we deal with mistakes?
o Mis-installation: due to the transient nature of nightlies, releases with installation problems are simply removed (but such problems are rare)
• How often do we remove data from the repositories?
o There are 6 releases of 6 nightly branches available (36 releases in total). Every day 1 of the 6 releases (of each nightly branch) is substituted with a newly built nightly release.
o A substantial amount of data is removed every day (but many files in the substitute release are unchanged; the percentage varies from 100% to ~30%).

CVMFS section conclusions

• CVMFS is a critical part of the ATLAS computing infrastructure.
o User tools for interactive work
o Offline software (both static releases and nightlies)
o Conditions data
o Site configuration code and data
• Large HPCs represent new challenges and opportunities.
o They can be much different from Grid or Cloud
o Hopefully CVMFS can be part of the solution

ATLAS Cloud Computing


Cloud Scheduler

[Architecture diagram] The user submits jobs to the HTCondor central manager; Cloud Scheduler watches the queue (scheduler status communication) and boots virtual machines on the clouds through their interfaces (OpenStack, Google, ...). Each VM contextualizes to attach to the condor pool and processes jobs; Cloud Scheduler retires a VM when no jobs require that VM.

CernVM in ATLAS
(Frank Berghaus)

• Features:
o Operating system and project software are made available over cvmfs
o Cloud-init is used for contextualization of the images
o The same image works anywhere, on any hypervisor and on any cloud type
• Dynamic Condor slot configuration
o Multi-core and single-core


ATLAS Cloud Production in 2014

• Over 1.2M ATLAS jobs completed between Jan 2014 and Sep 2014 (as of September 2014), mostly single-core production.
• Clouds used include CERN, North America, UK, Australia, Google, HelixNebula, and Amazon.
• New ATLAS requirements: multi-core, high memory.
[Chart: completed jobs in 2014, reaching 1.2M]

ATLAS Cloud 2014

CPU consumption, Jan '14 to Jan '15:
• CERN PROD – 394 CPU yrs (CernVM)
• IAAS – 202 CPU yrs (CernVM)
• BNL/Amazon – 118 CPU yrs
• ATLAS@HOME – 89 CPU yrs (CernVM)
• NECTAR – 35 CPU yrs
• GRIDPP – 17 CPU yrs (CernVM)

CERN HLT farm at Point 1 (Sim@P1):
• For Run 2, runs when the LHC is off for 24 hrs, under complete control of the Online group
• 1.5 hrs to saturate the farm with VM instantiation
• Running a 10 GB cvmfs cache without issues
[Chart: daily running job slots, 0–20k, Jan '14 to Jan '15]

Why do some sites not use CernVM?
(John Hover, BNL)

• RACF/USATLAS builds its own images for several reasons:
1. First and foremost, we are OSG sites. Part of our contribution back to OSG is to provide methods and tools usable by any OSG VO, many from other sciences, e.g. biology.
2. CernVM, as a CERN product, is perceived to be more-or-less HEP oriented. How does one change this perception?
3. CernVM is an artifact produced by a service at CERN, and only reproducible by them.
4. USATLAS provides both build-time and run-time customization via a (user-usable) toolset – a product, freely available for everyone's use, heavily based on existing 3rd-party utilities (imagefactory, oz, RPM, puppet, etc.).
• What would it take for USATLAS/OSG to adopt CernVM?
o "If the CernVM team were to release the complete toolset, and support it so anyone can generate their own customized VM, then we'd seriously consider it." – John Hover
o The issue is really service vs. product. People are more willing to try products than services.

CernVM and cloud computing conclusions

• ATLAS routinely uses CernVM as part of its cloud portfolio.
• Thought should be given to how to attract others outside of HEP and astrophysics to using CernVM.

Shoal
Dynamic web cache discovery system

Ian Gable

https://github.com/hep-gc/shoal

Frank Berghaus, Andre Charbonneau, Mike Chester, Ron Demarais, Colson Driemel, Rob Prior, Alex Lam, Randall Sobie, Ryan Taylor

• Squid caches advertise their existence every 30 seconds using AMQP messaging to the shoal server.
• VMs contact the server's REST interface to get the closest squid.
• Uses GeoIP libraries to determine the closest squid for each client that contacts the server (see the sketch below).
• Each squid also sends network load information.
• Squids that don't send messages regularly are automatically removed.
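The "closest squid" selection can be illustrated with a short sketch using the geoip2 library and a great-circle distance. This is not the shoal-server code; the database path and the way squids are represented are assumptions for illustration.

```python
import math

import geoip2.database  # MaxMind GeoIP2 client; needs a GeoLite2-City database file

READER = geoip2.database.Reader("/var/lib/GeoIP/GeoLite2-City.mmdb")  # assumed path

def coords(ip: str):
    """Look up (latitude, longitude) for an IP address."""
    rec = READER.city(ip)
    return rec.location.latitude, rec.location.longitude

def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) pairs, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_squids(client_ip: str, squid_ips: list[str]) -> list[str]:
    """Return the registered squid IPs ordered by distance from the client."""
    c = coords(client_ip)
    return sorted(squid_ips, key=lambda ip: distance_km(c, coords(ip)))
```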

Three components

• shoal-server: maintains the list of running squids. It uses RabbitMQ to handle incoming AMQP messages from squid servers. It provides a REST interface for programmatically retrieving a JSON-formatted ordered list of squids, a web interface for viewing the list (human-readable monitoring), and a WPAD interface.
• shoal-agent: runs on squid servers and publishes the load and IP of the squid server to the shoal-server using a JSON-formatted AMQP message at regular intervals. The agent is designed to be trivially installed in a few seconds with python's pip tool or yum. (A sketch of such a heartbeat follows below.)
• shoal-client: a reference client that can be used to contact the REST interface of the shoal-server to configure CVMFS.
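A sketch of the kind of heartbeat a shoal-agent sends, assuming a RabbitMQ exchange on the shoal server: a JSON message with the squid's address and load, published over AMQP every 30 seconds. The hostname, exchange name, routing key, and message fields are illustrative assumptions, not the actual shoal message schema.

```python
import json
import socket
import time

import pika  # RabbitMQ/AMQP client library

SHOAL_SERVER = "shoal.example.org"   # hypothetical shoal-server hostname
INTERVAL_S = 30                      # agents advertise every 30 seconds

def read_squid_load() -> float:
    """Placeholder for reading the squid's current network load."""
    return 0.0

def publish_heartbeat(channel):
    """Send one JSON-formatted heartbeat message for this squid."""
    message = {
        "hostname": socket.getfqdn(),
        "public_ip": socket.gethostbyname(socket.gethostname()),
        "load": read_squid_load(),
        "timestamp": time.time(),
    }
    channel.basic_publish(exchange="shoal",
                          routing_key="squid.heartbeat",
                          body=json.dumps(message))

if __name__ == "__main__":
    connection = pika.BlockingConnection(pika.ConnectionParameters(host=SHOAL_SERVER))
    channel = connection.channel()
    channel.exchange_declare(exchange="shoal", exchange_type="topic")
    while True:
        publish_heartbeat(channel)
        time.sleep(INTERVAL_S)
```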

Reliability features

• 'Verification': each squid can be verified by the server, i.e. it is confirmed that the CVMFS repos can be accessed through that squid. Currently we test this by checking the availability of the .cvmfswhitelist file from multiple Stratum 1 servers, for example:
"http://cvmfs.racf.bnl.gov:8000/cvmfs/atlas/.cvmfswhitelist"
This protects against accidentally misconfigured agents (see the sketch below).
• Configurable bandwidth limit on the agent: set the maximum bandwidth you want your squid to serve before shoal stops handing it out.
• Configurable access levels on agents: a squid can be advertised as "No outside access", "Global", etc. (see docs).
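A sketch of such a verification check, assuming the candidate squid listens on the default proxy port 3128: fetch .cvmfswhitelist from a Stratum 1 through the squid and advertise the squid only if the fetch succeeds. Illustration only, not the shoal-server implementation.

```python
import requests

# Stratum 1 whitelist URLs to probe through the candidate squid
WHITELIST_URLS = [
    "http://cvmfs.racf.bnl.gov:8000/cvmfs/atlas/.cvmfswhitelist",
]

def squid_serves_cvmfs(squid_host: str, squid_port: int = 3128) -> bool:
    """Return True if at least one whitelist can be fetched through the squid."""
    proxies = {"http": f"http://{squid_host}:{squid_port}"}
    for url in WHITELIST_URLS:
        try:
            r = requests.get(url, proxies=proxies, timeout=5)
            if r.status_code == 200 and r.content:
                return True
        except requests.RequestException:
            continue  # try the next Stratum 1
    return False
```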

Scalability

• With 800 squids advertising every 30 seconds, the shoal server is able to serve 1000 REST requests per second for the nearest squid, using a 2012-generation 16-core server. No real attempt to optimize yet.
• CHEP paper: http://dx.doi.org/10.1088/1742-6596/513/3/032035

Deploying

• Used now for IaaS ATLAS clouds.
• All components are available as SL RPMs from a yum repo.
• Shoal agent and client are available from python pip; the client in particular has a trivial dependency of only python 2.4+ (see the configuration sketch below).
• All docs and installation instructions are available from GitHub: https://github.com/hep-gc/shoal
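A sketch of what such a client does with the shoal response, assuming a REST endpoint that returns an ordered JSON list of squids: build a CVMFS_HTTP_PROXY string (';' separates fail-over proxy groups in CVMFS configuration). The endpoint path and JSON field names are assumptions for illustration; this is not the actual shoal-client implementation.

```python
import requests

SHOAL_REST = "http://shoal.example.org/nearest"   # assumed REST endpoint

def cvmfs_proxy_string(rest_url: str = SHOAL_REST) -> str:
    """Turn the ordered squid list from the shoal-server into a proxy string."""
    squids = requests.get(rest_url, timeout=5).json()           # assumed: ordered JSON list
    groups = [f"http://{s['hostname']}:3128" for s in squids]   # assumed field name and port
    groups.append("DIRECT")                                     # final fallback: no proxy
    return ";".join(groups)

if __name__ == "__main__":
    # Value to place in the local CVMFS client configuration
    print('CVMFS_HTTP_PROXY="%s"' % cvmfs_proxy_string())
```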

Experimental Docker Support

Containers for all three components

Automated builds that start with the CentOS image (could be changed)

Very simple Dockerfile

Conclusions

• Shoal solves the problem of squid discovery for cloud clients.
• ATLAS uses it at scale and welcomes others to use it as well.
• CVMFS and CernVM are software tools that are very important to the ATLAS experiment.
• Our collaboration with the CernVM team over the years has been and continues to be very successful and enjoyable.
• We hope to continue this collaboration as we expand more and more into HPCs.

Backup Slides


Infrastructure-as-a-Service (IaaS) Clouds

• IaaS cloud: a pool of virtual machine hypervisors presenting a single controller interface.
• Run many instances of one virtual machine configured for ATLAS computing.
• Advantages:
o Isolate complex application software from site administration
o Minimize dependence on the local system
o Flexible resource allocation
• Examples: OpenStack, Nimbus; commercial clouds: Amazon, Google, etc.
• Running at labs (e.g., CERN), universities (e.g., Victoria), and research networks (e.g., GridPP)



Cloud Scheduler

• Cloud Scheduler is a python package for managing VMs on IaaS clouds.
• Users submit HTCondor jobs; optional attributes specify virtual machine properties.
• Dynamically manages the quantity and type of VMs in response to user demand (see the conceptual sketch below).
• Easily connects to many IaaS clouds and aggregates their resources.
• Provides IaaS resources in the form of an ordinary HTCondor batch system.
• Used by ATLAS, Belle II, CANFAR, and BaBar.

Code https://github.com/hep-gc/cloud-scheduler

Website http://cloudscheduler.org/

Publication http://arxiv.org/abs/1007.0050
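A conceptual sketch (not Cloud Scheduler's actual code) of the demand-driven decision described above: count idle HTCondor jobs by the VM type they request, then boot or retire VMs to match. All names and data structures here are hypothetical.

```python
from collections import Counter

def plan_vm_actions(idle_jobs, running_vms, max_vms_per_type=100):
    """idle_jobs: list of dicts with a 'vm_type' key (taken from job attributes).
    running_vms: list of dicts with 'vm_type' and 'busy' keys.
    Returns (list of VM types to boot, list of idle VMs to retire)."""
    demand = Counter(job["vm_type"] for job in idle_jobs)
    supply = Counter(vm["vm_type"] for vm in running_vms)

    # boot enough VMs of each requested type, capped per type
    to_boot = []
    for vm_type, needed in demand.items():
        shortfall = min(needed, max_vms_per_type) - supply[vm_type]
        to_boot.extend([vm_type] * max(shortfall, 0))

    # retire idle VMs whose type is no longer requested by any queued job
    to_retire = [vm for vm in running_vms
                 if not vm["busy"] and demand[vm["vm_type"]] == 0]
    return to_boot, to_retire
```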


Cloud Job Flow (on the Grid)

[Diagram] The user's job enters the PanDA queue; the pilot factory submits pilot jobs to the HTCondor / Cloud Scheduler system; Cloud Scheduler boots VM instances on the clouds, and the pilot jobs running in those virtual machines pick up and run the user jobs. Easy to connect to and use many clouds.