Overview of the ATLAS distributed computing system

Johannes Elmsheuser; Alessandro Di Girolamo

doi:10.1051/epjconf/201921403010


Proceedings

Open Access

EPJ Web of Conferences 214, 03010 (2019)
https://doi.org/10.1051/epjconf/201921403010


Johannes Elmsheuser¹,* and Alessandro Di Girolamo², for the ATLAS collaboration

¹ Brookhaven National Laboratory, Upton, NY, USA
² CERN, Geneva, Switzerland

* e-mail: johannes.elmsheuser@cern.ch

Published online: 17 September 2019

Abstract

The ATLAS experiment at CERN successfully uses a worldwide computing infrastructure to support its physics program during LHC Run 2. The Grid workflow system PanDA routinely manages 250 to 500 thousand concurrently running production and analysis jobs that process simulation and detector data. In total, more than 370 PB of data are distributed over more than 150 sites in the WLCG and handled by the ATLAS data management system Rucio. To prepare for the ever-growing LHC luminosity in future runs, new developments are underway to use opportunistic resources such as HPCs even more efficiently and to exploit new technologies. This paper reviews and explains the structure and performance of the ATLAS distributed computing system and gives an outlook on new workflow and data management ideas for the beginning of LHC Run 3. It shows that the ATLAS workflow and data management systems are robust and performant, and can easily cope with the higher LHC performance in Run 2. There are presently no scaling issues, and each subsystem is able to sustain the large loads.

© The Authors, published by EDP Sciences, 2019

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.