https://doi.org/10.1051/epjconf/202429503041
Towards a distributed heterogeneous task scheduler for the ATLAS offline software framework
Lawrence Berkeley National Laboratory, 1 Cyclotron Rd, Berkeley, CA 94720, USA
Published online: 6 May 2024
With the increased data volumes expected from the HL-LHC, it becomes critical for the ATLAS experiment to maximize the utilization of available computing resources, ranging from conventional Grid clusters to supercomputers and cloud computing platforms. To run its data processing applications on these resources, the ATLAS software framework must be capable of efficiently executing data processing tasks in heterogeneous distributed computing environments. Today, using the Gaudi Avalanche Scheduler, whose implementation is based on Intel TBB, we can efficiently schedule Athena algorithms across multiple threads within a single compute node. We aim to develop a new framework scheduler capable of supporting distributed heterogeneous environments, based on technologies like HPX or Ray. After an initial evaluation phase of these technologies, we began developing a prototype distributed task scheduler for the Athena framework. This contribution describes the prototype scheduler and the preliminary results of performance studies within ATLAS data processing applications.
© The Authors, published by EDP Sciences, 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.